import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
df = pd.read_csv(r'C:\Users\User\Downloads\Organizational_Hierarchy_Cleaned.csv')
G = nx.DiGraph()
for _, row in df.iterrows():
    if pd.notna(row['all_manager']):
        G.add_edge(row['all_manager'], row['all_name'])
def hierarchy_pos(G, root=None, width=1., vert_gap=0.4, vert_loc=0, xcenter=0.5):
    pos = {}
    def _hierarchy_pos(G, root, left, right, vert_loc, xcenter, pos):
        pos[root] = (xcenter, vert_loc)
        children = list(G.successors(root))
        if len(children) != 0:
            dx = (right - left) / len(children)
            nextx = left + dx / 2
            for child in children:
                pos = _hierarchy_pos(G, child, nextx - dx/2, nextx + dx/2, vert_loc - vert_gap, nextx, pos)
                nextx += dx
        return pos
    if root is None:
        root = [n for n, d in G.in_degree() if d == 0][0]
    return _hierarchy_pos(G, root, 0, width, vert_loc, xcenter, pos)
# Map titles to colors
name_to_title = pd.Series(df.Title.values, index=df.all_name).to_dict()
title_color_map = {
    'CEO': 'gold',
    'President': 'orange',
    'Sr. MD': 'lightblue',
    'MD': 'lightgreen',
    'Dir.': 'skyblue',
    'VP': 'violet',
    'AS': 'lightpink',
    'AN': 'lightgray'
}
node_colors = []
for node in G.nodes():
    title = name_to_title.get(node, "")
    for key in title_color_map:
        if key in title:
            node_colors.append(title_color_map[key])
            break
    else:
        node_colors.append('white')
# Plot the graph with legend
plt.figure(figsize=(18, 14))
pos = hierarchy_pos(G, vert_gap=0.4)
nx.draw(
    G, pos,
    with_labels=False,
    arrows=True,
    node_size=3500,
    node_color=node_colors,
    edgecolors='black'
)
nx.draw_networkx_labels(
    G, pos,
    font_size=15,
    bbox=dict(facecolor='white', edgecolor='none', pad=1.0)
)
# Add legend
legend_patches = [mpatches.Patch(color=color, label=title) for title, color in title_color_map.items()]
plt.legend(
    handles=legend_patches,
    title='Titles',
    loc='lower center',
    bbox_to_anchor=(0.5, -0.1),
    ncol=4,
    fontsize=14,
    title_fontsize=18,
    frameon=True
)
plt.title('Backdoor Strategies, Inc.', size=38)
plt.tight_layout()
plt.show()
<ipython-input-3-7d254f2a78fe>:72: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect. plt.tight_layout()
President Riley is connected to 4 people
G.degree('7-President-Riley')
4
He reports to 1 person (he has 1 manager)
G.in_degree('7-President-Riley')
1
And 3 people report to him (he has 3 direct reports)
G.out_degree('7-President-Riley')
3
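Viewing President Riley's connections as 'edges'. On a DiGraph, G.edges() lists only outgoing edges, so the output shows his 3 direct reports; the edge from his manager is returned by G.in_edges('7-President-Riley').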
G.edges('7-President-Riley')
OutEdgeDataView([('7-President-Riley', '6-Sr. MD-Alex'), ('7-President-Riley', '6-Sr. MD-Mo'), ('7-President-Riley', '6-Sr. MD-Jon')])
Our humble Analyst Donny only has 1 connection
G.degree('1-AN-Donny')
1
Top 10 members with the most connections
df['degree of connection'] = df['all_name'].apply(lambda name: G.degree(name))
#df['in_degree'] = df['all_name'].apply(lambda name: G.in_degree(name))
#df['out_degree'] = df['all_name'].apply(lambda name: G.out_degree(name))
df.sort_values(by='degree of connection', ascending=False)[:10]
| | Title | all_name | all_manager | degree of connection |
|---|---|---|---|---|
| 2 | Sr. MD | 6-Sr. MD-Alex | 7-President-Riley | 4 |
| 3 | Sr. MD | 6-Sr. MD-Mo | 7-President-Riley | 4 |
| 4 | Sr. MD | 6-Sr. MD-Jon | 7-President-Riley | 4 |
| 1 | President | 7-President-Riley | 8-CEO-Pete | 4 |
| 7 | MD | 5-MD-Dan | 6-Sr. MD-Mo | 3 |
| 8 | MD | 5-MD-Jean | 6-Sr. MD-Jon | 3 |
| 9 | MD | 5-MD-Tim | 6-Sr. MD-Alex | 3 |
| 10 | MD | 5-MD-Bob | 6-Sr. MD-Jon | 3 |
| 16 | Dir. | 4-Dir.-Fred | 5-MD-Jean | 2 |
| 15 | Dir. | 4-Dir.-Kim | 5-MD-Dan | 2 |
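What it measures: Degree centrality is the number of direct connections a node has, i.e., how many people someone directly manages (out-degree) plus how many they report to (in-degree), divided by the maximum possible number of connections (n - 1, where n is the number of nodes).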
In context:
High scorers like President Riley and Sr. MDs have high degree centrality because they directly manage multiple individuals.
CEO Pete, despite being at the top, has only one direct report (the President), so his degree centrality is relatively low.
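The normalization can be made visible with a small sketch in plain Python (no NetworkX), using only the top five names from the table above. The helper name `degree_centrality_sketch` is hypothetical, and the sketch assumes the score is (in-degree + out-degree) / (n - 1), which is consistent with the numbers in this notebook: President Riley's 4 connections in a 33-node graph give 4 / 32 = 0.125.

```python
from collections import Counter

# Manager -> report edges for the top of the org chart (names from the table above).
edges = [
    ("8-CEO-Pete", "7-President-Riley"),
    ("7-President-Riley", "6-Sr. MD-Alex"),
    ("7-President-Riley", "6-Sr. MD-Mo"),
    ("7-President-Riley", "6-Sr. MD-Jon"),
]


def degree_centrality_sketch(edges):
    """Return (in-degree + out-degree) / (n - 1) for every node (hypothetical helper)."""
    degree = Counter()
    for manager, report in edges:
        degree[manager] += 1  # out-degree: one more direct report
        degree[report] += 1   # in-degree: one more manager
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()}


scores = degree_centrality_sketch(edges)
print(scores["7-President-Riley"])  # 4 connections / 4 other nodes
print(scores["8-CEO-Pete"])         # 1 connection / 4 other nodes

# Same formula with the full graph's numbers: 4 connections, 33 nodes.
riley_full = 4 / (33 - 1)
print(riley_full)
```

In this five-node slice Riley touches every other node, so his score is the maximum possible; in the full graph the same 4 connections are spread over 32 possible ones.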
# calculate degree centrality
degree_centrality = nx.degree_centrality(G)
dc = pd.DataFrame(degree_centrality.items(), columns=['all_name', 'degree_centrality score'])
# map to dataframe
df['degree_centrality'] = df['all_name'].map(degree_centrality)
dc.sort_values(by='degree_centrality score', ascending=False)
| | all_name | degree_centrality score |
|---|---|---|
| 2 | 6-Sr. MD-Alex | 0.12500 |
| 3 | 6-Sr. MD-Mo | 0.12500 |
| 4 | 6-Sr. MD-Jon | 0.12500 |
| 1 | 7-President-Riley | 0.12500 |
| 7 | 5-MD-Dan | 0.09375 |
| 8 | 5-MD-Jean | 0.09375 |
| 9 | 5-MD-Tim | 0.09375 |
| 10 | 5-MD-Bob | 0.09375 |
| 16 | 4-Dir.-Fred | 0.06250 |
| 15 | 4-Dir.-Kim | 0.06250 |
| 22 | 3-VP-Glen | 0.06250 |
| 21 | 3-VP-Casey | 0.06250 |
| 20 | 4-Dir.-Cass | 0.06250 |
| 19 | 4-Dir.-Bill | 0.06250 |
| 18 | 4-Dir.-Ted | 0.06250 |
| 26 | 2-AS-Len | 0.06250 |
| 13 | 4-Dir.-Ed | 0.06250 |
| 6 | 5-MD-Jordan | 0.06250 |
| 5 | 5-MD-Alan | 0.06250 |
| 29 | 1-AN-Ken | 0.03125 |
| 28 | 2-AS-Joe | 0.03125 |
| 25 | 3-VP-Jim | 0.03125 |
| 31 | 1-AN-Vic | 0.03125 |
| 27 | 2-AS-Stan | 0.03125 |
| 30 | 1-AN-Donny | 0.03125 |
| 0 | 8-CEO-Pete | 0.03125 |
| 24 | 3-VP-Ian | 0.03125 |
| 23 | 3-VP-Ren | 0.03125 |
| 17 | 4-Dir.-Drew | 0.03125 |
| 14 | 4-Dir.-Betty | 0.03125 |
| 12 | 4-Dir.-Pat | 0.03125 |
| 11 | 4-Dir.-Ren | 0.03125 |
| 32 | 1-AN-Nate | 0.03125 |
What it measures: Closeness centrality calculates how close a node is to all other nodes in the network — based on the total number of steps it takes to reach everyone else. A lower total distance means a higher centrality score.
Why use an undirected graph: In a strict hierarchy (like a corporate org chart), the graph is usually directed, with managers pointing to reports. For closeness centrality, a directed graph can be misleading, because most pairs of nodes have no directed path between them: reports can't reach "up" the one-way edges, and NetworkX measures closeness on a directed graph by incoming distance, so the CEO, whom no one can reach, would score 0. By converting the graph to undirected, we treat all relationships as mutual for the purpose of measuring proximity, giving a more realistic view of who sits at the “center” of the org, structurally.
In context:
President Riley ranks highest in closeness centrality, meaning he’s, on average, fewer steps away from everyone else in the company. This makes sense — he’s just below the CEO and above a wide swath of senior and mid-level staff.
CEO Pete ranks only seventh (0.260), behind the three Sr. MDs and two of the MDs: his single link to Riley puts him one step further than Riley from everyone else.
This measure highlights structural proximity, not power — Riley isn’t “more important” than Pete, but he’s more centrally located in the company’s human web.
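To see why Riley outranks Pete, here is a minimal sketch of closeness on an undirected graph, written in plain Python rather than with NetworkX. It assumes the graph is connected and uses the formula (n - 1) / (sum of shortest-path distances); the function name `closeness_sketch` and the five-node slice of the org chart are illustrative choices, not part of the original analysis.

```python
from collections import deque

# Undirected adjacency for the top of the org chart (names from the source data).
adjacency = {
    "8-CEO-Pete": ["7-President-Riley"],
    "7-President-Riley": ["8-CEO-Pete", "6-Sr. MD-Alex", "6-Sr. MD-Mo", "6-Sr. MD-Jon"],
    "6-Sr. MD-Alex": ["7-President-Riley"],
    "6-Sr. MD-Mo": ["7-President-Riley"],
    "6-Sr. MD-Jon": ["7-President-Riley"],
}


def closeness_sketch(adjacency, start):
    """(n - 1) / total shortest-path distance from start, via breadth-first search."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    # Assumes a connected graph, so every node ends up in `distance`.
    return (len(distance) - 1) / sum(distance.values())


riley = closeness_sketch(adjacency, "7-President-Riley")  # 4 / (1 + 1 + 1 + 1)
pete = closeness_sketch(adjacency, "8-CEO-Pete")          # 4 / (1 + 2 + 2 + 2)
print(riley, pete)
```

Every path from Pete starts with the step to Riley, so Pete's total distance is always larger than Riley's, here and in the full graph.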
# change graph to undirected
G = nx.Graph()
for _, row in df.iterrows():
    if pd.notna(row['all_manager']):
        G.add_edge(row['all_manager'], row['all_name'])
# calculate closeness centrality
closeness_centrality = nx.closeness_centrality(G)
cc = pd.DataFrame(closeness_centrality.items(), columns=['all_name', 'closeness_centrality score'])
# map to dataframe
df['closeness_centrality'] = df['all_name'].map(closeness_centrality)
cc.sort_values(by='closeness_centrality score', ascending=False)
| | all_name | closeness_centrality score |
|---|---|---|
| 1 | 7-President-Riley | 0.347826 |
| 4 | 6-Sr. MD-Jon | 0.329897 |
| 2 | 6-Sr. MD-Alex | 0.304762 |
| 3 | 6-Sr. MD-Mo | 0.288288 |
| 8 | 5-MD-Jean | 0.271186 |
| 10 | 5-MD-Bob | 0.271186 |
| 0 | 8-CEO-Pete | 0.260163 |
| 31 | 1-AN-Vic | 0.250000 |
| 5 | 5-MD-Alan | 0.242424 |
| 6 | 5-MD-Jordan | 0.242424 |
| 9 | 5-MD-Tim | 0.242424 |
| 7 | 5-MD-Dan | 0.235294 |
| 29 | 1-AN-Ken | 0.225352 |
| 32 | 1-AN-Nate | 0.225352 |
| 19 | 4-Dir.-Bill | 0.223776 |
| 16 | 4-Dir.-Fred | 0.220690 |
| 18 | 4-Dir.-Ted | 0.217687 |
| 17 | 4-Dir.-Drew | 0.214765 |
| 13 | 4-Dir.-Ed | 0.198758 |
| 20 | 4-Dir.-Cass | 0.198758 |
| 11 | 4-Dir.-Ren | 0.196319 |
| 14 | 4-Dir.-Betty | 0.196319 |
| 15 | 4-Dir.-Kim | 0.193939 |
| 12 | 4-Dir.-Pat | 0.191617 |
| 21 | 3-VP-Casey | 0.188235 |
| 22 | 3-VP-Glen | 0.183908 |
| 25 | 3-VP-Jim | 0.179775 |
| 24 | 3-VP-Ian | 0.166667 |
| 23 | 3-VP-Ren | 0.166667 |
| 28 | 2-AS-Joe | 0.163265 |
| 26 | 2-AS-Len | 0.160804 |
| 27 | 2-AS-Stan | 0.156098 |
| 30 | 1-AN-Donny | 0.139130 |
What it measures: How often a node lies on the shortest path between other nodes — a good proxy for gatekeeping or brokerage power.
In context:
President Riley and Sr. MDs rank highly because they sit between upper and lower levels of the hierarchy — controlling information or authority flow.
CEO Pete has zero betweenness because no shortest path passes through him: his only connection is Riley, so he is always an endpoint of a path, never an intermediate step.
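The idea can be sketched by brute force on a small tree. The code below is not NetworkX's algorithm; it assumes the graph is a connected tree (so each pair of nodes has exactly one shortest path) and divides the count by the number of pairs that exclude the node itself, (n - 1)(n - 2) / 2. The function names and the five-node slice of the org chart are illustrative.

```python
from collections import deque
from itertools import combinations

# Undirected adjacency for the top of the org chart (names from the source data).
adjacency = {
    "8-CEO-Pete": ["7-President-Riley"],
    "7-President-Riley": ["8-CEO-Pete", "6-Sr. MD-Alex", "6-Sr. MD-Mo", "6-Sr. MD-Jon"],
    "6-Sr. MD-Alex": ["7-President-Riley"],
    "6-Sr. MD-Mo": ["7-President-Riley"],
    "6-Sr. MD-Jon": ["7-President-Riley"],
}


def path_between(adjacency, source, target):
    """Return the shortest path from source to target (breadth-first search)."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in adjacency[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    path = []
    node = target
    while node is not None:  # walk back from target to source
        path.append(node)
        node = parent[node]
    return path[::-1]


def betweenness_sketch(adjacency):
    """Share of node pairs whose (unique) shortest path passes through each node."""
    nodes = list(adjacency)
    n = len(nodes)
    counts = {node: 0 for node in nodes}
    for source, target in combinations(nodes, 2):
        for inner in path_between(adjacency, source, target)[1:-1]:
            counts[inner] += 1  # endpoints are excluded by the [1:-1] slice
    pairs = (n - 1) * (n - 2) / 2
    return {node: count / pairs for node, count in counts.items()}


scores = betweenness_sketch(adjacency)
print(scores["7-President-Riley"], scores["8-CEO-Pete"])
```

In this slice all 6 pairs that exclude Riley are routed through him, while Pete and the Sr. MDs only ever appear as endpoints.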
# calculate betweenness centrality
betweenness_centrality = nx.betweenness_centrality(G)
bc = pd.DataFrame(betweenness_centrality.items(), columns=['all_name', 'betweenness_centrality score'])
# map to df
df['betweenness_centrality'] = df['all_name'].map(betweenness_centrality)
# show the DataFrame
bc.sort_values(by='betweenness_centrality score', ascending=False)
| | all_name | betweenness_centrality score |
|---|---|---|
| 1 | 7-President-Riley | 0.683468 |
| 4 | 6-Sr. MD-Jon | 0.594758 |
| 2 | 6-Sr. MD-Alex | 0.471774 |
| 3 | 6-Sr. MD-Mo | 0.332661 |
| 8 | 5-MD-Jean | 0.284274 |
| 10 | 5-MD-Bob | 0.280242 |
| 7 | 5-MD-Dan | 0.179435 |
| 19 | 4-Dir.-Bill | 0.175403 |
| 9 | 5-MD-Tim | 0.122984 |
| 16 | 4-Dir.-Fred | 0.120968 |
| 5 | 5-MD-Alan | 0.120968 |
| 6 | 5-MD-Jordan | 0.120968 |
| 21 | 3-VP-Casey | 0.120968 |
| 22 | 3-VP-Glen | 0.062500 |
| 20 | 4-Dir.-Cass | 0.062500 |
| 18 | 4-Dir.-Ted | 0.062500 |
| 26 | 2-AS-Len | 0.062500 |
| 15 | 4-Dir.-Kim | 0.062500 |
| 13 | 4-Dir.-Ed | 0.062500 |
| 28 | 2-AS-Joe | 0.000000 |
| 29 | 1-AN-Ken | 0.000000 |
| 25 | 3-VP-Jim | 0.000000 |
| 31 | 1-AN-Vic | 0.000000 |
| 27 | 2-AS-Stan | 0.000000 |
| 30 | 1-AN-Donny | 0.000000 |
| 0 | 8-CEO-Pete | 0.000000 |
| 24 | 3-VP-Ian | 0.000000 |
| 23 | 3-VP-Ren | 0.000000 |
| 17 | 4-Dir.-Drew | 0.000000 |
| 14 | 4-Dir.-Betty | 0.000000 |
| 12 | 4-Dir.-Pat | 0.000000 |
| 11 | 4-Dir.-Ren | 0.000000 |
| 32 | 1-AN-Nate | 0.000000 |
What it measures: A node’s influence based on who they’re connected to — it gives more weight to connections with already-important people. In short, you're important if you're connected to other important people.
In context:
CEO Pete ranks the highest because, after reversing the graph (so influence flows upward), everyone in the organization is connected to someone who ultimately reports to him.
This aligns with intuition — since the CEO sits at the top, all paths of influence lead to him, making him the most central figure in the network.
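A rough way to see how scores pile up at the top is a hand-rolled power iteration over the reversed (report to manager) edges. This is only a sketch: the function name `eigenvector_sketch`, the fixed 50 iterations, and the choice to let each node keep its own score on every round (iterating with A + I so the scores do not collapse to zero on an acyclic graph) are assumptions made for illustration, not a description of the call used in this notebook.

```python
import math

# Report -> manager edges, i.e. the reversed reporting graph (names from the source data).
reversed_edges = [
    ("7-President-Riley", "8-CEO-Pete"),
    ("6-Sr. MD-Alex", "7-President-Riley"),
    ("6-Sr. MD-Mo", "7-President-Riley"),
    ("6-Sr. MD-Jon", "7-President-Riley"),
]


def eigenvector_sketch(edges, iterations=50):
    """Power iteration: each round, every node passes its score along its outgoing edges."""
    nodes = sorted({name for edge in edges for name in edge})
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        updated = dict(score)  # the "+ I" term: each node keeps its own score
        for source, target in edges:
            updated[target] += score[source]  # score flows from report to manager
        norm = math.sqrt(sum(value * value for value in updated.values()))
        score = {node: value / norm for node, value in updated.items()}
    return score


scores = eigenvector_sketch(reversed_edges)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[:2])
```

Because Pete has no outgoing edge in the reversed graph, score only ever flows into him, which is why he ends up far ahead of everyone else.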
# convert back to a directed graph to account for reporting structure/influence
G = nx.DiGraph()
for _, row in df.iterrows():
    if pd.notna(row['all_manager']):
        G.add_edge(row['all_manager'], row['all_name'])
# calculate eigenvector centrality
eigenvector_centrality = nx.eigenvector_centrality(G.reverse(), max_iter=1000)
ec = pd.DataFrame(eigenvector_centrality.items(), columns=['all_name', 'eigenvector_centrality score'])
# map to df
df['eigenvector_centrality'] = df['all_name'].map(eigenvector_centrality)
# show the DataFrame
ec.sort_values(by='eigenvector_centrality score', ascending=False)
| | all_name | eigenvector_centrality score |
|---|---|---|
| 0 | 8-CEO-Pete | 9.998896e-01 |
| 1 | 7-President-Riley | 1.485631e-02 |
| 4 | 6-Sr. MD-Jon | 1.887582e-04 |
| 10 | 5-MD-Bob | 1.978351e-06 |
| 2 | 6-Sr. MD-Alex | 3.367089e-08 |
| 3 | 6-Sr. MD-Mo | 1.683611e-08 |
| 8 | 5-MD-Jean | 1.683567e-08 |
| 19 | 4-Dir.-Bill | 1.672979e-08 |
| 7 | 5-MD-Dan | 1.063277e-10 |
| 21 | 3-VP-Casey | 1.058828e-10 |
| 16 | 4-Dir.-Fred | 1.058828e-10 |
| 6 | 5-MD-Jordan | 1.058828e-10 |
| 5 | 5-MD-Alan | 1.058828e-10 |
| 9 | 5-MD-Tim | 8.907005e-13 |
| 22 | 3-VP-Glen | 4.458186e-13 |
| 26 | 2-AS-Len | 4.458186e-13 |
| 13 | 4-Dir.-Ed | 4.458186e-13 |
| 15 | 4-Dir.-Kim | 4.458186e-13 |
| 18 | 4-Dir.-Ted | 4.458186e-13 |
| 20 | 4-Dir.-Cass | 4.458186e-13 |
| 31 | 1-AN-Vic | 9.365936e-16 |
| 30 | 1-AN-Donny | 9.365936e-16 |
| 29 | 1-AN-Ken | 9.365936e-16 |
| 28 | 2-AS-Joe | 9.365936e-16 |
| 27 | 2-AS-Stan | 9.365936e-16 |
| 11 | 4-Dir.-Ren | 9.365936e-16 |
| 25 | 3-VP-Jim | 9.365936e-16 |
| 24 | 3-VP-Ian | 9.365936e-16 |
| 23 | 3-VP-Ren | 9.365936e-16 |
| 12 | 4-Dir.-Pat | 9.365936e-16 |
| 17 | 4-Dir.-Drew | 9.365936e-16 |
| 14 | 4-Dir.-Betty | 9.365936e-16 |
| 32 | 1-AN-Nate | 9.365936e-16 |
df['avg_of_centrality'] = df[[
    'degree_centrality',
    'closeness_centrality',
    'betweenness_centrality',
    'eigenvector_centrality'
]].mean(axis=1)
df.sort_values(by='avg_of_centrality', ascending=False)
| | Title | all_name | all_manager | degree of connection | degree_centrality | closeness_centrality | betweenness_centrality | eigenvector_centrality | avg_of_centrality |
|---|---|---|---|---|---|---|---|---|---|
| 0 | CEO | 8-CEO-Pete | NaN | 1 | 0.03125 | 0.260163 | 0.000000 | 9.998896e-01 | 0.322826 |
| 1 | President | 7-President-Riley | 8-CEO-Pete | 4 | 0.12500 | 0.347826 | 0.683468 | 1.485631e-02 | 0.292788 |
| 4 | Sr. MD | 6-Sr. MD-Jon | 7-President-Riley | 4 | 0.12500 | 0.329897 | 0.594758 | 1.887582e-04 | 0.262461 |
| 2 | Sr. MD | 6-Sr. MD-Alex | 7-President-Riley | 4 | 0.12500 | 0.304762 | 0.471774 | 3.367089e-08 | 0.225384 |
| 3 | Sr. MD | 6-Sr. MD-Mo | 7-President-Riley | 4 | 0.12500 | 0.288288 | 0.332661 | 1.683611e-08 | 0.186487 |
| 8 | MD | 5-MD-Jean | 6-Sr. MD-Jon | 3 | 0.09375 | 0.271186 | 0.284274 | 1.683567e-08 | 0.162303 |
| 10 | MD | 5-MD-Bob | 6-Sr. MD-Jon | 3 | 0.09375 | 0.271186 | 0.280242 | 1.978351e-06 | 0.161295 |
| 7 | MD | 5-MD-Dan | 6-Sr. MD-Mo | 3 | 0.09375 | 0.235294 | 0.179435 | 1.063277e-10 | 0.127120 |
| 19 | Dir. | 4-Dir.-Bill | 5-MD-Bob | 2 | 0.06250 | 0.223776 | 0.175403 | 1.672979e-08 | 0.115420 |
| 9 | MD | 5-MD-Tim | 6-Sr. MD-Alex | 3 | 0.09375 | 0.242424 | 0.122984 | 8.907005e-13 | 0.114790 |
| 5 | MD | 5-MD-Alan | 6-Sr. MD-Alex | 2 | 0.06250 | 0.242424 | 0.120968 | 1.058828e-10 | 0.106473 |
| 6 | MD | 5-MD-Jordan | 6-Sr. MD-Alex | 2 | 0.06250 | 0.242424 | 0.120968 | 1.058828e-10 | 0.106473 |
| 16 | Dir. | 4-Dir.-Fred | 5-MD-Jean | 2 | 0.06250 | 0.220690 | 0.120968 | 1.058828e-10 | 0.101039 |
| 21 | VP | 3-VP-Casey | 4-Dir.-Bill | 2 | 0.06250 | 0.188235 | 0.120968 | 1.058828e-10 | 0.092926 |
| 18 | Dir. | 4-Dir.-Ted | 5-MD-Jean | 2 | 0.06250 | 0.217687 | 0.062500 | 4.458186e-13 | 0.085672 |
| 13 | Dir. | 4-Dir.-Ed | 5-MD-Alan | 2 | 0.06250 | 0.198758 | 0.062500 | 4.458186e-13 | 0.080939 |
| 20 | Dir. | 4-Dir.-Cass | 5-MD-Jordan | 2 | 0.06250 | 0.198758 | 0.062500 | 4.458186e-13 | 0.080939 |
| 15 | Dir. | 4-Dir.-Kim | 5-MD-Dan | 2 | 0.06250 | 0.193939 | 0.062500 | 4.458186e-13 | 0.079735 |
| 22 | VP | 3-VP-Glen | 4-Dir.-Fred | 2 | 0.06250 | 0.183908 | 0.062500 | 4.458186e-13 | 0.077227 |
| 26 | AS | 2-AS-Len | 3-VP-Casey | 2 | 0.06250 | 0.160804 | 0.062500 | 4.458186e-13 | 0.071451 |
| 31 | AN | 1-AN-Vic | 6-Sr. MD-Jon | 1 | 0.03125 | 0.250000 | 0.000000 | 9.365936e-16 | 0.070313 |
| 29 | AN | 1-AN-Ken | 6-Sr. MD-Mo | 1 | 0.03125 | 0.225352 | 0.000000 | 9.365936e-16 | 0.064151 |
| 32 | AN | 1-AN-Nate | 6-Sr. MD-Mo | 1 | 0.03125 | 0.225352 | 0.000000 | 9.365936e-16 | 0.064151 |
| 17 | Dir. | 4-Dir.-Drew | 5-MD-Bob | 1 | 0.03125 | 0.214765 | 0.000000 | 9.365936e-16 | 0.061504 |
| 14 | Dir. | 4-Dir.-Betty | 5-MD-Tim | 1 | 0.03125 | 0.196319 | 0.000000 | 9.365936e-16 | 0.056892 |
| 11 | Dir. | 4-Dir.-Ren | 5-MD-Tim | 1 | 0.03125 | 0.196319 | 0.000000 | 9.365936e-16 | 0.056892 |
| 12 | Dir. | 4-Dir.-Pat | 5-MD-Dan | 1 | 0.03125 | 0.191617 | 0.000000 | 9.365936e-16 | 0.055717 |
| 25 | VP | 3-VP-Jim | 4-Dir.-Ted | 1 | 0.03125 | 0.179775 | 0.000000 | 9.365936e-16 | 0.052756 |
| 24 | VP | 3-VP-Ian | 4-Dir.-Ed | 1 | 0.03125 | 0.166667 | 0.000000 | 9.365936e-16 | 0.049479 |
| 23 | VP | 3-VP-Ren | 4-Dir.-Cass | 1 | 0.03125 | 0.166667 | 0.000000 | 9.365936e-16 | 0.049479 |
| 28 | AS | 2-AS-Joe | 4-Dir.-Kim | 1 | 0.03125 | 0.163265 | 0.000000 | 9.365936e-16 | 0.048629 |
| 27 | AS | 2-AS-Stan | 3-VP-Glen | 1 | 0.03125 | 0.156098 | 0.000000 | 9.365936e-16 | 0.046837 |
| 30 | AN | 1-AN-Donny | 2-AS-Len | 1 | 0.03125 | 0.139130 | 0.000000 | 9.365936e-16 | 0.042595 |
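As a quick arithmetic check, the average for the top two rows can be recomputed by hand from the four scores in the table above. The dictionary below simply copies those values; the variable names are illustrative.

```python
from statistics import mean

# Scores copied from the table above, in the order:
# degree, closeness, betweenness, eigenvector.
scores = {
    "8-CEO-Pete": [0.03125, 0.260163, 0.000000, 9.998896e-01],
    "7-President-Riley": [0.12500, 0.347826, 0.683468, 1.485631e-02],
}

# Unweighted mean of the four measures, rounded like the table.
averages = {name: round(mean(values), 6) for name, values in scores.items()}
print(averages)
```

The recomputed values match the avg_of_centrality column. Note that Pete's average is carried almost entirely by his eigenvector score (about 1.0 out of a sum of about 1.29), whereas Riley's is spread across closeness and betweenness.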
📌 Degree Centrality
6-Sr. MD-Alex, 6-Sr. MD-Mo, 6-Sr. MD-Jon, and 7-President-Riley rank highest in degree centrality, which means they have the most direct connections in the network: each has one manager and three direct reports.
📌 Closeness Centrality
7-President-Riley ranks highest in closeness centrality, which means they are, on average, the shortest number of steps away from everyone else — they’re structurally central and can “reach” others efficiently.
📌 Betweenness Centrality
7-President-Riley ranks highest in betweenness centrality, which means they frequently sit on the shortest paths between others — they act as a key bridge or gatekeeper in the network.
📌 Eigenvector Centrality
8-CEO-Pete ranks highest in eigenvector centrality, which means they’re not just well-connected, but connected to other important people — their influence is amplified by who they’re linked to.
📌 Exceptionally Average (Centrality Score)
8-CEO-Pete and 7-President-Riley have the highest average centrality scores, unsurprisingly.
However, outside of the CEO and President circle, 6-Sr. MD-Jon has the highest average centrality score, meaning he consistently ranks highly across all measures: he's not just influential in one way, but structurally important, well-connected, and strategically positioned throughout the network.
What will happen if our humble Analyst Donny builds a connection to President Riley? Will it boost his centrality score?
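One way to explore this question for closeness is sketched below in plain Python. The reporting lines are transcribed from the all_manager column of the table above; the helper names, and the choice to model the new connection as a single undirected edge between Donny and Riley, are assumptions made for illustration.

```python
from collections import deque

# Manager -> direct reports, transcribed from the all_manager column above.
reports = {
    "8-CEO-Pete": ["7-President-Riley"],
    "7-President-Riley": ["6-Sr. MD-Alex", "6-Sr. MD-Mo", "6-Sr. MD-Jon"],
    "6-Sr. MD-Alex": ["5-MD-Alan", "5-MD-Jordan", "5-MD-Tim"],
    "6-Sr. MD-Mo": ["5-MD-Dan", "1-AN-Ken", "1-AN-Nate"],
    "6-Sr. MD-Jon": ["5-MD-Jean", "5-MD-Bob", "1-AN-Vic"],
    "5-MD-Alan": ["4-Dir.-Ed"],
    "5-MD-Jordan": ["4-Dir.-Cass"],
    "5-MD-Tim": ["4-Dir.-Ren", "4-Dir.-Betty"],
    "5-MD-Dan": ["4-Dir.-Kim", "4-Dir.-Pat"],
    "5-MD-Jean": ["4-Dir.-Fred", "4-Dir.-Ted"],
    "5-MD-Bob": ["4-Dir.-Bill", "4-Dir.-Drew"],
    "4-Dir.-Ed": ["3-VP-Ian"],
    "4-Dir.-Cass": ["3-VP-Ren"],
    "4-Dir.-Kim": ["2-AS-Joe"],
    "4-Dir.-Fred": ["3-VP-Glen"],
    "4-Dir.-Ted": ["3-VP-Jim"],
    "4-Dir.-Bill": ["3-VP-Casey"],
    "3-VP-Glen": ["2-AS-Stan"],
    "3-VP-Casey": ["2-AS-Len"],
    "2-AS-Len": ["1-AN-Donny"],
}


def build_adjacency(reports, extra_edges=()):
    """Undirected adjacency sets from the reporting lines plus any extra edges."""
    pairs = [(manager, r) for manager, rs in reports.items() for r in rs]
    adjacency = {}
    for a, b in pairs + list(extra_edges):
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def closeness(adjacency, start):
    """(n - 1) / total shortest-path distance from start (connected graph assumed)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return (len(distance) - 1) / sum(distance.values())


donny, riley = "1-AN-Donny", "7-President-Riley"
before = closeness(build_adjacency(reports), donny)                    # 32 / 230
after = closeness(build_adjacency(reports, [(donny, riley)]), donny)   # 32 / 109
print(round(before, 6), round(after, 6))
```

Under these assumptions the "before" value reproduces Donny's 0.139130 from the closeness table, and the single new edge roughly doubles it, because Riley is only a step or two from most of the organization.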