Compute centrality scores to measure the importance of entities

Compute centrality scores to determine the importance of an entity in a link chart based on a knowledge graph. Centrality scores provide rankings of the entities depending on their position in the graph depicted by the link chart. The scores identify which link chart entities play a critical role in the link chart. For example, the scores can identify the most influential person in a social network, the events involved in spreading a disease, the critical infrastructure node in an urban network, and so on.

The centrality scores are computed for the content of a specific link chart. The score for each entity depends on its position in the graph relative to the other entities and relationships present on the link chart. If you add content to or remove content from a link chart, the scores change. Similarly, when the same entities are included in another link chart that depicts a different set of relationships, the entities have different scores. Centrality scores for a link chart are presented in the Centrality table view.

The sections below describe the centrality scores that are calculated, and how to visualize and update these scores as you work with the link chart.

Measures of centrality

For link charts based on a knowledge graph, six different measures of centrality are calculated: Degree, Eigenvector, PageRank, Betweenness, Closeness, and Harmonic. These calculations are all presented in different fields in the link chart's Centrality table.

The Degree, Eigenvector, and PageRank scores are calculated based on the number of relationships in which an entity participates. The Betweenness, Closeness, and Harmonic scores are calculated based on the length of the path from an entity to all the other entities in the link chart.

All relationships in a knowledge graph have an inherent direction in the way they are defined, from the origin entity to the destination entity. However, by default, centrality scores for a link chart are calculated in a manner that ignores a relationship's direction. You can calculate the scores in a manner that accounts for the direction of the relationship, and also with the inherent direction of relationships reversed. An entity's centrality score can be different when relationship directions are considered.

Entity groups are represented in the Centrality table as a single row, and the centrality score is calculated for the group. In contrast, grouped relationships are not treated as a unit: each relationship in a group is considered individually, according to the Cumulate weights option.

Degree centrality

The Degree centrality score is defined as the total number of link chart relationships in which an entity participates. The Degree score is useful in determining entities that are central to the graph. For example, when the link chart portrays a social network, a Person entity with a high Degree score is one that has many connections to other Person entities, and therefore is more central to the network.

When relationship direction is accounted for, the overall Degree score can be broken down to reflect an entity's Indegree and Outdegree. The Indegree score reflects the number of relationships that are incoming—where an entity is at the destination of a relationship. The Outdegree score reflects the number of relationships that are outgoing—where an entity is at the origin of the relationship. When a Person has a high Outdegree score, they may be more crucial with respect to disseminating information. When a Person has a high Indegree score, they are more likely to receive information from different sources.

Recursive relationships occur when an entity is both the origin and the destination of the relationship; these relationships are counted when calculating the Degree of an entity. If an entity participates in only one relationship that is recursive, it has a Degree of two, an Indegree of one, and an Outdegree of one.

These centrality scores are displayed in the Degree, Indegree, and Outdegree fields in the Centrality table.
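
The counting rules above can be sketched in a few lines of Python. This is an illustration of the definitions only, not the software's implementation; the function name `degree_scores` and the sample names are hypothetical, and relationships are assumed to be simple (origin, destination) pairs.

```python
from collections import Counter

def degree_scores(relationships):
    """Count Degree, Indegree, and Outdegree from (origin, destination) pairs."""
    indegree = Counter()
    outdegree = Counter()
    for origin, destination in relationships:
        outdegree[origin] += 1        # the relationship leaves its origin
        indegree[destination] += 1    # the relationship arrives at its destination
    entities = set(indegree) | set(outdegree)
    return {
        e: {
            "degree": indegree[e] + outdegree[e],
            "indegree": indegree[e],
            "outdegree": outdegree[e],
        }
        for e in entities
    }

# Hypothetical sample data; ("Dee", "Dee") is a recursive relationship.
relationships = [("Ana", "Ben"), ("Ana", "Cy"), ("Ben", "Cy"), ("Dee", "Dee")]
scores = degree_scores(relationships)
print(scores["Dee"])
```

In this sketch the recursive relationship gives Dee a Degree of two, an Indegree of one, and an Outdegree of one, which matches the rule described above.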

Learn more about an entity's degree

Eigenvector centrality

The Eigenvector centrality score reflects not only how many connections an entity has, but how important those connections are within the graph. For example, if the Person A entity in a graph representing a social network has several connections and some connected entities are very influential in the graph, Person A's Eigenvector score is higher as a result. However, if Person B has the same number of connections as Person A, but the entities to which Person B is connected have few additional connections, Person B's score is lower as a result.

This score is calculated recursively. Each entity starts with the same score. With each pass, an entity's score is updated to account for the score of its immediate neighbors. This measure of centrality does not discount multiple relationships between the same pair of entities: if Person A and Person C are connected to each other in many ways, they both have a higher Eigenvector score because of the many connections between them.

When relationship direction is accounted for, this Eigenvector method of calculating centrality can have an effect where the scores are effectively drained from some parts of the graph and pooled in other areas, which is known as the starvation effect. Some entities can end up with a score of zero.

This score is displayed in the Eigenvector field in the Centrality table.
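
The recursive calculation can be illustrated with a small power-iteration sketch that ignores relationship direction. It is not the software's implementation. The function name, the number of passes, and the sample graph are hypothetical, and adding an entity's own score at each pass is a common convergence technique that is assumed here rather than taken from this topic.

```python
import math

def eigenvector_scores(edges, passes=100):
    """Estimate Eigenvector centrality for an undirected graph by repeated passes."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    # Each entity starts with the same score.
    scores = {e: 1.0 for e in neighbors}
    for _ in range(passes):
        # Assumption: keep the entity's own score and add its neighbors' scores.
        updated = {
            e: scores[e] + sum(scores[n] for n in neighbors[e])
            for e in neighbors
        }
        # Rescale so the scores do not grow without bound.
        norm = math.sqrt(sum(v * v for v in updated.values()))
        scores = {e: v / norm for e, v in updated.items()}
    return scores

# Hypothetical network: A and B both have two connections,
# but A's neighbors (H and X) are better connected than B's (Z and P).
edges = [("H", "A"), ("H", "X"), ("H", "Y"), ("H", "Z"),
         ("X", "A"), ("X", "Y"), ("Z", "B"), ("B", "P")]
scores = eigenvector_scores(edges)
print(round(scores["A"], 3), round(scores["B"], 3))
```

Although A and B have the same Degree in this sample, A ends up with the higher score because its neighbors are themselves well connected.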

PageRank centrality

The PageRank centrality score is calculated in a similar manner to the Eigenvector score. As with Eigenvector, an entity's score reflects not only how many connections it has, but also how many connections those neighbor entities have within the graph. An entity has a higher score when it has many connected entities that in turn have many connections.

However, the PageRank calculation normalizes how scores are propagated from an entity to its neighbors at each pass. An entity's score is divided by the number of neighbor entities it has. When relationship direction is accounted for, use the PageRank score instead of the Eigenvector score to determine which entities are most important in the graph because the PageRank score mitigates the starvation effect.

This score is displayed in the PageRank field in the Centrality table.
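
The following sketch shows the textbook form of this calculation when relationship direction is considered. It is not the software's implementation. The damping factor of 0.85, the handling of entities with no outgoing relationships, the function name, and the sample data are all assumptions made for illustration.

```python
def pagerank_scores(relationships, damping=0.85, passes=100):
    """Estimate PageRank from directed (origin, destination) pairs."""
    entities = sorted({e for pair in relationships for e in pair})
    targets = {e: [] for e in entities}
    for origin, destination in relationships:
        targets[origin].append(destination)
    n = len(entities)
    scores = {e: 1.0 / n for e in entities}
    for _ in range(passes):
        # Assumption: score held by entities without outgoing relationships
        # is spread evenly over all entities.
        dangling = sum(scores[e] for e in entities if not targets[e])
        updated = {
            e: (1.0 - damping) / n + damping * dangling / n for e in entities
        }
        for e in entities:
            if targets[e]:
                # The entity's score is divided among the entities it points to.
                share = damping * scores[e] / len(targets[e])
                for t in targets[e]:
                    updated[t] += share
        scores = updated
    return scores

# Hypothetical data: D points to A, but nothing points to D.
relationships = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "A")]
scores = pagerank_scores(relationships)
print({e: round(s, 3) for e, s in scores.items()})
```

In this sample, D has no incoming relationships but still keeps a small positive score, which illustrates how this style of calculation avoids scores of zero.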

Betweenness centrality

The Betweenness centrality score is calculated based on the shortest path between an entity and other entities in the graph. Shortest paths are calculated between all entities in the link chart. If many shortest paths run through an entity, it has a high Betweenness score. For example, in a link chart representing a social network, this score can determine how effective a Person entity can be in passing information to all other Person entities on the chart.

The shortest paths can be significantly different when the direction of a relationship is considered. For example, information may flow away from some Person entities in a different manner than information flows back to the same Person.

This score is displayed in the Betweenness field in the Centrality table.
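
As an illustration, the sketch below counts how many shortest paths pass through each entity when direction is ignored. It uses Brandes' algorithm, a well-known method chosen here for the example; this topic does not state which algorithm the software uses. The function name and the sample data are hypothetical.

```python
from collections import deque

def betweenness_scores(edges):
    """Betweenness for an undirected, unweighted graph (Brandes' algorithm)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    scores = {e: 0.0 for e in neighbors}
    for source in neighbors:
        # Breadth-first search that counts shortest paths (sigma) from source.
        dist = {source: 0}
        sigma = {e: 0 for e in neighbors}
        sigma[source] = 1
        preds = {e: [] for e in neighbors}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest entities, accumulating path shares.
        delta = {e: 0.0 for e in neighbors}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each pair of entities was counted once from each end.
    return {e: s / 2.0 for e, s in scores.items()}

# Hypothetical chain A-B-C-D with an extra entity E attached to C.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("C", "E")]
scores = betweenness_scores(edges)
print(scores["B"], scores["C"])
```

In this sample, C lies on the shortest paths between five pairs of entities and B on three, so C has the higher score; the entities at the ends of the chain score zero.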

Closeness centrality

The Closeness centrality score is also calculated based on the shortest path between an entity and other entities in the graph, but it considers the length of those shortest paths. An entity's score is determined by the average length of the shortest path from the entity to other entities in the graph: the shorter the average path, the higher the score. In a link chart representing a social network, if a Person entity has a high Closeness score, that entity can communicate more directly to other Person entities on the chart.

This score is displayed in the Closeness field in the Centrality table.
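
A minimal sketch of this idea follows, assuming the common definition in which the score is the inverse of the average shortest-path length to the entities that can be reached. The exact formula used by the software is not stated in this topic, and the function names and sample data are hypothetical.

```python
from collections import deque

def shortest_path_lengths(neighbors, source):
    """Breadth-first search: number of relationships on the shortest path."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness_scores(edges):
    """Closeness as reachable entities divided by total path length."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    scores = {}
    for e in neighbors:
        dist = shortest_path_lengths(neighbors, e)
        reachable = len(dist) - 1      # exclude the entity itself
        total = sum(dist.values())
        # Inverse of the average path length; zero if nothing is reachable.
        scores[e] = reachable / total if total else 0.0
    return scores

# Hypothetical chain of five people: A-B-C-D-E.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
scores = closeness_scores(edges)
print(round(scores["C"], 3), round(scores["A"], 3))
```

C sits in the middle of the chain, so its average path to the others is shorter and its score is higher than that of A at the end of the chain.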

Harmonic centrality

The Harmonic centrality score is calculated in a similar manner to the Closeness centrality score. However, instead of averaging the lengths of the shortest paths between an entity and other entities on the link chart, the Harmonic centrality score is calculated by considering the inverse of the length of each shortest path. Harmonic centrality gives better results when the graph has disjoined subgraphs, because some portions of the graph represented on the link chart cannot be reached from a given entity and have no path length that can be averaged.

This score is displayed in the Harmonic field in the Centrality table.
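
The sketch below uses the standard definition of harmonic centrality, the sum of the inverse shortest-path lengths, where an entity that cannot be reached contributes nothing. Whether the software applies any further scaling is not stated in this topic; the function name and sample data are hypothetical.

```python
from collections import deque

def harmonic_scores(edges):
    """Sum of 1 / (shortest-path length) to every reachable entity."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    scores = {}
    for source in neighbors:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        # Unreachable entities never enter dist, so they add zero to the sum.
        scores[source] = sum(1.0 / d for d in dist.values() if d > 0)
    return scores

# Hypothetical link chart with two disjoined subgraphs: A-B-C and X-Y.
edges = [("A", "B"), ("B", "C"), ("X", "Y")]
scores = harmonic_scores(edges)
print(scores["A"], scores["B"], scores["X"])
```

Every entity receives a finite score even though the two subgraphs cannot reach each other, because only reachable entities are included in the sum.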

Open the Centrality table

Centrality scores for the entities in a link chart are provided in a Centrality table. The table has one row for each entity. Fields in the table show an entity's display name, its entity type, and the score calculated for each measure of centrality.

In a field that displays a centrality score, each cell in the table has a dynamic blue histogram bar representing the entity's score relative to other entities in the link chart. The length of the bar is proportional to the entity's score relative to the maximum score in that field.

You can interact with the Centrality table in the same manner as you interact with attribute tables associated with the feature layers in a map. For example, you can click the heading of a field to sort the rows in the table using the values in that field. Sorting the Centrality table using a field containing centrality scores allows you to visualize the histogram of values for that centrality measure.

You can click one or more rows in the Centrality table to select them; the corresponding entities on the link chart are selected as well. Similarly, selecting entities on the link chart selects the corresponding rows in the table.

View centrality scores for the entities in a link chart by opening its Centrality table, and explore the table's contents following the steps below:

  1. On the Link Chart tab on the ribbon, in the Analyze group, click Centrality.

    The Centrality table opens. The name that appears on the tab for the Centrality table view identifies the link chart for which the centrality scores were calculated. Rows in the table are sorted using the value in the Degree field by default.

    A link chart's Centrality table describes which entities are most important for the graph depicted by the link chart.

  2. Click a row in the Centrality table to select it.

    The corresponding entity or entity group is selected on the link chart.


  3. Click the heading for another field in the Centrality table to sort rows using that field.

Change how centrality scores are calculated

You can adjust how centrality scores are calculated using options that appear in the toolbar at the top of the Centrality table view.

Include documents

By default, Document entities and the HasDocument relationships are not considered when centrality scores are calculated even if they are present on the link chart. You can include Document entities in the calculations if you prefer. For example, this can allow you to determine the influence of a particular source on the entities or on your investigation when the source Document is present on the link chart.

  1. Check or uncheck the Include documents option in the toolbar at the top of the Centrality table.
    • Checked—Include Document entities when calculating centrality scores. Centrality scores are calculated for each Document entity on the link chart, and any HasDocument relationships will influence the centrality scores of entities associated with a Document.
    • Unchecked—Exclude Document entities when calculating centrality scores. An entity's HasDocument relationships will not influence its centrality scores. This is the default setting.

The rows in the Centrality table are updated automatically to reflect the changes to this setting. Rows representing Document entities are added to or removed from the table and centrality scores are recalculated automatically.
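
Conceptually, this setting decides which entities and relationships are passed to the centrality calculations. The sketch below illustrates that filtering step only; it is not the software's implementation. The function name, the data layout, and the `Knows` relationship type are hypothetical.

```python
def filter_documents(entities, relationships, include_documents=False):
    """Return the entities and relationships that take part in the scores.

    entities: mapping of entity name to entity type
    relationships: (origin, relationship_type, destination) tuples
    """
    if include_documents:
        return dict(entities), list(relationships)
    kept_entities = {
        name: etype for name, etype in entities.items() if etype != "Document"
    }
    kept_relationships = [
        (origin, rtype, destination)
        for origin, rtype, destination in relationships
        if rtype != "HasDocument"
        and origin in kept_entities
        and destination in kept_entities
    ]
    return kept_entities, kept_relationships

# Hypothetical link chart content.
entities = {"Ana": "Person", "Ben": "Person", "Report 7": "Document"}
relationships = [
    ("Ana", "Knows", "Ben"),
    ("Ana", "HasDocument", "Report 7"),
    ("Ben", "HasDocument", "Report 7"),
]
default_entities, default_relationships = filter_documents(entities, relationships)
all_entities, all_relationships = filter_documents(entities, relationships, True)
print(len(default_relationships), len(all_relationships))
```

With the default setting, only the relationship between Ana and Ben would contribute to the scores in this sample; with documents included, both people also gain a connection through the shared Document entity.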

Consider the direction of relationships

Relationships have an inherent direction that is depicted on a link chart, from the origin entity to the destination entity. By default, a relationship's direction is not considered when centrality scores are calculated for the link chart. However, centrality scores can be calculated in a manner that considers the direction of relationships when the direction is relevant to your investigation.

  1. Click the Relationships drop-down list in the toolbar at the top of the Centrality table.
  2. Click an option to determine how a relationship's direction is considered when centrality scores are calculated.
    • Two-way—Ignore the direction of relationships when calculating centrality scores. This is the default setting.
    • One-way—Use a relationship's direction when calculating centrality scores.
    • One-way Reversed—Use the reverse of a relationship's direction when calculating centrality scores. Scores are calculated as if the relationship's destination is the origin and the origin is the destination.

The scores in the Centrality table are updated automatically to reflect the changes to this setting.
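
The effect of the three settings can be pictured as a change in which way each relationship may be followed. The sketch below illustrates this with a simple reachability check; it is not the software's implementation, and the function names, mode strings, and sample data are hypothetical.

```python
def apply_direction(relationships, mode="two-way"):
    """Return the (from, to) steps a traversal may follow under each setting."""
    forward = list(relationships)
    backward = [(destination, origin) for origin, destination in relationships]
    if mode == "two-way":
        return forward + backward
    if mode == "one-way":
        return forward
    if mode == "one-way-reversed":
        return backward
    raise ValueError(f"unknown mode: {mode}")

def reachable(relationships, start, mode):
    """Entities that can be reached from start under the given setting."""
    steps = {}
    for a, b in apply_direction(relationships, mode):
        steps.setdefault(a, set()).add(b)
    found, stack = set(), [start]
    while stack:
        for nxt in steps.get(stack.pop(), ()):
            if nxt not in found and nxt != start:
                found.add(nxt)
                stack.append(nxt)
    return found

# Hypothetical data: Ana -> Ben -> Cy.
relationships = [("Ana", "Ben"), ("Ben", "Cy")]
for mode in ("two-way", "one-way", "one-way-reversed"):
    print(mode, sorted(reachable(relationships, "Ben", mode)))
```

In this sample, Ben can reach both Ana and Cy when direction is ignored, only Cy when direction is used, and only Ana when direction is reversed. Path-based scores such as Betweenness, Closeness, and Harmonic change accordingly.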

Consider many relationships between two entities

The Cumulate weights option determines how relationships are considered when many relationships exist between two entities. Each relationship can be considered individually, or the set of relationships can be considered as if it were one relationship. This setting does not have any impact on centrality scores when only one relationship exists between two entities.

When the set of relationships between two entities is considered as one relationship, the result is the same as if only one relationship existed between the two entities. When the direction of a relationship is considered, the direction also determines how the set of relationships is handled: all relationships going in the same direction between two entities are considered as one relationship.

If several relationships are grouped on the link chart, the group has no bearing on how centrality scores are calculated. Each relationship in the group is considered or not based on the Cumulate weights setting.

  1. Check or uncheck the Cumulate weights option in the toolbar at the top of the Centrality table.
    • Checked—When many relationships exist between two entities, each relationship is considered separately and all relationships contribute to the centrality scores of both entities. This is the default setting.
    • Unchecked—When many relationships exist between two entities, all relationships between the entities are considered together as if there were only one relationship. The existence of many relationships does not have an impact on the centrality scores of either entity.

The scores in the Centrality table are updated automatically to reflect the changes to this setting.
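
The sketch below illustrates the behavior described in this section by collapsing parallel relationships before any score is calculated. It is not the software's implementation; the function name, parameter names, and sample data are hypothetical.

```python
def collapse_relationships(relationships, cumulate_weights=True, directed=False):
    """Return the relationships that contribute to the scores."""
    if cumulate_weights:
        # Every relationship is considered separately.
        return list(relationships)
    seen = set()
    kept = []
    for origin, destination in relationships:
        # With direction, only same-direction relationships collapse together.
        key = (origin, destination) if directed else frozenset((origin, destination))
        if key not in seen:
            seen.add(key)
            kept.append((origin, destination))
    return kept

# Hypothetical data: three relationships between Ana and Ben, one to Cy.
relationships = [("Ana", "Ben"), ("Ana", "Ben"), ("Ben", "Ana"), ("Ana", "Cy")]
separate = collapse_relationships(relationships)
merged = collapse_relationships(relationships, cumulate_weights=False)
merged_directed = collapse_relationships(
    relationships, cumulate_weights=False, directed=True
)
print(len(separate), len(merged), len(merged_directed))
```

In this sample, Ana participates in four relationships when each one is counted, two when direction is ignored and the set between Ana and Ben is treated as one relationship, and three when direction is considered.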

Normalize centrality scores

When centrality scores are calculated, they are calculated for the link chart as a whole and the raw scores are displayed in the table by default.

However, with the Normalize globally option you can normalize centrality scores for the whole link chart by representing them as a ratio of the maximum score. For example, if the maximum Degree score is six, the entity with the score of six will have the value 1.0 in the Centrality table's Degree field, and other entities will have scores less than one.

Sometimes when you build a link chart, it will have disjoined subgraphs—two unconnected networks of entities and relationships. A link chart like this will have lower overall centrality scores because entities in one subgraph can't reach entities in the other subgraph, and therefore can't be more central to the link chart as a whole. For this scenario, you can use the Normalize by component option to calculate centrality scores that are normalized within each subgraph. This setting has the same result as the Normalize globally setting if there is only one graph represented on the link chart.

  1. Click the Scores drop-down list in the toolbar at the top of the Centrality table.
  2. Click an option to determine how an entity's centrality score is normalized.
    • Raw—Centrality scores are not normalized. This is the default setting.
    • Normalize globally—Centrality scores are normalized relative to the scores calculated for all entities and relationships in the link chart.
    • Normalize by component—Centrality scores are normalized relative to the entities and relationships that participate within each disjoined subgraph.

The scores in the Centrality table are updated automatically to reflect the changes to this setting.
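
The three settings can be compared with a short sketch that normalizes Degree scores as a ratio of the maximum score. It is not the software's implementation. Dividing by the maximum within each subgraph for the by-component setting is an assumption modeled on the global example above, and the function names, mode strings, and sample data are hypothetical.

```python
from collections import deque

def components(neighbors):
    """Split the graph into its disjoined subgraphs."""
    seen, result = set(), []
    for start in neighbors:
        if start in seen:
            continue
        group, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbors[v]:
                if w not in group:
                    group.add(w)
                    queue.append(w)
        seen |= group
        result.append(group)
    return result

def normalize(scores, neighbors, mode="raw"):
    """Rescale scores as a ratio of the maximum score in each group."""
    if mode == "raw":
        return dict(scores)
    if mode == "global":
        groups = [set(scores)]
    elif mode == "component":
        groups = components(neighbors)
    else:
        raise ValueError(f"unknown mode: {mode}")
    normalized = {}
    for group in groups:
        top = max(scores[e] for e in group)
        for e in group:
            normalized[e] = scores[e] / top if top else 0.0
    return normalized

# Hypothetical link chart: a star around H, and a separate chain X-Y-Z.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("X", "Y"), ("Y", "Z")]
neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)
degree = {e: len(n) for e, n in neighbors.items()}

by_chart = normalize(degree, neighbors, "global")
by_component = normalize(degree, neighbors, "component")
print(round(by_chart["Y"], 3), by_component["Y"])
```

In this sample, Y is the most connected entity in its own subgraph, so it scores 1.0 when normalized by component but only about 0.667 when normalized against the whole link chart.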

View scores for specific entity types

By default, the Centrality table includes a row for every entity present on the link chart. For large link charts, the table can show too much information to process. You can filter the rows displayed in the table to show only scores for specific entity types.


  1. Click the Types button in the toolbar at the top of the Centrality table.

    A drop-down list appears that includes all entity types in the knowledge graph even if entities of that type are not present on the link chart. All entity types are checked by default.

  2. In the drop-down list, check entity types you want to see in the Centrality table. Uncheck entity types you do not want to see in the table. Start typing the name of an entity type if you can't see it in the list; the list of entity types is filtered automatically, and you can check or uncheck entity types in the filtered list.

    The total number of selected entity types appears on the toolbar next to the Types button.

The rows in the Centrality table are updated automatically. Rows representing entities of the unchecked entity types are removed. Rows representing entities of the checked entity types are added.

Update scores in the Centrality table

You can add content to a link chart or remove content from a link chart at any time. You can add a disjoint subgraph to a new link chart and remove it from the existing one. You can also edit data in the knowledge graph associated with the link chart by creating a relationship or merging two entities. You can also group entities on the link chart.

If you opened the Centrality table for the link chart before you modified its content, the table is not updated automatically. Rows are not added for entities added since the table was opened, and rows for entities that were removed from the link chart or grouped together are not removed. A warning message at the bottom of the Centrality table indicates that its scores are out of date.

When you have finished making changes to the link chart, follow the steps below to update the scores in the Centrality table to reflect the graph currently depicted by the link chart:

  1. Open the Centrality table.
  2. Make the link chart associated with the Centrality table the active view.
  3. Add or remove some content from the link chart.

    The Scores are out of date warning appears at the bottom of the Centrality table.

  4. Make the Centrality table the active view.
  5. Click the Update Centrality Scores button at the bottom of the table next to the message.

The scores in the Centrality table are recalculated.
