Load data into a knowledge graph

You can create entities and relationships in a knowledge graph to represent existing tabular data. Data in a table's fields can be converted into properties of entities and relationships. A feature class's geometry can be loaded into an entity's spatial feature.

Define how tabular data will be converted to knowledge graph items using the Load Table wizard. As you define the conversion process, you can save your work. The instructions for converting data are saved in the investigation in the current project as a data loading configuration. Existing data loading configurations can be modified to accommodate similar tables and feature classes.

When you run the conversion process, data is loaded into the current investigation's knowledge graph. A new link chart can be created to view and evaluate the results of the conversion process.

A data loading configuration specifies how to read and handle the data in a table. The load table process creates entities and relationships in a knowledge graph to represent the data. It can also create provenance records to specify the source material from which the data was derived.

Open the Load Table wizard

Open the Load Table wizard to import tabular data to an investigation's knowledge graph.

  1. Open an investigation in ArcGIS Pro.
  2. On the Investigation tab on the ribbon, in the Load Data group, click Load Table.

The Load Table wizard appears in a new view alongside its investigation.

Identify the table to load

Specify the table or feature class that contains the data to be converted on the Welcome page in the wizard. If you have an existing data loading configuration that can serve as the basis for this operation, you can import its information to the wizard.

Data loading configurations can only be imported to the wizard from the current investigation. To use a data loading configuration from another investigation, including one in a different project, copy it to the current investigation first and then import its information.

  1. Click the Browse button next to the Source Table text box.
  2. On the dialog box that appears, browse to and click the table or feature class that contains the data to be loaded into the investigation's knowledge graph, and click OK.
  3. If the investigation has a data loading configuration with relevant information for importing the current table's data, click the Data Loading Configuration drop-down arrow and click the appropriate data loading configuration.
  4. Click Next to advance to the Entities page in the wizard.
Tip:

The source table or feature class and an initial data loading configuration can also be specified using the buttons on the Load Table tab on the ribbon. You can change these settings at any point, but you will lose any settings previously defined in the wizard.

Define entities to create in the knowledge graph

Identify fields in the source table or feature class where the data represents entities in the knowledge graph. The source table's fields appear in the Column Names list. A separate Entities table allows you to define how the data in each field is associated with an entity type in the knowledge graph.

For example, if the source table describes a company's employees, you can import that information to the knowledge graph and create one Employee entity for each row in the table. The field you choose to represent the entity should be a unique identifier, such as an employee ID number, for an Employee entity type. Other properties such as an employee's full name are not guaranteed to be unique.

The column names list is divided into used and unused fields. When you start with a new data loading configuration, all fields are in the unused list. If an ID field in the table contains employee ID numbers and you use it to define entities in the graph, the field moves to the used list. If you start with an existing data loading configuration, fields in the table are likely in use on various pages of the wizard—fields in use on any page appear in the used part of the column names list.
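
The used and unused split can be pictured as a simple partition of the source table's field names. The following Python sketch is illustrative only: it does not call any ArcGIS API, and the function name and field names are hypothetical.

```python
def split_columns(all_fields, fields_in_use):
    """Partition field names into (used, unused), preserving table order."""
    in_use = set(fields_in_use)
    used = [f for f in all_fields if f in in_use]
    unused = [f for f in all_fields if f not in in_use]
    return used, unused


# Hypothetical employee table: only the ID field defines an entity so far.
fields = ["ID", "FullName", "ManagerID", "Office"]
used, unused = split_columns(fields, ["ID"])
print(used)    # ['ID']
print(unused)  # ['FullName', 'ManagerID', 'Office']
```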

Several tables may contain employee information for different offices. After importing one table, when preparing to import the second table, you must consider employees that appear in multiple tables such as a regional sales manager. If you import both tables without specifying how to merge entities, you will end up with two entities in the knowledge graph for the same regional sales manager. You can avoid creating duplicates if the table has data that uniquely identifies an entity, and if existing entities in the knowledge graph have a property containing the same information.

If you choose to merge entities, the data loading process will compare the employee ID number stored in the table's field to the Employee entity type's ID property for each existing Employee instance in the knowledge graph. If an entity of the same type with the same identifier exists in the knowledge graph, data from the table is stored in or associated with the existing knowledge graph entity instead of creating a new instance. That is, entities in the table and entities in the knowledge graph are merged.

In general, it is better to merge entities during the data loading process, if possible. Even if many entities can be merged, errors or gaps in the table or the knowledge graph can prevent some entities from being merged. In this situation, you can merge entities manually after the load table process is complete.
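
As a rough illustration of the merge behavior described above, the following Python sketch loads two employee tables into an in-memory list that stands in for the knowledge graph. It is a simplified model under stated assumptions, not the wizard's implementation: the graph structure, the load_entities function, and the sample data are all hypothetical.

```python
def load_entities(graph, rows, entity_type, key_property, merge=True):
    """Append or merge rows into `graph`, a list of entity dicts.

    Returns a (created, merged) tuple of counts.
    """
    created = merged = 0
    for row in rows:
        match = None
        if merge:
            # Look for an existing entity of the same type with the same key.
            match = next(
                (e for e in graph
                 if e["type"] == entity_type
                 and e["properties"].get(key_property) == row[key_property]),
                None,
            )
        if match is not None:
            match["properties"].update(row)  # store table data on the existing entity
            merged += 1
        else:
            graph.append({"type": entity_type, "properties": dict(row)})
            created += 1
    return created, merged


# Hypothetical tables for two offices; employee 7 appears in both.
office_a = [{"ID": 1, "name": "Ana"}, {"ID": 7, "name": "Raj"}]
office_b = [{"ID": 7, "name": "Raj", "region": "West"}, {"ID": 9, "name": "Li"}]

graph = []
load_entities(graph, office_a, "Employee", "ID")
result = load_entities(graph, office_b, "Employee", "ID", merge=True)
print(result, len(graph))  # (1, 1) 3
```

Calling the same function with merge=False for the second table would leave four entities in the list, two of them describing the same regional sales manager.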

Learn more about merging entities

  1. Provide a key that identifies the individual entities that will be created or updated in the knowledge graph.
    • Double-click a field in the Column Names list where the data uniquely identifies the entities. The field name is added to the Entities table in the Entity Key column.
    • Click the empty row at the bottom of the Entities table. Click in the Entity Key column and type an identifier.
  2. Click in the Entity Type column in the Entities table and define the types for which entity instances will be created or updated.
    • For an entity type that has already been defined in the knowledge graph, click the entity type in the drop-down list. Start typing in the text box to autocomplete an existing entity type's name.
    • To define a new entity type in the knowledge graph, type a name for the new type. New types are identified by an asterisk throughout the wizard. Use existing entity types whenever possible.
    • If a field in the source table specifies the type of entity to create, click the Column option at the bottom of the drop-down list. The list changes to show all fields in the source table. Click the field whose data specifies the type of entity to create in the knowledge graph.
    • To specify the entity type using an ArcGIS Arcade expression, click the Expression Builder option at the bottom of the drop-down list. Use the Expression Builder dialog box to create an Arcade expression that indicates the type of entity to create and click OK.

    If an entity is defined using a field in the table, the field moves from the unused to the used section of the Column Names list.

  3. Optionally, check in the Merge column to merge entities if possible.

    When merging is enabled, input data is compared with existing entities in the knowledge graph. If the input data doesn't correspond to an existing entity, a new instance of the entity type is created in the knowledge graph. Additional rules must be specified on the Properties page in the wizard to define which entity type properties will be used to merge entities.

  4. Repeat the steps above to define additional entities.

    To define a relationship in the knowledge graph, both the origin entity type and the destination entity type of the relationship must be specified in the Entities table.

  5. Click Next to advance to the Relationships page in the wizard.
Tip:

At any time, you can save the rules you have defined as a data loading configuration in the investigation. On the Load Table tab on the ribbon, in the Configuration group, click the Save drop-down arrow and click Save or Save As. Choose an existing data loading configuration or type a name to create a new configuration. Because the investigation and the data loading configuration are saved in the project, your changes aren't fully saved until you save the project as well.

Define relationships to create in the knowledge graph

You can establish relationships between entities using the data in the source table or feature class. To create relationships as part of the data loading process, both the origin entity and the destination entity of the relationship must be defined on the Entities page in the wizard. Relationships are described in one direction, from an origin entity to a destination entity.

For example, in a table describing an organization's employees, a field can define the employee's manager. You can define a WorksFor relationship type to capture the association between the employee and their manager. In a WorksFor relationship, both the origin and destination entities can have the Employee entity type. As each row of the table is processed, the current employee is the origin entity of the relationship and the employee who is their manager is the destination entity of the relationship.
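
The WorksFor example can be sketched in Python as follows. This is only a conceptual model of how each row yields one directed relationship; the field names ID and ManagerID are hypothetical, and skipping rows with no manager is an assumption made for the sketch.

```python
# Hypothetical employee rows; ManagerID refers to another employee's ID.
rows = [
    {"ID": 1, "ManagerID": 7},
    {"ID": 7, "ManagerID": None},  # no manager recorded for this employee
    {"ID": 9, "ManagerID": 7},
]


def works_for_relationships(rows):
    """Yield (origin ID, relationship type, destination ID) for each row with a manager."""
    for row in rows:
        if row["ManagerID"] is not None:
            # The current employee is the origin; the manager is the destination.
            yield (row["ID"], "WorksFor", row["ManagerID"])


relationships = list(works_for_relationships(rows))
print(relationships)  # [(1, 'WorksFor', 7), (9, 'WorksFor', 7)]
```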

As with entities, a relationship defined in the table can be merged with existing relationships in the knowledge graph. When merging relationships, the data loading process will compare the table's entities to existing entities in the knowledge graph and determine whether the origin and destination entities can be merged. If both entities can be merged, the knowledge graph will be examined to see whether a relationship of the same type already exists between the entities; if so, the relationship will be merged as well. If only one of the entities can be merged, a new relationship is created. For example, if the origin entity can be merged but the destination entity can't, a new relationship is created from the existing origin entity to a new destination entity.

In general, it is better to merge relationships during the data loading process, if possible. As with entities, errors or gaps in the table or the knowledge graph can prevent relationships from being merged. You can merge relationships manually after the load table process is complete.
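
The decision described above can be modeled with a short Python sketch. It is a simplification under stated assumptions, not the wizard's actual logic: entities are reduced to a set of IDs, relationships to a set of tuples, and the load_relationship function is hypothetical.

```python
def load_relationship(entities, relationships, origin_id, rel_type, dest_id):
    """Merge or create one relationship; return 'merged' or 'created'.

    entities: set of entity IDs already in the graph.
    relationships: set of (origin, type, destination) tuples in the graph.
    """
    both_exist = origin_id in entities and dest_id in entities
    # Entities that can't be merged with existing ones are created as new entities.
    entities.update({origin_id, dest_id})
    rel = (origin_id, rel_type, dest_id)
    if both_exist and rel in relationships:
        return "merged"  # same type already exists between the same entities
    relationships.add(rel)
    return "created"


entities = {1, 7}
relationships = {(1, "WorksFor", 7)}
print(load_relationship(entities, relationships, 1, "WorksFor", 7))  # merged
print(load_relationship(entities, relationships, 9, "WorksFor", 7))  # created
```

In the second call, the destination entity already exists but the origin does not, so a new origin entity and a new relationship are both added.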

Learn about merging relationships

  1. Click the empty row at the bottom of the table to define a new relationship.
  2. Click in the Origin Entity column, and click the entity type defined on the Entities page that is the origin of the relationship.
  3. Click in the Relationship Type column and define the relationship type for which new instances in the knowledge graph will be created.
    • For a relationship type that has already been defined in the knowledge graph, click the relationship type in the drop-down list. Start typing in the text box to autocomplete an existing relationship type's name.
    • To define a new relationship type in the knowledge graph, type a name for the new type. New types are identified by an asterisk throughout the wizard. Use existing relationship types whenever possible.
    • If a field in the source table specifies the type of relationship to create, click the Column option at the bottom of the drop-down list. The list changes to show all fields in the table. Click the field whose data specifies the type of relationship to create in the knowledge graph.
    • To specify the relationship type using an Arcade expression, click the Expression Builder option at the bottom of the drop-down list. Use the Expression Builder dialog box to create an Arcade expression that indicates the type of relationship to create and click OK.
  4. Click in the Destination Entity column, and click the entity type defined on the Entities page that is the destination of the relationship.
  5. Optionally, check in the Merge column to merge relationships if possible.

    When merging is enabled, input data is compared with existing entities and relationships in the knowledge graph. If the input data doesn't correspond to an existing relationship, a new instance of the relationship is created in the knowledge graph. Additional rules must be specified on the Properties page in the wizard to define which relationship type properties will be used to merge relationships.

  6. Click Next to advance to the Properties page in the wizard.

Define properties to create in the knowledge graph

Data from the source table can be stored in properties of entities and relationships in the knowledge graph. To accomplish this, several lists of information are presented for you to work with. There is a list of all fields in the source table or feature class, a list of entities defined on the Entities page in the wizard, and a list of all relationships defined on the Relationships page in the wizard.

  • The Column Names list is divided in two: Unused and Used fields. Fields that identify entities as defined on the Entities page are in use. Remaining fields in the table appear in the unused list.
  • In the Entities list, the entity type is in parentheses, followed by the field name. For example, when an ID field in the table stores an employee number that uniquely identifies the Employee entity, the list shows: (Employee) ID.
  • In the Relationships list, each entry shows the relationship's origin entity, relationship type, and destination entity. For example, when the origin and destination are the Employee entity and WorksFor is the relationship type, the list shows: (Employee) ID WorksFor (Employee) ID.

The above three lists are used to define the rules for converting the source table's data to properties of entities and relationships. When you select an entity or relationship, the Properties table shows the rules already defined.

For each entity type and relationship type, add rules that define how data in the source table or feature class will be loaded into properties of an entity type or relationship type. You can also provide text that will be stored in a property for every instance of an entity or relationship that is created or updated.

To merge entities and relationships defined in the table with existing entities and relationships in the knowledge graph, check all property rules that will be used to uniquely identify an instance. For example, if an ID field in the source table uniquely identifies Employee entities, check Use For Merge to create new Employee entities or update existing Employee entities as appropriate. If a Person entity is uniquely identified by a combination of the person's name and birth date, define rules to associate the appropriate fields in the source table to fullName and birthDate properties of the Person entity and check Use For Merge for both rules.
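
To make the role of Use For Merge concrete, the following Python sketch matches a table row against existing items using every checked property. It is a conceptual illustration only; the find_merge_target function and the sample data are hypothetical, and treating a null value as a non-match is an assumption based on the error behavior described later in this topic.

```python
def find_merge_target(existing, candidate, merge_properties):
    """Return the first existing item whose merge properties all match, else None."""
    if not merge_properties:
        return None  # nothing checked for merge, so nothing can be matched
    for item in existing:
        if all(
            candidate.get(p) is not None and item.get(p) == candidate.get(p)
            for p in merge_properties
        ):
            return item
    return None


# Two existing Person entities share a name but differ in birth date.
people = [
    {"fullName": "Ana Diaz", "birthDate": "1990-04-02"},
    {"fullName": "Ana Diaz", "birthDate": "1985-11-30"},
]
row = {"fullName": "Ana Diaz", "birthDate": "1985-11-30", "city": "Lima"}

target = find_merge_target(people, row, ["fullName", "birthDate"])
print(target is people[1])  # True
```

Matching on fullName alone would select the first Ana Diaz, which is why both rules are checked when the combination is what identifies a person.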

  1. The first entity in the Entities list is selected by default, and the rule that defines this entity appears in the Properties table. Continue with this entity or select another graph item.
  2. Click the empty row at the bottom of the Properties table.
  3. Click in the Property Name column and define a property for the selected graph item.
    • For a property that has already been defined in the knowledge graph, click the property in the drop-down list. Start typing in the text box to autocomplete an existing property name.
    • To define a new property in the knowledge graph, click in the text box and type a name for the new property. New properties are identified by an asterisk throughout the wizard. Use existing properties whenever possible.
  4. If you are defining a new property, click in the Data Type column and click the appropriate data type.

    For existing properties, the data type appears automatically in the Data Type column and you cannot change the type.

  5. Click in the Property Value column and define a rule for storing data in the graph item's property.
    • Click the field in the table whose data will be stored in the property of the entity or relationship. The field will move from Unused to Used in the Column Names list.
    • Type a value in the text box that will be saved in this property for all graph items defined in the table. The value will be validated against the property's data type; if text is provided for an integer property, the value won't be stored.
    • To specify the value using an Arcade expression, click the Expression Builder option at the bottom of the drop-down list. Use the Expression Builder dialog box to create an Arcade expression that returns the value that will be stored in the property and click OK.
  6. Check in the Use For Merge column to include the property when comparing data in the table with existing items in the knowledge graph to determine whether items can be merged.

    Field values in each row of the source table are compared to the associated properties of an entity or a relationship in the knowledge graph. If all field values match all item properties, the item described by the table can be merged with the existing entity or relationship in the knowledge graph. The existing graph item is updated with additional information specified in the source table.

    If any field value does not match the corresponding item property, the item described by the table cannot be merged with an existing graph item in the knowledge graph. A new entity or relationship is created in the knowledge graph using data from the source table.

  7. Define rules to load data into additional properties of the selected entity or relationship.
  8. Select another graph item in the Entities or Relationships list and define additional rules to convert data in the source table to graph item properties in the knowledge graph.
  9. Click Next to advance to the Spatial page in the wizard.
Note:

You can drag a field from the Column Names list to the Properties table. The field appears automatically in the Property Value column. If the field name and property name match, the property name appears automatically in the Property Name column. If the field name and property name don't match, the name of the field will appear in the Property Name column as the name of a new property that will be created for the graph item.
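
The validation of a typed value against a property's data type, mentioned in the Property Value step above, can be pictured with the following Python sketch. The data type names and the coerce_value function are hypothetical and only approximate the idea that a value is not stored when it cannot be converted.

```python
def coerce_value(text, data_type):
    """Return `text` converted to `data_type`, or None if it can't be stored."""
    # Hypothetical data type names mapped to Python converters.
    converters = {"integer": int, "double": float, "string": str}
    try:
        return converters[data_type](text)
    except (KeyError, ValueError):
        return None


print(coerce_value("42", "integer"))     # 42
print(coerce_value("forty", "integer"))  # None
```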

Define spatial features for entities

Identify fields in the source table or feature class that contain spatial data representing entities in the knowledge graph. The entities defined on the Entities page in the wizard are listed in the Entity column of the table.

Only the WGS84 coordinate system is supported for knowledge graphs. All features and coordinates will be assumed to use this coordinate system.

When the source data is a feature class, identify the geometry field in the attribute table. When the source data is a table, identify one or more fields containing coordinates that define the entity's geometry. Coordinates are evaluated in the same manner as LocateXT. For example, when coordinates are provided in a UTM format, the coordinates are converted to a point geometry and associated with an entity in the knowledge graph.

When coordinates are provided in decimal degrees format, they are often specified by providing the latitude value first followed by the longitude value. However, if coordinates are provided in the more mathematical x,y format, you must check the option to process the coordinates as longitude, latitude instead to produce a valid geometry.
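
The effect of the coordinate order option can be illustrated with a small Python sketch. It does not reproduce how the wizard or LocateXT parses coordinates; the to_point function and its interpret_as_lon_lat parameter are hypothetical names used to show why the order matters.

```python
def to_point(first, second, interpret_as_lon_lat=False):
    """Return an (x, y) = (longitude, latitude) pair from two coordinate values."""
    lon, lat = (first, second) if interpret_as_lon_lat else (second, first)
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError(f"not a valid WGS84 coordinate: lon={lon}, lat={lat}")
    return (lon, lat)


# A sample location given as latitude, longitude (the default order).
print(to_point(34.0556, -117.1825))  # (-117.1825, 34.0556)
# The same location given as x,y needs the longitude, latitude interpretation.
print(to_point(-117.1825, 34.0556, interpret_as_lon_lat=True))  # (-117.1825, 34.0556)
```

Passing the x,y pair without the option would treat -117.1825 as a latitude, which is out of range, so the sketch raises an error instead of producing a geometry.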

  1. Check in the Create Spatial column for each entity if data in the table can be used to define a spatial feature.

    For existing entities, the data type appears in the Geometry Type column.

  2. For new checked entities, with an asterisk next to the entity name, click the drop-down arrow and click the appropriate geometry type.

    Spatial features are stored in the property of the entity indicated in the Spatial Property Name column. The property name cannot be changed.

  3. For each checked entity, define how its spatial feature will be created. Click the entity in the table. Below the table, the Input Spatial Format drop-down list appears. Click the drop-down list and click the appropriate option for the spatial data stored in the source table or feature class.
    • Geometry—The shape stored in the geometry field of the source feature class will be saved to the entity in the knowledge graph. Use the Input Geometry Field drop-down list that appears to identify the geometry field. Only fields storing shapes defined in the ArcGIS geometry format can be stored in the knowledge graph.
    • Coordinates—The entity's spatial feature will be constructed from data in the table. Specify fields containing the feature's coordinates. The fields in the source table appear in the Available Fields list. For each field that stores spatial coordinates, click the field and click Add; the field is added to the Selected Fields list. If appropriate for the data stored in the table, check the Interpret as longitude, latitude option.
  4. Click Next to advance to the next page.

    If provenance has been enabled for the knowledge graph, the wizard will advance to the Provenance page. Otherwise, it will advance to the Review and Run page.

Define provenance records for property values

You can create provenance records to establish where data in the knowledge graph came from, if this capability has been enabled for your knowledge graph.

All properties of entities and relationships defined on the Entities page and the Relationships page in the wizard are listed at the top of the Provenance page. You can create provenance records to define the source of values stored in these properties.

Provenance records can associate property values with source information stored in a field in the table you are using. You can also type a URL, file path, or text describing the source material for a set of property values. For example, if all values for a given property came from the same website, you can create a provenance record for each value that references the website's URL. If you are using information in the table to create new Document entities, you can also create provenance records that reference those documents.

If the knowledge graph already has provenance records, you can use them as templates for new provenance records. For example, other data that was previously loaded into the knowledge graph may have been derived from the same website. By using the existing provenance record as a template, you can avoid spelling errors and ensure consistency between provenance records created at different times.
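
Using an existing provenance record as a template can be sketched in Python as follows. The field names mirror the sourceType, source, sourceName, and comment columns described in the steps below, but the ProvenanceRecord class, the from_template function, and the sample values are hypothetical and do not represent how records are stored in the knowledge graph.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProvenanceRecord:
    source_type: str        # "Document", "URL", or "String"
    source: str             # document reference, URL or file path, or descriptive text
    source_name: str = ""   # optional title for the source
    comment: str = ""       # optional additional information


def from_template(template, **overrides):
    """Copy an existing record, changing only the properties passed as overrides."""
    return replace(template, **overrides)


existing = ProvenanceRecord("URL", "https://example.com/staff", "Staff directory")
new_record = from_template(existing, comment="Loaded from the second office table")
print(new_record.source == existing.source)  # True
```

Because the source text is copied instead of retyped, the new record cannot drift from the existing one through a spelling error.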

Select one or more properties in the Entity/Relationship Properties table. In the Provenance table below, define how provenance records will be created for each value in each selected property.

  1. Select a property in the Entity/Relationship Properties table.

    The entity type or relationship type is shown in the Entity/Relationship column and the property is shown in the Property column.

    • Click a row to select the property it describes.
    • Press Shift while clicking another row in the list to select many properties described by adjacent rows in the table.
    • Press Ctrl while clicking additional rows in the list to select several specific properties that are not adjacent to each other in the table.
  2. Click the empty row in the Provenance table at the bottom of the page to define a new provenance record.
  3. Click in the sourceType column and define the type of source information in the control that appears.

    Three source types are supported. The provenance record's source can be a document in the knowledge graph, a website or file identified by a URL or file path, or text that defines the source material or how to access it.

    • The Values option is selected by default at the bottom of the control. Click Document, URL, or String. Each provenance record will have the same type in its source type property.
    • Click the Column option at the bottom of the control if the source type is defined in a field in the table. The control changes to list all fields in the table. Click the field whose data specifies the provenance record's source type. Each provenance record will have the source type in the selected field for the appropriate row in the table.
    • To specify the source type using an Arcade expression, click the Expression Builder option at the bottom of the drop-down list. Use the Expression Builder dialog box to create an Arcade expression that indicates the provenance record's source type and click OK.
  4. Click in the source column and define the provenance record's source in the control that appears.
    • The Values option is selected by default at the bottom of the control. When the source is a URL or text that is used in other provenance records in the knowledge graph, type a portion of the value. Existing provenance records in the knowledge graph are searched and all matching sources are listed. Hover over a value in the search results list to examine the properties of an existing provenance record. Click the appropriate value in the list. The existing provenance record is used as a template—the source and all other columns in the Provenance table are automatically populated for this provenance record. Each provenance record will have the same text in its source property.
    • When the source is a URL or text that has not been used in other provenance records, type the new value. The existing provenance records are searched, but no matching values are found. Click the New Provenance button at the bottom of the control. Each provenance record will have the same text in its source.
    • Click the Column option at the bottom of the control if the source is defined in a field in the table. The control changes to list all fields in the table. Click the field whose data specifies the provenance record's source. Each provenance record will have the source in the selected field for the appropriate row in the table.
    • When new Document entities are defined on the Entities page and the source is one of those documents, click the Entities option at the bottom of the control. The control changes to list all new Document entities defined in the Entity/Relationship Properties table. Click the document entity that is the provenance record's source. Each provenance record will reference the appropriate Document entity as its source.
    • To specify the source using an Arcade expression, click the Expression Builder option at the bottom of the drop-down list. Use the Expression Builder dialog box to create an Arcade expression that indicates the source value for the provenance record and click OK.
  5. Optionally, click in the sourceName column and define a title for the source in the control that appears.
    • The Values option is selected by default at the bottom of the control. If a template value has been provided and you want to use it, skip this step. Otherwise, type a new title. Each provenance record will have the same text in its source name property.
    • Click the Column option at the bottom of the control if the title is defined in a field in the table. The control changes to list all fields in the table. Click the field whose data specifies the title. Each provenance record will have the source name in the selected field for the appropriate row in the table.
    • To specify the title using an Arcade expression, click the Expression Builder option at the bottom of the drop-down list. Use the Expression Builder dialog box to create an Arcade expression that returns a title that will be stored in the source name property of the provenance record and click OK.
  6. Optionally, click in the comment column and define additional information about the source material in the control that appears.

    Follow the same procedure used with the sourceName field.

  7. Optionally, click Modify Schema to add custom properties for provenance records in the knowledge graph.
    1. In the fields view that appears, click in the empty row at the bottom of the table.
    2. Provide a name for the new property and define its data type.
    3. On the Fields tab on the ribbon, in the Manage Edits group, click Save.
    4. Close the fields view.

    The new provenance record properties appear in the Provenance table.

  8. For any additional provenance record properties that appear in the Provenance table, provide the appropriate values.

    Follow the same procedure used with the sourceName field.

  9. Click Next to advance to the Review and Run page in the wizard.

Review the configuration and load the data

The Review and Run page in the wizard displays a summary of entities, relationships, and properties that will be created in the knowledge graph. The Run button remains unavailable until all problems identified in the wizard are corrected.

  1. Review all the rules for loading data from the source table or feature class to entities, relationships, properties, and spatial features in a knowledge graph.
  2. To save the final set of rules for future use, check Save configuration. Click the drop-down arrow for the list that appears, and click the name of an existing data loading configuration to overwrite it. Alternatively, type a name to store the rules in a new data loading configuration.
  3. Check Display the result in a new link chart upon completion to add the entities and relationships created by the data conversion process to a new link chart for your review.
  4. Click Run.

    If any errors are found during the data loading process, a warning appears at the bottom of the page. Click the View Details link in the warning to show the list of errors in a dialog box. For example, a null value in a table or property or a data type mismatch prevents two entities from being merged and results in an error.

    To keep the list and review it later as you evaluate the results of the conversion process, click the Copy button. Open a text editor, paste the copied messages, and save them to a text file.

  5. Close the Load Table view.

After the data loading operation is complete, update any open investigations, maps, and link charts to see the current entities and relationships and their properties in the knowledge graph. Review and edit the graph items in the knowledge graph to merge entities and relationships that should not be separate, delete entities and relationships created in error, or modify properties as needed.

If you chose to save the data loading configuration to your investigation, you can export it to a data loading configuration file to use it in another project and archive it with the source table or feature class. Save the project to save the data loading configuration as part of your current investigation.
