Describe Dataset (GeoAnalytics)

Summary

Summarizes features into calculated field statistics, sample features, and extent boundaries.

Illustration

Describe Dataset workflow diagram

Usage

  • This geoprocessing tool is available with ArcGIS Enterprise 10.7 or later.

  • The following are examples of what you can do with the Describe Dataset tool:

    • Verify that you have correctly registered time and geometry with your big data file share.
    • Understand attribute values with summarized field statistics.
    • Visualize your big data with a sample layer. Instead of drawing a million features, draw a sample.
    • Run workflows using a sample of the data before scaling out for longer and larger processing.
    • Determine where a dataset is by calculating the geographical extent.

  • By default, the tool outputs a table containing the summary statistics for each of the fields in the input layer. In addition, a table is printed to the geoprocessing window describing any geometry or time properties of the input layer.

    If the input layer has geometry, the tool prints a table describing the following geometry properties of the input layer:

    • Geometry type—The geometry type of the input layer. This value is point, line, or polygon.
    • Spatial reference—The spatial reference of the input layer.
    • Count of non-empty features—The number of features that have a valid geometry within the extent of the spatial reference of the input layer.
    • Count of empty features—The number of features that do not have a valid geometry. These features may have an empty geometry, or the geometry may be outside of the extent of the spatial reference being used.
    • Spatial extent—The spatial extent of the features in the input layer.

    If the input layer has time enabled, the tool prints a table describing the following time properties of the input layer:

    • Time type—The time type of the input layer. This value is instant or interval.
    • Count of non-empty features—The number of features that have a valid time value.
    • Count of empty features—The number of features that have a null or invalid time value.
    • Temporal extent—The temporal extent of the features in the input layer. This value contains a start time and end time.

  • Use the Number of Sample Features parameter to specify the number of features to sample. If you leave it blank or specify 0, no sample will be created. The output subset will have the same schema, geometry, and time settings as the input features. The subset can be used to understand how your datasets appear when added to a map or visualized in an attribute table. Additionally, you can run analysis on the subset to determine the best inputs for a larger analysis.

  • If you specify a sample size greater than the total number of input features, all features will be returned.

  • The sample layer does not represent a truly random geographic selection and should not be used to understand the geographic extent or distribution of your data. For example, if you specify 230 features for Number of Sample Features, the result can contain 230 input features in any order or location.

  • Create a boundary feature that describes the extent of your input dataset using the Create Extent Layer parameter. The output will include a single polygon feature representing the geographic extent of the input features. The extent layer can be used to determine where your data is located, or as an input elsewhere in your workflow. For example, use it as the clip polygon layer in the GeoAnalytics Clip Layer tool.

  • Optionally use environment settings to specify how features will be output.

    For example, use the Extent environment to output an extent layer representing the area of interest, or sample features from the defined study area.

    Additionally, use the Output Coordinate System environment to project outputs to the desired spatial reference.

  • The Create Extent Layer parameter is only supported for point, line, and polygon features. An extent layer will not be created for tabular features.

  • This geoprocessing tool is powered by ArcGIS GeoAnalytics Server. Analysis is completed on your GeoAnalytics Server, and results are stored in your content in ArcGIS Enterprise.

  • For optimal performance, make data available to the GeoAnalytics Server through feature layers hosted on your ArcGIS Enterprise portal or through big data file shares. Data that is not local to your GeoAnalytics Server will be moved to your GeoAnalytics Server before analysis begins. As a result, the tool will take longer to run, and in some cases, moving the data from ArcGIS Pro to your GeoAnalytics Server may fail. The threshold for failure depends on your network speeds, as well as the size and complexity of the data. Therefore, it is recommended that you always share your data or create a big data file share.

    Learn more about sharing data to your portal

    Learn more about creating a big data file share through Server Manager
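
The sampling rules in the usage notes above (no sample for a blank or 0 value, and all features returned when the request exceeds the input count) can be sketched as a small helper. The function name expected_sample_count is hypothetical and is not part of arcpy; it only models the documented behavior so you can sanity-check the size of a sample layer after the tool runs. Rejecting negative values is an assumption, since the usage notes do not describe that case.

```python
from typing import Optional


def expected_sample_count(requested: Optional[int], total_features: int) -> int:
    """Model how many features the sample layer should contain.

    Hypothetical helper (not part of arcpy) based on the usage notes:
      * blank (None) or 0 -> no sample layer, so 0 features
      * request larger than the input -> all input features
    """
    if total_features < 0:
        raise ValueError("total_features cannot be negative")
    if requested is None or requested == 0:
        return 0
    if requested < 0:
        # Assumption: the documentation does not cover negative values.
        raise ValueError("requested sample size cannot be negative")
    return min(requested, total_features)
```

For example, requesting 2500 sample features from a layer that holds only 1200 features should yield a sample layer of 1200 features. The helper says nothing about which features are chosen, because the sample is not a truly random geographic selection.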

Syntax

arcpy.geoanalytics.DescribeDataset(input_layer, output_name, {sample_features}, {create_extent_layer}, {data_store})
Parameter | Explanation | Data Type
input_layer

The point, line, polygon, or tabular features to be described.

Record Set
output_name

The name of the output feature service.

String
sample_features
(Optional)

The number of features that will be included in the output sample layer. No sample is returned if you select 0 features or don't provide a number. By default, no sample layer is returned.

Long
create_extent_layer
(Optional)

Specifies whether an output extent layer will be created. The extent is a polygon that represents the spatial and temporal extent of the input features.

  • CREATE_EXTENT: An extent layer will be created.
  • NO_EXTENT: An extent layer will not be created.
Boolean
data_store
(Optional)

Specifies the ArcGIS Data Store where the output will be saved. The default is SPATIOTEMPORAL_DATA_STORE. All results stored in a spatiotemporal big data store will be stored in WGS84. Results stored in a relational data store will maintain their coordinate system.

  • SPATIOTEMPORAL_DATA_STORE: Output will be stored in a spatiotemporal big data store. This is the default.
  • RELATIONAL_DATA_STORE: Output will be stored in a relational data store.
String
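
As a rough illustration of how the parameters above fit together, the following sketch assembles the positional arguments for the tool and checks the keyword values against those listed in the table. The names build_describe_args and run_describe_dataset are hypothetical wrappers, not part of arcpy; only arcpy.geoanalytics.DescribeDataset and the keyword strings come from this page. The arcpy import is deferred so that the argument checks can be used without an ArcGIS installation.

```python
from typing import Optional, Tuple

# Keyword values taken from the parameter table above.
EXTENT_KEYWORDS = {True: "CREATE_EXTENT", False: "NO_EXTENT"}
DATA_STORE_KEYWORDS = ("SPATIOTEMPORAL_DATA_STORE", "RELATIONAL_DATA_STORE")


def build_describe_args(
    input_layer: str,
    output_name: str,
    sample_features: Optional[int] = None,
    create_extent: bool = False,
    data_store: str = "SPATIOTEMPORAL_DATA_STORE",
) -> Tuple:
    """Return the positional arguments for DescribeDataset (hypothetical helper)."""
    if not output_name:
        raise ValueError("output_name is required")
    if data_store not in DATA_STORE_KEYWORDS:
        raise ValueError(f"data_store must be one of {DATA_STORE_KEYWORDS}")
    if sample_features is not None and sample_features < 0:
        # Assumption: negative sample sizes are treated as invalid here.
        raise ValueError("sample_features cannot be negative")
    return (
        input_layer,
        output_name,
        sample_features,  # None leaves the optional parameter at its default
        EXTENT_KEYWORDS[bool(create_extent)],
        data_store,
    )


def run_describe_dataset(*args, **kwargs):
    """Validate the arguments, then run the tool (requires ArcGIS Pro)."""
    import arcpy  # deferred so the validation above works without arcpy

    return arcpy.geoanalytics.DescribeDataset(*build_describe_args(*args, **kwargs))
```

Checking the keywords before submitting the job is a design choice for this sketch: a misspelled data store keyword is caught locally instead of after the request reaches the GeoAnalytics Server.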

Derived Output

Name | Explanation | Data Type
output

The output layer containing the summarized statistic calculations.

Record Set
extent_layer

When the create_extent_layer parameter is set to CREATE_EXTENT, the tool will output a layer containing a single polygon representing the extent of your dataset.

Feature Set
sample_layer

When the sample_features parameter specifies a value greater than zero, the tool will output a layer containing the specified number of sample features from your dataset.

Feature Set
output_json

This parameter is not used. A JSON string containing all of the summary information calculated in analysis is included in the tool's messages.

String
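
Because the JSON summary is delivered through the tool's messages and not through output_json, one way to retrieve it is to scan the messages of the returned Result object. The sketch below uses the Result object's messageCount property and getMessage method. It assumes that the JSON summary arrives as a complete message of its own that begins with an opening brace; this page does not specify the message layout, so treat that as an assumption to verify against your own output. The function names are hypothetical.

```python
import json
from typing import Any, Dict, List


def messages_from_result(result) -> List[str]:
    """Collect every message from an arcpy Result object."""
    return [result.getMessage(i) for i in range(result.messageCount)]


def summaries_from_messages(messages: List[str]) -> List[Dict[str, Any]]:
    """Return each message that parses as a JSON object.

    Assumption: the summary JSON occupies a whole message by itself.
    Messages that are not valid JSON objects are skipped.
    """
    found = []
    for message in messages:
        text = message.strip()
        if not text.startswith("{"):
            continue
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            found.append(parsed)
    return found
```

With a result returned by arcpy.geoanalytics.DescribeDataset, calling summaries_from_messages(messages_from_result(result)) would return a list of dictionaries, which is empty if no message matches the assumed layout.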

Code sample

DescribeDataset example (Python window)

The following Python window script demonstrates how to use the DescribeDataset tool.

In this script, network features are described, and a sample layer of 2500 features and an extent layer are created.

#-------------------------------------------------------------------------------
# Name: DescribeDataset.py
# Description: Describe network features, and create a sample layer of 2500
#              features and an extent layer.
#
# Requirements: ArcGIS GeoAnalytics Server

# Import system modules
import arcpy

# Set local variables
inputDataset = "https://sampleserver.domain.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_MyBDFS/BigDataCatalogServer/networkDataset"
outputName = "my_network_described"
dataStore = "RELATIONAL_DATA_STORE"

# Execute Describe Dataset
arcpy.geoanalytics.DescribeDataset(inputDataset, outputName, 2500, "CREATE_EXTENT", dataStore)

Environments

Output Coordinate System

The coordinate system that will be used for analysis. Analysis will be completed in the input coordinate system unless a different coordinate system is specified by this environment. For GeoAnalytics Tools, final results stored in the spatiotemporal data store will be in WGS84.
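
The environment settings mentioned in the usage notes (Extent and Output Coordinate System) can be applied for a single tool run with arcpy.EnvManager. The following is a minimal sketch under several assumptions: describe_study_area and extent_setting are hypothetical helper names, the bounds and the spatial reference well-known ID (wkid) are supplied by the caller, and the relational data store is chosen so that the output keeps the requested coordinate system instead of being stored in WGS84.

```python
from typing import Sequence


def extent_setting(xmin: float, ymin: float, xmax: float, ymax: float) -> str:
    """Format bounds as the space-delimited 'XMin YMin XMax YMax' extent string."""
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("extent must have positive width and height")
    return f"{xmin} {ymin} {xmax} {ymax}"


def describe_study_area(
    input_layer: str,
    output_name: str,
    bounds: Sequence[float],
    wkid: int,
    sample_features: int = 1000,
):
    """Describe only the features in a study area (requires ArcGIS Pro).

    Hypothetical wrapper: the environments apply only inside the with block.
    """
    import arcpy  # deferred so extent_setting can be used without arcpy

    with arcpy.EnvManager(
        extent=extent_setting(*bounds),
        outputCoordinateSystem=arcpy.SpatialReference(wkid),
    ):
        return arcpy.geoanalytics.DescribeDataset(
            input_layer,
            output_name,
            sample_features,
            "CREATE_EXTENT",
            "RELATIONAL_DATA_STORE",
        )
```

Using a context manager keeps the environment changes from affecting later tools in the same session.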

Licensing information

  • Basic: Requires ArcGIS GeoAnalytics Server
  • Standard: Requires ArcGIS GeoAnalytics Server
  • Advanced: Requires ArcGIS GeoAnalytics Server
