Describe Dataset

Summary

Summarizes features into calculated field statistics, sample features, and extent boundaries.

Illustration

Describe Dataset workflow diagram

Usage

  • This geoprocessing tool is available with ArcGIS Enterprise 10.7 or later.

  • The following are examples of what you can do with the Describe Dataset tool:

    • Verify that you have correctly registered time and geometry with your big data file share.
    • Understand attribute values with summarized field statistics.
    • Visualize your big data with a sample layer. Instead of drawing a million features, draw a sample.
    • Run workflows using a sample of the data before scaling out for longer and larger processing.
    • Determine where a dataset is by calculating the geographical extent.

  • By default, the tool outputs both a table containing summary statistics for each field and JSON describing the properties of the input layer.

  • Use the Number of Sample Features parameter to specify the number of features to sample. If you leave it blank or specify 0, no sample will be created. The output subset will have the same schema, geometry, and time settings as the input features. The subset can be used to understand how your datasets appear when added to a map or visualized in an attribute table. Additionally, you can run analysis on the subset to determine the best inputs for a larger analysis.

  • If you specify a sample size greater than the total number of input features, all features will be returned.

  • The sample layer does not represent a truly random geographic selection and should not be used to understand the geographic extent or distribution of your data. For example, if you specify 230 features for Number of Sample Features, the result can contain 230 input features in any order or location.

  • Create a boundary feature that describes the extent of your input dataset using the Create Extent Layer parameter. The output will include a single polygon feature representing the geographic extent of the input features. The extent layer can be used to understand where your data is located or as an input elsewhere in your workflow. For example, use it as the clip polygon layer in the GeoAnalytics Clip Layer tool.

  • Optionally use environment settings to specify how features will be output.

    For example, use the Extent environment to output an extent layer representing the area of interest, or sample features from within the defined study area.

    Additionally, use the Output Coordinate System environment to project outputs to the desired spatial reference.

  • The Create Extent Layer parameter is only supported for point, line, and polygon features. An extent layer will not be created for tabular features.

  • This geoprocessing tool is powered by ArcGIS GeoAnalytics Server. Analysis is completed on your GeoAnalytics Server, and results are stored in your content in ArcGIS Enterprise.

  • When running GeoAnalytics Server Tools, the analysis is completed on the GeoAnalytics Server. For optimal performance, data should be available to the GeoAnalytics Server through feature layers hosted on your ArcGIS Enterprise portal or through big data file shares. Data that is not local to your GeoAnalytics Server will be moved to your GeoAnalytics Server before analysis begins. This means that it will take longer to run a tool, and in some cases, moving the data from ArcGIS Pro to your GeoAnalytics Server may fail. The threshold for failure depends on your network speeds, as well as the size and complexity of the data. Therefore, it is recommended that you always share your data or create a big data file share.

    Learn more about sharing data to your portal

    Learn more about creating a big data file share through Server Manager
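The sampling rules described in the usage notes above can be summarized in a few lines of plain Python. This is only an illustrative sketch of the documented behavior, not the tool's implementation; the function name `effective_sample_size` is hypothetical, and rejecting negative values is an assumption that the source does not state.

```python
from typing import Optional


def effective_sample_size(requested: Optional[int], total_features: int) -> int:
    """Return how many features a sample layer would contain.

    Hypothetical helper that mirrors the documented rules only.
    """
    if requested is None or requested == 0:
        # Blank or 0: no sample layer is created.
        return 0
    if requested < 0:
        # Assumption: negative counts are treated as invalid input.
        raise ValueError("Number of Sample Features cannot be negative")
    # A request larger than the dataset returns all features.
    return min(requested, total_features)


print(effective_sample_size(230, 1_000_000))  # 230
print(effective_sample_size(5000, 1200))      # 1200
print(effective_sample_size(0, 1200))         # 0
```

The helper only predicts the size of the sample. As noted above, which features are returned is not a random geographic selection.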

Syntax

DescribeDataset(input_layer, output_name, {sample_features}, {create_extent_layer}, {data_store})
Parameter | Explanation | Data Type
input_layer

The point, line, polygon, or tabular features to be described.

Record Set
output_name

The name of the output feature service.

String
sample_features
(Optional)

The number of features that will be included in the output sample layer. No sample is returned when you select 0 features or don't provide a number. This is the default.

Long
create_extent_layer
(Optional)

Specifies whether an output extent layer will be created. The extent is a polygon that represents the spatial and temporal extent of the input features.

  • CREATE_EXTENT: An extent layer will be created.
  • NO_EXTENT: An extent layer will not be created.
Boolean
data_store
(Optional)

Specifies the ArcGIS Data Store where the output will be saved. The default is SPATIOTEMPORAL_DATA_STORE. Results saved to the SPATIOTEMPORAL_DATA_STORE are stored in WGS84. Results saved to a RELATIONAL_DATA_STORE maintain their coordinate system.

  • SPATIOTEMPORAL_DATA_STORE: Output will be stored in a spatiotemporal big data store. This is the default.
  • RELATIONAL_DATA_STORE: Output will be stored in a relational data store.
String
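Because the optional parameters accept only specific keywords, it can help to validate them before submitting a job to the server. The following is a minimal sketch in plain Python that does not require arcpy. The helper name `describe_dataset_kwargs` is hypothetical, and the checks reflect only the keywords listed in the table above.

```python
from typing import Any, Dict, Optional

# Keywords copied from the parameter table above.
EXTENT_KEYWORDS = {"CREATE_EXTENT", "NO_EXTENT"}
DATA_STORE_KEYWORDS = {"SPATIOTEMPORAL_DATA_STORE", "RELATIONAL_DATA_STORE"}


def describe_dataset_kwargs(
    input_layer: str,
    output_name: str,
    sample_features: Optional[int] = None,
    create_extent_layer: Optional[str] = None,
    data_store: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the arguments and return them as a keyword dictionary.

    Hypothetical helper; optional parameters left as None are omitted so
    that the tool's own defaults apply.
    """
    if not input_layer or not output_name:
        raise ValueError("input_layer and output_name are required")
    kwargs: Dict[str, Any] = {
        "input_layer": input_layer,
        "output_name": output_name,
    }
    if sample_features is not None:
        if sample_features < 0:
            raise ValueError("sample_features must be 0 or greater")
        kwargs["sample_features"] = sample_features
    if create_extent_layer is not None:
        if create_extent_layer not in EXTENT_KEYWORDS:
            raise ValueError(f"create_extent_layer must be one of {sorted(EXTENT_KEYWORDS)}")
        kwargs["create_extent_layer"] = create_extent_layer
    if data_store is not None:
        if data_store not in DATA_STORE_KEYWORDS:
            raise ValueError(f"data_store must be one of {sorted(DATA_STORE_KEYWORDS)}")
        kwargs["data_store"] = data_store
    return kwargs


params = describe_dataset_kwargs(
    "https://sampleserver.domain.com/arcgis/rest/services/DataStoreCatalogs/"
    "bigDataFileShares_MyBDFS/BigDataCatalogServer/networkDataset",
    "my_network_described",
    sample_features=2500,
    create_extent_layer="CREATE_EXTENT",
    data_store="RELATIONAL_DATA_STORE",
)
print(sorted(params))
```

In an environment where arcpy is available, the resulting dictionary could presumably be unpacked into the tool call as `arcpy.geoanalytics.DescribeDataset(**params)`, since geoprocessing tool functions generally accept parameters by name.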

Derived Output

Name | Explanation | Data Type
output

The output layer containing the summarized statistic calculations.

Record Set
extent_layer

When the create_extent_layer parameter is set to CREATE_EXTENT, the tool will output a layer containing a single polygon representing the extent of your dataset.

Feature Set
sample_layer

When the sample_features parameter specifies a value greater than zero, the tool will output a layer containing the specified number of sample features from your dataset.

Feature Set
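Which derived outputs to expect depends on the optional parameters and on whether the input has geometry. The sketch below restates those conditions in plain Python. It is an illustration of the documented rules rather than the tool's logic, and the function name `expected_outputs` and its `has_geometry` flag are hypothetical.

```python
from typing import List, Optional


def expected_outputs(
    sample_features: Optional[int],
    create_extent_layer: str,
    has_geometry: bool,
) -> List[str]:
    """List the derived outputs expected for a given parameter combination."""
    # The summary statistics output is always produced.
    outputs = ["output"]
    # An extent layer is created only for point, line, or polygon input,
    # never for tabular input.
    if create_extent_layer == "CREATE_EXTENT" and has_geometry:
        outputs.append("extent_layer")
    # A sample layer requires a value greater than zero.
    if sample_features is not None and sample_features > 0:
        outputs.append("sample_layer")
    return outputs


print(expected_outputs(2500, "CREATE_EXTENT", has_geometry=True))
# ['output', 'extent_layer', 'sample_layer']
print(expected_outputs(0, "CREATE_EXTENT", has_geometry=False))
# ['output']
```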

Code sample

DescribeDataset example (Python window)

The following Python window script demonstrates how to use the DescribeDataset tool.

In this script, network features are described, a sample layer of 2,500 features and an extent layer are created, and the results are stored in the relational data store.

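#-------------------------------------------------------------------------------
# Name: DescribeDataset.py
# Description: Describe a network dataset from a big data file share, create a
#              sample layer of 2500 features, and create an extent layer.
#
# Requirements: ArcGIS GeoAnalytics Server
#-------------------------------------------------------------------------------

# Import system modules
import arcpy

# Set local variables
inputDataset = "https://sampleserver.domain.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_MyBDFS/BigDataCatalogServer/networkDataset"
outputName = "my_network_described"
# Use the keyword listed for the data_store parameter
dataStore = "RELATIONAL_DATA_STORE"

# Execute Describe Dataset: 2500 sample features plus an extent layer
arcpy.geoanalytics.DescribeDataset(inputDataset, outputName, 2500, "CREATE_EXTENT", dataStore)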

Environments

Output Coordinate System

The coordinate system that will be used for analysis. Analysis will be completed in the input coordinate system unless otherwise specified by this parameter. For GeoAnalytics Tools, final results saved to the spatiotemporal data store are stored in WGS84.

Licensing information

  • Basic: Requires ArcGIS GeoAnalytics Server
  • Standard: Requires ArcGIS GeoAnalytics Server
  • Advanced: Requires ArcGIS GeoAnalytics Server