Fundamentals of the geodatabase

The geodatabase is a collection of geographic datasets of various types.

This topic introduces the fundamentals of the geodatabase. These concepts provide a foundation for learning about geodatabases and using them effectively in your GIS work.

Fundamental datasets in the geodatabase

A key geodatabase concept is the dataset. It is the primary mechanism used to organize and use geographic information in ArcGIS. The geodatabase contains three primary dataset types:

  • Feature classes
  • Raster datasets
  • Tables

Creating a collection of these dataset types is the first step in designing and building a geodatabase. Users typically start by building a number of these fundamental dataset types. Then they add to or extend their geodatabases with more advanced capabilities (such as by adding topologies, networks, or subtypes) to model GIS behavior, maintain data integrity, and work with an important set of spatial relationships.

Geodatabase storage in tables and files

Geodatabase storage includes both the schema and rule base for each geographic dataset plus simple tabular storage of the spatial and attribute data. All three primary datasets in the geodatabase (feature classes, attribute tables, and raster datasets), as well as other geodatabase elements, are stored using tables. The spatial representations in geographic datasets are stored as either vector features or rasters. These geometries are stored and managed in fields along with traditional attributes.

A feature class is stored as a table. Each row represents one feature. In the following polygon feature class table, the Shape field holds the polygon geometry for each feature. The Polygon value is used to specify that the field contains the coordinates and geometry that define one polygon in each row.

Feature classes stored as a table; each row holds a feature.
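The row-per-feature idea can be sketched with a plain relational table. This is a minimal illustration rather than the geodatabase's actual storage format: the table name parcels, the owner column, and the use of well-known text (WKT) in the shape column are all assumptions made for readability, and SQLite stands in for whatever DBMS holds the geodatabase.

```python
import sqlite3


def build_parcels_table():
    """Create an in-memory table that mimics a polygon feature class."""
    conn = sqlite3.connect(":memory:")
    # One row per feature; the shape column sits alongside ordinary attributes.
    # Storing geometry as WKT text is an assumption made for illustration only.
    conn.execute(
        "CREATE TABLE parcels ("
        " objectid INTEGER PRIMARY KEY,"
        " shape TEXT NOT NULL,"
        " owner TEXT)"
    )
    rows = [
        (1, "POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))", "Smith"),
        (2, "POLYGON ((10 0, 20 0, 20 10, 10 10, 10 0))", "Jones"),
    ]
    conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = build_parcels_table()
query = "SELECT objectid, owner, shape FROM parcels ORDER BY objectid"
for objectid, owner, shape in conn.execute(query):
    print(objectid, owner, shape)
```

Each printed line corresponds to one feature: an identifier, an attribute value, and the geometry held in the shape column.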

A key geodatabase strategy is to use the database management system (DBMS) to scale GIS datasets to extremely large sizes and numbers of users. Deployments range from small, simple databases for one or a few users up to instances with hundreds of millions of features and thousands of simultaneous users. Tables provide the primary storage mechanism for geographic datasets. Structured Query Language (SQL) is well suited to querying and processing the rows in tables, and the geodatabase strategy is designed to use these capabilities.
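As a rough sketch of why table storage pairs well with SQL, the following example filters and summarizes feature rows in a single query. The table layout, the zone and area_sq_m columns, and the function name summarize_by_zone are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def summarize_by_zone(rows, min_area):
    """Count features and total their area per zone, using SQL only.

    rows is an iterable of (objectid, zone, area_sq_m) tuples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE parcels ("
        " objectid INTEGER PRIMARY KEY,"
        " zone TEXT,"
        " area_sq_m REAL)"
    )
    conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", rows)
    # The DBMS does the filtering and aggregation over the rows.
    result = conn.execute(
        "SELECT zone, COUNT(*), SUM(area_sq_m) FROM parcels"
        " WHERE area_sq_m >= ?"
        " GROUP BY zone ORDER BY zone",
        (min_area,),
    ).fetchall()
    conn.close()
    return result


sample = [
    (1, "residential", 450.0),
    (2, "residential", 80.0),
    (3, "commercial", 1200.0),
    (4, "commercial", 300.0),
]
for zone, count, total_area in summarize_by_zone(sample, 100.0):
    print(zone, count, total_area)
```

The same query text would apply whether the table holds four rows or many millions; scaling is left to the DBMS.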

Advanced geographic datasets extend feature classes, rasters, and attribute tables

Various geodatabase elements are used to extend simple tables, features, and rasters to model spatial relationships, add rich behavior, improve data integrity, and extend the geodatabase's capabilities for data management.

The geodatabase schema includes the definitions, integrity rules, and behavior for each of these extended capabilities. These include properties for coordinate systems, coordinate resolution, feature classes, topologies, networks, relationships, and domains. This schema information persists in a collection of geodatabase meta tables in the DBMS. These tables define the integrity and behavior of the geographic information.
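One way to picture schema information acting as an integrity rule is a domain lookup that is consulted before a row is accepted. The sketch below is purely illustrative: the DOMAINS and FIELD_DOMAINS structures, the water_mains table, and the validate_row function are hypothetical and do not reflect the layout of the actual geodatabase meta tables.

```python
# Hypothetical schema metadata: each domain lists the values a field may hold.
DOMAINS = {
    "PipeMaterial": {"PVC", "Steel", "Copper"},
}

# Hypothetical assignment of domains to (table, field) pairs.
FIELD_DOMAINS = {
    ("water_mains", "material"): "PipeMaterial",
}


def validate_row(table, row):
    """Return a list of domain violations for one row (empty if valid)."""
    errors = []
    for field, value in row.items():
        domain = FIELD_DOMAINS.get((table, field))
        # Fields with no assigned domain are not constrained.
        if domain is not None and value not in DOMAINS[domain]:
            errors.append(f"{field}={value!r} is not in domain {domain}")
    return errors


print(validate_row("water_mains", {"material": "PVC", "diameter_mm": 150}))
print(validate_row("water_mains", {"material": "Wood", "diameter_mm": 150}))
```

The point of the sketch is that the rule lives in stored metadata rather than in application code, so every client that reads the schema applies the same constraint.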

Geodatabase elements

All GIS users will work with three fundamental dataset types regardless of the system they use. They'll have a set of feature classes, a number of attribute tables, and most of the time, they'll also have a large set of imagery and raster datasets to work with.

The three primary types of datasets in GIS

Fundamentally, all geodatabases will contain this same kind of content. This collection of datasets can be thought of as the universal starting point for your GIS database design.

As necessary, users can extend their data models to support certain essential capabilities. The geodatabase has a number of additional data elements and dataset types that can be used to extend this fundamental collection of datasets.

See Extending tables, Extend feature classes, and Imagery and remote sensing in ArcGIS for more information.

Geodatabase transactions and versioning

Enterprise geodatabases use capabilities in the underlying DBMS to provide versions that offer scalable support for multiuser editing on large databases. With versioning, each editor can work in their own personal version of the geodatabase, make edits without impacting other editors or the production database, and incorporate their changes back into the system when their work is complete. This long transaction framework accommodates a wide variety of data management strategies, suitable for individual users, teams, massive international organizations, and full WebGIS deployments.
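The editing workflow described above can be sketched as a toy model in which each version holds pending edits separately from the shared default state. This is a simplified illustration under stated assumptions, not how an enterprise geodatabase implements versioning: the VersionedTable class and its method names are hypothetical, and conflict detection between editors is deliberately left out.

```python
class VersionedTable:
    """Toy model of long transactions: edits stay in a version until posted."""

    def __init__(self, rows):
        self.default = dict(rows)  # objectid -> attributes (production state)
        self.versions = {}         # version name -> pending edits

    def create_version(self, name):
        self.versions[name] = {}

    def edit(self, version, objectid, attributes):
        # The edit is recorded in the version only; default is untouched.
        self.versions[version][objectid] = attributes

    def read(self, version, objectid):
        # A version sees its own pending edits layered over the default state.
        pending = self.versions[version]
        return pending.get(objectid, self.default.get(objectid))

    def post(self, version):
        # Incorporate the version's edits into default and retire the version.
        # No conflict detection is attempted in this sketch.
        self.default.update(self.versions.pop(version))


table = VersionedTable({1: {"status": "planned"}})
table.create_version("editor_a")
table.create_version("editor_b")
table.edit("editor_a", 1, {"status": "built"})
print(table.read("editor_a", 1))  # editor A's view includes the pending edit
print(table.read("editor_b", 1))  # editor B's view is unaffected
table.post("editor_a")
print(table.default[1])           # default state after the edit is posted
```

Keeping pending edits apart from the default state is what lets editors work for long periods without blocking one another.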

See Data management and transactions for more information.