Source, derived, and referenced mosaic datasets—ArcGIS Pro

When you need to manage large collections of imagery, it may be impractical to manage all of it in a single mosaic dataset. Most workflows follow a pattern of using source and derived mosaic datasets, and referenced mosaic datasets are sometimes created as subsets. This pattern divides a potentially complex task into smaller tasks and makes it easier to manage multiple sources, perform quality assurance of the mosaic datasets, and maintain the services.

Although you can create a single mosaic dataset from many collections of imagery, the best practice is to use a combination of mosaic datasets. This is described in the following sections and diagrammed in the graphic below.

Source, derived, referenced, and published mosaic datasets — Source data collections are stored in source mosaic datasets that can be combined into derived mosaic datasets, which are then used to create referenced mosaic datasets and published image services.

Source mosaic datasets

Source mosaic datasets are typically created for a subset of image collections from a large project, and combined into a derived mosaic dataset. For each collection of similar images, a source mosaic dataset is created, which represents a single manageable unit typically used for checking that metadata is defined correctly, defining specific processes to be applied, or performing quality assurance. Each record in the source mosaic dataset defines an image with specific metadata. For example, a source mosaic dataset can represent all imagery from a specific type of sensor, or represent imagery that was acquired as part of a discrete project that covers a known extent or period in time. The number of images in each source mosaic dataset typically ranges from tens to hundreds of thousands of images. Source mosaic datasets are generally not made accessible to end users or served as image services. The best practices for creating source mosaic datasets are described below.

All imagery in a source mosaic dataset should have the following:

A similar number of bands, bit depth, and type of metadata
A single raster type for the source imagery
Similar scales or pixel size (though possibly in different projections)
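
The criteria above can be checked before images are added. The following is a minimal sketch in plain Python, not an ArcGIS API. The ImageInfo record, its field names, and the pixel size ratio threshold of 2.0 are hypothetical and chosen only for illustration. The sketch also treats "similar" as "identical" for band count, bit depth, and raster type, which is stricter than the guidance requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageInfo:
    """Hypothetical summary of one candidate image (not an ArcGIS type)."""
    name: str
    bands: int
    bit_depth: int
    raster_type: str
    pixel_size: float  # ground units per pixel


def consistency_issues(images, max_pixel_size_ratio=2.0):
    """Return a list of readable problems; an empty list means consistent."""
    issues = []
    if not images:
        return issues
    checks = (
        ("band count", lambda i: i.bands),
        ("bit depth", lambda i: i.bit_depth),
        ("raster type", lambda i: i.raster_type),
    )
    for label, getter in checks:
        values = sorted({getter(i) for i in images})
        if len(values) > 1:
            issues.append(f"mixed {label}: {values}")
    sizes = [i.pixel_size for i in images]
    # Assumed rule of thumb: the coarsest image may be at most
    # max_pixel_size_ratio times coarser than the finest one.
    if max(sizes) / min(sizes) > max_pixel_size_ratio:
        issues.append(f"pixel sizes range from {min(sizes)} to {max(sizes)}")
    return issues
```

An empty result suggests the images are candidates for one source mosaic dataset. Any reported issue suggests splitting them into separate source mosaic datasets.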

Typically, if modifications to the raster item in the mosaic dataset are required, such as clipping images to a footprint, applying a stretch, or orthorectification, they are defined and refined in the source mosaic dataset.

The spatial reference of a source mosaic dataset should be the best choice to encompass all imagery. For example, do not use a state plane projection to contain data across an entire country. Instead, use a projection suitable to contain the entire country's data. The imagery to be added to the source mosaic should be located within the extent horizon of the selected spatial reference system. If all the imagery has a single projection, typically the mosaic dataset is created in this projection.

The number of bands and bit depth of the source mosaic dataset are set to be suitable to contain all the data. For example, a source mosaic dataset with high-resolution satellite imagery, such as GeoEye-1, IKONOS, or QuickBird, is defined as 4 band, 16 bit.

Source mosaic datasets do not need to be static and can be updated with new imagery. In some workflows, source mosaic datasets are created manually. In others, the creation of source mosaic datasets may be fully automated, such as adding updated imagery periodically.

Overviews are typically computed for source mosaic datasets, and summary attributes are copied to the overview records. For example, if all the imagery is collected from a specific project, an attribute named ProjectID may be added to all the images, including the overviews. Later, if multiple source mosaic datasets are added to a derived mosaic dataset and published, you can include a query such as ProjectID=1234 and only see the imagery (including overviews) for the specific project.
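
The effect of attributing overviews can be illustrated with a toy attribute table. This is a plain Python sketch rather than an ArcGIS call. The record names and the second project ID are made up, and the table holds only the fields needed for the illustration.

```python
# Hypothetical attribute table rows from two source mosaic datasets.
records = [
    {"Name": "img_001", "Category": "Primary", "ProjectID": 1234},
    {"Name": "img_002", "Category": "Primary", "ProjectID": 1234},
    {"Name": "Ov_1234_0", "Category": "Overview", "ProjectID": 1234},
    {"Name": "img_101", "Category": "Primary", "ProjectID": 5678},
    {"Name": "Ov_5678_0", "Category": "Overview", "ProjectID": 5678},
]


def select_project(rows, project_id):
    """Mimic the effect of a query such as ProjectID = 1234."""
    return [r for r in rows if r["ProjectID"] == project_id]


selected = select_project(records, 1234)
names = [r["Name"] for r in selected]
```

Because the overview row carries the same ProjectID as the images it summarizes, the selection returns both the primary images and their overview. Without the copied attribute, the overview would be filtered out and nothing would draw at small scales.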

As source mosaic datasets are generally not directly used as image services, their properties are not as important to set. The primary reason to set properties for the source mosaic datasets is to enable quality assurance checking of the mosaic datasets. Typical workflows set all the required properties to ensure suitable quality assurance.

Derived mosaic datasets

Derived mosaic datasets are created from multiple source mosaic datasets. The derived mosaic dataset typically combines multiple source mosaic datasets into a single larger collection.

Imagery is added to the derived mosaic dataset using the Table raster type. This enables all records from one or more source mosaic datasets to be added. When the Table raster type is used and the source is another mosaic dataset, the complete record, including processing and metadata attributes, is copied from the source. In some cases, only a subset of the source mosaic dataset is added to a derived mosaic dataset. For example, images with too much cloud cover may be excluded based on metadata provided in the source mosaic dataset. The spatial reference of the derived mosaic dataset is set to encompass all the imagery and may be different from that of the source mosaic datasets. The number of bands and bit depth are set to be appropriate for all the data sources.

Optionally, functions can be applied to transform the data. For example, the Extract Bands function can be used to convert imagery from 4 band to 3 band, or a stretch can be applied to convert from 16 bit to 8 bit. Typically, each derived mosaic dataset will have a range of functions added to define various products. For example, a mosaic dataset that provides elevation data may have a set of functions added to provide hillshade, slope, and aspect representations.
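
The arithmetic behind these two conversions can be sketched in plain Python. This is not the ArcGIS implementation of the Extract Bands or Stretch functions. The band order (blue, green, red, near infrared), the sample pixel values, and the 0 to 2047 input range are assumptions made for the illustration.

```python
def extract_bands(pixel, band_indexes):
    """Pick bands by zero-based index, e.g. (2, 1, 0) for red, green, blue."""
    return tuple(pixel[i] for i in band_indexes)


def stretch_to_8bit(value, low, high):
    """Linearly rescale value from [low, high] to [0, 255], clamping outside."""
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) / (high - low) * 255
    return max(0, min(255, round(scaled)))


# One hypothetical 4-band, 16-bit pixel: blue, green, red, near infrared.
pixel_16bit = (300, 450, 520, 2100)

# 4 band to 3 band: keep red, green, blue in display order.
rgb_16bit = extract_bands(pixel_16bit, (2, 1, 0))

# 16 bit to 8 bit: assumed min-max stretch over the range 0..2047.
rgb_8bit = tuple(stretch_to_8bit(v, 0, 2047) for v in rgb_16bit)
```

In a mosaic dataset, the equivalent functions are chained on the items or on the mosaic dataset itself and are evaluated when pixels are requested, so the source pixels are not modified.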

Multiple derived mosaic datasets can use the same source mosaic datasets. For example, a derived mosaic dataset for natural color imagery and one for enabling multispectral analysis can use the same source mosaic dataset from a high-resolution satellite.

In many workflows, overviews are computed on the source mosaic datasets and are added to the derived mosaic datasets. When attributed correctly, they allow users to view collections of imagery at small scales by setting appropriate filters.

In some cases, imagery is directly added to a derived mosaic dataset, rather than being organized into a source mosaic dataset first. For example, an image source such as World Imagery or NaturalVue (available on ArcGIS Online as an image service or cached map service providing global 15-meter resolution imagery) can be added to provide a background image of natural color imagery, or an overview image from another source can be added to provide context at small scales. If no suitable overview exists for the derived mosaic dataset, overviews can be built.

Derived mosaic datasets do not need to be static, and over time, the source mosaic datasets from which they are derived may change or new source mosaic datasets may be added. To update the derived mosaic datasets, two approaches can be used. The Synchronize Mosaic Dataset tool, which checks for changes in all sources and updates any changes, can be used. Alternatively, if the process of creating the derived mosaic dataset is automated, the derived mosaic dataset can be re-created, as the process is generally quick and efficient.

The steps to create a derived mosaic dataset are similar to those of a source mosaic dataset:

Create a derived mosaic dataset using the Table raster type.
Add the source mosaic datasets.
Refine the mosaic dataset properties.
Compute pixel cell sizes.
Refine footprints and define NoData.
Generate overviews.

Create a derived mosaic dataset

Derived mosaic datasets are created using the spatial reference system, bands, and bit depth appropriate for the final service. Organizations that work on local datasets and have standardized on one spatial reference system typically use that system. For global datasets, the Web Mercator Auxiliary Sphere projection is often used. The spatial reference system of the derived mosaic dataset does not need to be the same as that of the source, but when the footprints of the source mosaic dataset are transformed to the spatial reference system of the derived mosaic dataset, they are densified if there are differences in the curvature of the projection. This densification can add a large number of vertices to a footprint, which can affect performance.

Add rasters

The Table raster type is used when creating a derived mosaic dataset. This raster type ensures that every item in the source mosaic dataset is duplicated in the derived mosaic datasets and ensures that all records and associated raster item properties are quickly accessible. The process of creating a derived mosaic dataset by this method is fast, as it is not necessary for the system to read metadata from the source imagery; instead, all the metadata and attributes are quickly copied.

Although using the Table raster type may result in a large number of records in the derived mosaic dataset, it is the more scalable method. An alternative is to add the mosaic dataset using the Raster Dataset raster type. This adds each source mosaic dataset as a single item, so the resulting derived mosaic dataset has only one record for each source mosaic dataset. Although this works, it does not scale well, as the system potentially needs to open and close many mosaic datasets.

There are cases in which images are directly added to a derived mosaic dataset. For example, a service may use an image, image service, or map service as a background when there is no other imagery to display. This can be done by adding the selected image or service as a raster dataset and setting the ZOrder field to a large positive value, which puts it at a low display priority. As a result, if no other imagery is to be displayed, the added raster is displayed. Setting a negative ZOrder value causes the imagery to be displayed at a higher priority than the other images.
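
The direction of the ZOrder rule is easy to get backward, so the following toy model may help. It is plain Python and considers ZOrder only. Real mosaic methods weigh other criteria as well, and the item names and values here are hypothetical.

```python
def draw_order(items):
    """Return item names from the top of the display stack to the bottom.

    Toy model: an item with a lower ZOrder draws above one with a higher
    ZOrder. Each item is a (name, z_order) pair.
    """
    return [name for name, _ in sorted(items, key=lambda item: item[1])]


items = [
    ("background_service", 1000),  # large positive value: drawn last
    ("ortho_2021", 0),
    ("priority_scene", -10),       # negative value: drawn on top
]
stack = draw_order(items)
```

Here the background service sits at the bottom of the stack, so it is only visible where no other item covers the view.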

When adding images to the derived mosaic dataset, turn off the Update Cell Size Ranges parameter. If it's not turned off, every cell size will be recomputed, which can potentially break the ordering that is defined in each source mosaic dataset.
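
A scripted version of this step might look like the following sketch. It assumes the arcpy.management tools AddRastersToMosaicDataset and BuildBoundary, with the keyword names and option strings as they are understood from the tool reference. Verify them against the documentation for your ArcGIS Pro version. The run_tool parameter is a hypothetical hook added for this example so the call sequence can be inspected without ArcGIS installed.

```python
def add_sources_to_derived(derived_md, source_mds, run_tool=None):
    """Add each source mosaic dataset to a derived one via the Table raster type.

    run_tool is a hypothetical hook: a callable taking a tool name plus
    keyword arguments. When omitted, the tools are looked up on
    arcpy.management, which requires an ArcGIS Pro Python environment.
    """
    if run_tool is None:
        import arcpy  # imported lazily so the sketch loads without ArcGIS

        def run_tool(tool_name, **kwargs):
            return getattr(arcpy.management, tool_name)(**kwargs)

    for source in source_mds:
        run_tool(
            "AddRastersToMosaicDataset",
            in_mosaic_dataset=derived_md,
            raster_type="Table",
            input_path=source,
            # Keep the MinPS/MaxPS values copied from the source.
            update_cellsize_ranges="NO_CELL_SIZES",
            # Boundary and overviews are handled once, after all sources.
            update_boundary="NO_BOUNDARY",
            update_overviews="NO_OVERVIEWS",
        )
    # Build the boundary a single time after every source has been added.
    run_tool("BuildBoundary", in_mosaic_dataset=derived_md)
```

Passing a recording function as run_tool returns the planned calls without executing them, which can be useful for reviewing an automated workflow before it touches real data.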

Cell sizes

Cell or pixel sizes are copied from the source mosaic dataset, so there is no requirement to recompute them. Don't use the Calculate Cell Size Ranges tool with the default settings. If you do, the cell sizes are recomputed based on the standard overlap rules, which is rarely required and changes the imported values (which are difficult to reset). In cases in which more rasters have been added individually, set their MinPS and MaxPS values manually.

The Calculate Cell Size Ranges tool computes both the MinPS and MaxPS cell size values for each raster item, and computes values for a levels table. This table is used to determine how to group images based on their scale ranges so that functionality such as seamline generation can correctly create lines around images of similar pixel sizes. The grouping is determined based on the mosaic dataset's Cell Size Tolerance Factor property. It may be necessary to set this value and run the Calculate Cell Size Ranges tool, with the Compute Minimum and Maximum Cell Sizes parameter unchecked.
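
When this step is scripted, the unchecked parameters correspond to explicit option strings. The sketch below only builds the keyword arguments. The names do_compute_min, do_compute_max, and cell_size_tolerance_factor, and the NO_MIN_CELL_SIZES and NO_MAX_CELL_SIZES values, are given as they are understood from the tool reference for arcpy.management.CalculateCellSizeRanges and should be verified for your version. The helper function name, the dataset path, and the tolerance value of 0.8 are hypothetical.

```python
def levels_only_args(mosaic_dataset, tolerance_factor):
    """Keyword arguments that rebuild the levels table but keep MinPS/MaxPS."""
    if tolerance_factor <= 0:
        raise ValueError("tolerance_factor must be positive")
    return {
        "in_mosaic_dataset": mosaic_dataset,
        # Leave the imported per-item cell size ranges untouched.
        "do_compute_min": "NO_MIN_CELL_SIZES",
        "do_compute_max": "NO_MAX_CELL_SIZES",
        # Controls how images of similar pixel size are grouped into levels.
        "cell_size_tolerance_factor": tolerance_factor,
    }


args = levels_only_args("derived.gdb/all_imagery", 0.8)
# In an ArcGIS Pro Python environment, the call would then be:
#   import arcpy
#   arcpy.management.CalculateCellSizeRanges(**args)
```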

Footprints, boundaries, and NoData

Typically, there is no need to refine footprints or change NoData values in the derived mosaic datasets. There are cases in which the boundary may need to be recomputed. Instead of computing the boundary when the source mosaic datasets are added, the boundary is usually computed once after all the sources are added using the Build Boundary tool. In cases in which the boundary geometry becomes unnecessarily complex, the boundary can be set to the envelope of the footprints using the Build Boundary tool with the simplification method set to Envelope.

Consider whether the imagery should be clipped by the boundary. The mosaic dataset's Always Clip the mosaic dataset to its Boundary property can be set to either clip or not clip the imagery to the boundary geometry. The visible extent of the mosaic dataset is controlled by the boundary feature layer geometry, so it can be altered to hide portions of the input imagery. Typically, this property is set to clip only when the boundary is used to restrict access to imagery outside it. Otherwise, it is better not to clip to the boundary so that the additional clip processing is avoided.

The extent of an image service is set when the service is published, based on the boundary. This cannot be changed while the service is running. In applications in which new imagery is added to the service after it has been published, ensure that the extent (envelope) of the service is sufficient to cover all new imagery. It may be necessary to redefine the boundary of a service as a rectangle covering the complete extent of all imagery to be added. This can be done using the standard feature editing tools and modifying the boundary feature.
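
Planning such a rectangle amounts to taking the union of the current extent and the extents of any imagery expected later. The following is a plain Python sketch of that calculation, with extents written as (xmin, ymin, xmax, ymax) tuples and made-up coordinates. It does not edit the boundary feature itself.

```python
def envelope(extents):
    """Union of (xmin, ymin, xmax, ymax) extents as a single rectangle."""
    extents = list(extents)
    if not extents:
        raise ValueError("at least one extent is required")
    xmins, ymins, xmaxs, ymaxs = zip(*extents)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))


def covers(outer, inner):
    """True if rectangle outer fully contains rectangle inner."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


current_imagery = (0, 0, 10, 10)
planned_imagery = (5, -5, 20, 8)  # hypothetical future acquisition
service_rectangle = envelope([current_imagery, planned_imagery])
```

The covers function can then be used as a check before new imagery is added to a running service: if the published rectangle does not cover the new extent, the boundary needs to be redefined and the service republished.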

Overviews

In many cases, the overviews in the source mosaic datasets are used in the derived mosaic datasets. As long as suitable attributes are defined for the overviews, they can be used in some queries. For example, a derived mosaic dataset of high-resolution satellite imagery created from source mosaic datasets from different sensors may have overviews attributed as QuickBird or GeoEye1. When overviews are imported using the Table raster type, the Category field is set back to Primary.

It may be helpful to create a separate overview from the derived mosaic dataset for use at very small scales. When a user zooms to the extent of a mosaic dataset, it is advantageous if the system only needs to read a single raster. To enable this, define and build overviews for the smallest scales. Typically, the pixel size for these overviews may be set to about 1/5000 of the width of the mosaic dataset's extent. As with creating overviews for source mosaic datasets, build these overviews after the appropriate default mosaic method has been defined.
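
As a worked example of the 1/5000 rule of thumb, the sketch below assumes that "width" means the width of the mosaic dataset's extent in the units of its spatial reference. The function name and the 500 km example width are hypothetical.

```python
def small_scale_overview_pixel_size(extent_width, divisor=5000):
    """Pixel size that spans the full extent width in roughly divisor pixels."""
    if extent_width <= 0:
        raise ValueError("extent_width must be positive")
    return extent_width / divisor


# A mosaic dataset 500 km wide, in a spatial reference measured in meters.
pixel_size = small_scale_overview_pixel_size(500_000)  # 100.0 meters
```

At that pixel size, the whole extent is about 5000 pixels wide, so a single overview raster can serve a zoom-to-extent request.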

Referenced mosaic datasets

A referenced mosaic dataset is a mosaic dataset based on a source mosaic dataset with a specific visualization or raster function applied for a specific purpose. For example, a mosaic dataset may contain images with four spectral bands, including a near infrared band. The default visualization of this mosaic dataset is to show the natural color band combination, but some users may prefer the false color composite view of the image layer. With a referenced mosaic dataset, you can create this custom visualization with the new band combination, publish it as a different image service, and keep the original visualization as a separate image service. The source mosaic dataset is unaffected by changes to the referenced mosaic dataset, allowing you to create as many referenced mosaic datasets as necessary for your workflow. You can create a referenced mosaic dataset by referencing derived or source mosaic datasets. A referenced mosaic dataset has its own properties and service-level functions but uses the footprint table of the mosaic dataset it references.

The reference can be defined with a query so that a referenced mosaic dataset can also be a subset of the source. With a query or raster function, you can limit the visible images in the mosaic dataset, depending on the intended use of the image service. For example, if you are creating an image service for one project in which there is a defined study area, you can limit the visible images in the image service to the images for that project. Similarly, from a derived mosaic dataset representing elevation data for the world, you can create a referenced mosaic dataset to define a hillshade or slope map product for a selected area.
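
A query-based reference could be scripted along the following lines. The sketch assumes the arcpy.management.CreateReferencedMosaicDataset tool with the keyword names in_dataset, out_mosaic_dataset, and where_clause as understood from the tool reference. Verify them for your version. The ProjectID field, the paths, and the run_tool hook are hypothetical and exist only for this illustration.

```python
def create_project_reference(source_md, out_md, project_id, run_tool=None):
    """Create a referenced mosaic dataset limited to one project's images.

    run_tool is a hypothetical hook: a callable taking a tool name plus
    keyword arguments. When omitted, the tool is looked up on
    arcpy.management, which requires an ArcGIS Pro Python environment.
    """
    if run_tool is None:
        import arcpy  # imported lazily so the sketch loads without ArcGIS

        def run_tool(tool_name, **kwargs):
            return getattr(arcpy.management, tool_name)(**kwargs)

    # int() guards against injecting arbitrary text into the query.
    where_clause = f"ProjectID = {int(project_id)}"
    run_tool(
        "CreateReferencedMosaicDataset",
        in_dataset=source_md,
        out_mosaic_dataset=out_md,
        where_clause=where_clause,
    )
    return where_clause
```

Because the referenced mosaic dataset uses the footprint table of what it references, the query limits what is visible without copying or altering the underlying records.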

ArcGIS manages security at the service level, and one way of defining different access rights to different user groups is to create separate referenced mosaic datasets for each group.

Referenced mosaic datasets are also often created to define different restrictions. For example, downloading may be restricted in one service but enabled in another service that can be used for geoprocessing. Similarly, applying color correction is a property of the mosaic dataset and is not set by the client application. You can publish an image service with and without color correction by creating and publishing a referenced mosaic dataset.

Another use is for services that require different default properties. For example, you may need to serve two web map services, one serving natural color and the other false color. This can be done by creating a 4-band image service that defaults to natural color, along with a separate referenced mosaic dataset that applies the Extract Bands function to produce false color.

The image services maintain the properties and custom display settings set in the mosaic dataset. You can use source mosaic datasets, referenced mosaic datasets, and derived mosaic datasets for image services and imagery layers. When used in combination, one source mosaic dataset can be used to create multiple referenced or derived mosaic datasets. The possible visualizations are only limited by the properties of the source images and the intended workflows using the imagery.