Overview of georeferencing

Raster data is obtained from many sources, such as satellite images, aerial cameras, and scanned maps. Modern satellite images and aerial cameras tend to have relatively accurate location information, but might need slight adjustments to line up with the rest of your GIS data. Scanned maps and historical data usually do not contain spatial reference information. In these cases, you will need to use accurate location data to align, or georeference, your raster data to a map coordinate system. A map coordinate system is defined using a map projection, a method by which the curved surface of the earth is portrayed on a flat surface.

When you georeference your raster data, you define its location using map coordinates and assign the coordinate system of the map frame. Georeferencing raster data allows it to be viewed, queried, and analyzed with your other geographic data. The georeferencing tools on the Georeference tab allow you to georeference any raster dataset.

In general, there are four steps to georeference your data:

  1. Add the raster dataset that you want to align with your projected data.
  2. Use the Georeference tab to create control points that connect your raster to known positions in the map.
  3. Review the control points and the errors.
  4. Save the georeferencing result when you are satisfied with the alignment.

Aligning the raster with control points

Generally you will georeference your raster data using existing spatial data (target data), such as georeferenced rasters or a vector feature class that resides in the desired map coordinate system. The process involves identifying a series of ground control points—known x,y coordinates—that link locations on the raster dataset with locations in the spatially referenced data. Control points are locations that can be accurately identified on the raster dataset and in real-world coordinates. Many different types of features can be used as identifiable locations, such as road or stream intersections, the mouth of a stream, rock outcrops, the end of a jetty of land, the corner of an established field, street corners, or the intersection of two hedgerows.

The control points are used in conjunction with the transformation to shift and warp the raster dataset from its existing location to the spatially correct location. The connection between one control point on the raster dataset (the from point) and the corresponding control point on the aligned target data (the to point) is a control point pair.

The number of links (control point pairs) you need to create depends on the complexity of the transformation you plan to use to transform the raster dataset to map coordinates. However, adding more links will not necessarily yield a better registration. If possible, you should spread the links over the entire raster dataset rather than concentrating them in one area. Typically, having at least one link near each corner of the raster dataset and a few throughout the interior produces the best results.

Generally, the greater the overlap between the raster dataset and target data, the better the alignment results, because you'll have more widely spaced points with which to georeference the raster dataset. For example, if your target data only occupies one-quarter of the area of your raster dataset, the points you could use to align the raster dataset would be confined to that area of overlap. Thus, the areas outside the overlap area are not likely to be properly aligned. Keep in mind that your georeferenced data is only as accurate as the data to which it is aligned. To minimize errors, you should georeference to data that is at the highest resolution and largest scale for your needs.

Transforming the raster

When you've created enough control points, you can transform the raster dataset to the map coordinates of the target data. You have the choice of using several types of transformations, such as polynomial, spline, adjust, projective, or similarity, to determine the correct map coordinate location for each cell in the raster.

The polynomial transformation uses a polynomial built on control points and a least-squares fitting (LSF) algorithm. It is optimized for global accuracy but does not guarantee local accuracy. The polynomial transformation yields two formulas: one for computing the output x-coordinate for an input (x,y) location and one for computing the y-coordinate for an input (x,y) location. The goal of the least-squares fitting algorithm is to derive a general formula that can be applied to all points, usually at the expense of slight movement of the to positions of the control points. The minimum number of noncorrelated control points required for this method is 1 for a zero-order shift, 3 for a first-order affine, 6 for a second-order, and 10 for a third-order transformation. Lower-order polynomials tend to give a random type of error, while higher-order polynomials tend to give an extrapolation error.
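The control point minimums quoted above follow from counting polynomial terms: a full two-variable polynomial of order n has (n + 1)(n + 2) / 2 coefficients for each output coordinate, and each control point supplies one equation per coordinate. The short sketch below only illustrates that count; the function name min_control_points is made up for this example and is not part of any ArcGIS API.

```python
def min_control_points(order: int) -> int:
    """Count the coefficients of a full 2D polynomial of the given order.

    Each control point contributes one equation for x and one for y, so this
    is also the smallest number of points that determines the fit.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) * (order + 2) // 2


# Orders 0 through 3 correspond to shift, affine, second order, and third order.
minimums = {order: min_control_points(order) for order in range(4)}
for order, count in minimums.items():
    print(f"order {order}: at least {count} control point(s)")
```

Supplying more points than the minimum turns the exact solve into a least-squares fit, which is where residuals come from.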

A zero-order polynomial is used to shift your data. This is commonly used when your data is already georeferenced, but a small shift will better line up your data. Only one control point is required to perform a zero-order polynomial shift. It may be a good idea to create a few control points, then choose the one that looks the most accurate.

The first-order polynomial transformation is commonly used to georeference an image. Use a first-order or affine transformation to shift, scale, and rotate a raster dataset. This generally results in straight lines on the raster dataset mapped as straight lines in the warped raster dataset. Thus, squares and rectangles on the raster dataset are commonly changed into parallelograms of arbitrary scaling and angle orientation. Below is the equation to transform a raster dataset using the affine (first order) polynomial transformation. You can see how six parameters define how a raster's rows and columns transform into map coordinates.

[Figure: Cell unit to coordinate affine transformation]

With a minimum of three control points, the mathematical equation used with a first-order transformation can exactly map each raster point to the target location. Any more than three control points introduces errors, or residuals, that are distributed throughout all the control points. However, you should still add more than three control points, because with only three, a single inaccurate control point has a much greater impact on the transformation. Thus, even though the mathematical transformation error may increase as you create more links, the overall accuracy of the transformation will increase as well.
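The behavior described above can be sketched in a few lines. The example assumes the conventional six-parameter affine form x' = A*x + B*y + C and y' = D*x + E*y + F, where (x, y) is a column and row position on the raster and (x', y') is a map coordinate; that form, the parameter names, the helper functions, and the control point values are all assumptions made for illustration, not the software's implementation. It fits the parameters by ordinary least squares, first with three links (an exact fit) and then with a fourth link that is deliberately off by 3 map units.

```python
import math


def solve3(m, v):
    """Solve the 3x3 system m * s = v by Gauss-Jordan elimination with partial pivoting."""
    a = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("need at least three non-collinear control points")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares affine fit; returns [A, B, C, D, E, F]."""
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (M^T M) p = M^T t, solved once per output coordinate.
    mtm = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):  # k = 0 fits x', k = 1 fits y'
        mtt = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params += solve3(mtm, mtt)
    return params


def apply_affine(p, pt):
    x, y = pt
    return (p[0] * x + p[1] * y + p[2], p[3] * x + p[4] * y + p[5])


def rms_error(p, src, dst):
    """Root mean square of the distances between transformed from points and to points."""
    sq = [(u - tx) ** 2 + (v - ty) ** 2
          for (u, v), (tx, ty) in zip((apply_affine(p, s) for s in src), dst)]
    return math.sqrt(sum(sq) / len(sq))


# Hypothetical links: (column, row) on the raster -> (x, y) on the map.
src3 = [(0, 0), (100, 0), (0, 100)]
dst3 = [(5000, 8000), (5200, 8000), (5000, 7800)]
exact = fit_affine(src3, dst3)
rms3 = rms_error(exact, src3, dst3)

# A fourth link whose to point is 3 map units off in x.
src4 = src3 + [(100, 100)]
dst4 = dst3 + [(5203, 7800)]
fitted = fit_affine(src4, dst4)
rms4 = rms_error(fitted, src4, dst4)

print("3 links:", [round(c, 6) for c in exact], "RMS", round(rms3, 6))
print("4 links:", [round(c, 6) for c in fitted], "RMS", round(rms4, 6))
```

With three links the RMS error is effectively zero, so nothing reveals a bad point. With four, the 3-unit mistake shows up as a nonzero RMS error spread across all the links.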

The higher the transformation order, the more complex the distortion that can be corrected. However, transformations higher than third order are rarely needed. Higher-order transformations require more links and, thus, will involve progressively more processing time. In general, if your raster dataset needs to be stretched, scaled, and rotated, use a first-order transformation. If, however, the raster dataset must be bent or curved, use a second- or third-order transformation.

[Figure: Polynomial transformations]

The adjust transformation optimizes for both global LSF and local accuracy. It is built on an algorithm that combines a polynomial transformation and triangulated irregular network (TIN) interpolation techniques. The adjust transformation performs a polynomial transformation using two sets of control points and adjusts the control points locally to better match the target control points using a TIN interpolation technique. Adjust requires a minimum of three control points.

The similarity transformation is a first-order transformation that tries to preserve the shape of the original raster. The RMS error tends to be higher than with other polynomial transformations because preserving shape takes priority over achieving the best fit. Similarity requires a minimum of three control points.
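One way to picture a shape-preserving fit is as a single uniform scale, a rotation, and a shift. The sketch below is an illustration under stated assumptions rather than the software's algorithm: it treats each point as a complex number and fits w = a*z + b by least squares, so abs(a) is the scale and the phase of a is the rotation. The function name and control point values are hypothetical, and the example assumes both coordinate systems have the same handedness (no axis flip).

```python
import cmath
import math


def fit_similarity(src, dst):
    """Least-squares similarity fit w = a*z + b, with points treated as complex numbers."""
    z = [complex(x, y) for x, y in src]
    w = [complex(x, y) for x, y in dst]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    # Complex least squares: a = <z_c, w_c> / <z_c, z_c> on centered coordinates.
    num = sum((zi - z_mean).conjugate() * (wi - w_mean) for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("control points must not all coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


# Hypothetical links generated by scaling by 2, rotating 90 degrees, and shifting by (10, 5).
src = [(0, 0), (1, 0), (0, 1)]
dst = [(10, 5), (10, 7), (8, 5)]
a, b = fit_similarity(src, dst)
scale = abs(a)
rotation_deg = math.degrees(cmath.phase(a))
print("scale:", round(scale, 6), "rotation:", round(rotation_deg, 6), "shift:", b)
```

Because only four quantities are free (scale, rotation, and two shifts), this fit cannot absorb shear or unequal scaling the way an affine fit can, which is consistent with the higher RMS error noted above.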

The projective transformation can warp lines so that they remain straight. In doing so, lines that were once parallel may no longer remain parallel. The projective transformation is especially useful for oblique imagery, scanned maps, and some imagery products such as Landsat and DigitalGlobe. Projective requires a minimum of four control points. When only four control points are used, the RMS error will be zero. When more are used, the RMS error will be slightly above zero.
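The zero error with exactly four points can be checked with a small sketch. It assumes the usual eight-parameter projective form x' = (a*x + b*y + c) / (g*x + h*y + 1) and y' = (d*x + e*y + f) / (g*x + h*y + 1); the parameter names, helper functions, and control point values are hypothetical and chosen only for illustration. Four links give eight linear equations, which determine the eight parameters exactly.

```python
def solve(m, v):
    """Solve the square system m * s = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(v)
    a = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(m, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate control points (for example, three on one line)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def fit_projective(src, dst):
    """Exact projective fit from exactly four links; returns [a, b, c, d, e, f, g, h]."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("this sketch handles exactly four links")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Multiply each equation through by the denominator to make it linear.
        rows.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    return solve(rows, rhs)


def apply_projective(p, pt):
    x, y = pt
    w = p[6] * x + p[7] * y + 1.0
    return ((p[0] * x + p[1] * y + p[2]) / w, (p[3] * x + p[4] * y + p[5]) / w)


# Hypothetical links: a square on the raster mapped to an irregular quadrilateral.
src = [(0, 0), (100, 0), (100, 100), (0, 100)]
dst = [(10, 10), (110, 20), (90, 120), (0, 100)]
params = fit_projective(src, dst)

# Largest distance between a transformed from point and its to point.
max_residual = max(
    ((u - tx) ** 2 + (v - ty) ** 2) ** 0.5
    for (u, v), (tx, ty) in zip((apply_projective(params, s) for s in src), dst)
)

# Three points on one raster line should still lie on one line after the warp.
p0, p1, p2 = (apply_projective(params, pt) for pt in [(0, 0), (50, 50), (100, 100)])
cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])

print("max residual:", max_residual)
print("collinearity check (near zero):", cross)
```

The residual at every link is zero up to rounding, and the collinearity check confirms that straight lines stay straight even though the square's parallel sides do not stay parallel.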

The spline transformation is a true rubber sheeting method and optimizes for local accuracy but not global accuracy. It is based on a spline function, a piecewise polynomial that maintains continuity and smoothness between adjacent polynomials. Spline transforms the source control points exactly to the target control points; however, pixels that are some distance from the control points are not guaranteed to be accurate. This transformation is useful when the control points are important and must be registered precisely. Adding more control points can increase the overall accuracy of the spline transformation. Spline requires a minimum of 10 control points.

Interpret the root mean square error

When the general formula is derived and applied to the control points, a measure of the residual error is returned. The error is the difference between where the from point ended up and the actual location that was specified (the to point). The total error is computed by taking the root mean square (RMS) of all the residuals. This value describes how consistent the transformation is between the different control points. When the error is particularly large, you can remove and add control points to adjust the error.
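As a small worked illustration of that calculation, the sketch below assumes each residual is the straight-line distance between a transformed from point and its to point, and that the RMS error is the square root of the mean of the squared residuals. The function names and coordinates are hypothetical.

```python
import math


def residuals(transformed, targets):
    """Distance between each transformed from point and its to point."""
    return [math.hypot(px - tx, py - ty)
            for (px, py), (tx, ty) in zip(transformed, targets)]


def rms_error(res):
    """Square root of the mean of the squared residuals."""
    return math.sqrt(sum(r * r for r in res) / len(res))


# Where three from points ended up, and where they were supposed to go.
transformed = [(103.0, 200.0), (300.0, 404.0), (500.0, 600.0)]
targets = [(100.0, 200.0), (300.0, 400.0), (500.0, 600.0)]

res = residuals(transformed, targets)  # distances of 3, 4, and 0 map units
total = rms_error(res)                 # sqrt((9 + 16 + 0) / 3)
print("residuals:", res)
print("RMS error:", total)
```

Because the residuals are squared before averaging, one badly placed link raises the RMS error much more than several slightly misplaced ones, which is why removing the worst link is usually the first adjustment to try.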

Although the RMS error is a good assessment of the transformation's accuracy, don't confuse a low RMS error with an accurate registration. For example, the transformation may still contain significant errors due to a poorly entered control point. The more control points of equal quality used, the more accurately the polynomial can convert the input data to output coordinates. Typically, the adjust and spline transformations give an RMS of nearly zero; however, this does not mean that the image will be perfectly georeferenced.

The forward residual shows you the error in the same units as the data frame spatial reference. The inverse residual shows you the error in pixel units. The forward-inverse residual is a measure of how accurate the transformation is, expressed in pixels. Residuals closer to zero are considered more accurate.

Persist the georeferencing information

You can permanently transform your raster dataset after georeferencing it by using the Save to New command on the Georeference tab or by using the Warp tool. You can also store the transformation information in the auxiliary files using the Save command on the Georeference tab.

Save to New or the Warp geoprocessing tool will create a new raster dataset that is georeferenced using the map coordinates and the spatial reference. ArcGIS doesn't require you to permanently transform your raster dataset to display it with other spatial data; however, you should do so if you plan to perform analysis with it or want to use it with another software package that doesn't recognize the external georeferencing information created in the world file.

Saving the georeferencing will store the transformation information in external files; it will not create a new raster dataset, which is what happens when you permanently transform your raster dataset. For a raster dataset that is file based, such as a TIFF, the transformation will generally be stored in an external XML file that has an .aux.xml extension. If the raster dataset is a raw image, such as BMP, and the transformation is affine, it will be written to a world file. For a raster dataset in a geodatabase, Save will store the geodata transformation to an internal auxiliary file of the raster dataset.
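For readers who have not seen a world file, the sketch below shows its general layout. A world file is a plain-text file with six lines holding the affine parameters of x' = A*x + B*y + C and y' = D*x + E*y + F, written in the order A, D, B, E, C, F, where C and F locate the center of the upper-left cell. The function names and the sample values are hypothetical; this is an illustration of the format, not the routine the software uses.

```python
def world_file_text(A, B, C, D, E, F):
    """Format six affine parameters in world file line order: A, D, B, E, C, F."""
    return "\n".join(f"{value:.10f}" for value in (A, D, B, E, C, F)) + "\n"


def parse_world_file(text):
    """Read the six lines back and return them as (A, B, C, D, E, F)."""
    A, D, B, E, C, F = (float(line) for line in text.split())
    return A, B, C, D, E, F


# Hypothetical raster: 2-unit cells, no rotation, upper-left cell centered at (500001, 4199999).
# E is negative because row numbers increase downward while map y increases upward.
text = world_file_text(A=2.0, B=0.0, C=500001.0, D=0.0, E=-2.0, F=4199999.0)
print(text)

round_trip = parse_world_file(text)
```

Only an affine transformation fits in these six numbers, which is consistent with the note above that a world file is written only when the transformation is affine.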
