Scatter plots visualize the relationship between two numeric variables, where one variable is displayed on the x-axis, and the other variable is displayed on the y-axis. For each record, a point is plotted where the two variables intersect in the chart. When the resulting points form a nonrandom structure, a relationship exists between the two variables.
Multiple series
Scatter plots can be displayed with multiple series by setting a Split by category field. For example, in a dataset of crime incidents, a CrimeType field can be used to split the data into multiple series. The Series table will populate with each unique crime type (Theft, Vandalism, Arson), and the resulting chart will display three scatter plot series.
Display multiple series
To configure a scatter plot with multiple series, use the Display multiple series as option under the Series tab in the Chart Properties pane. By default, multiple series are displayed with the Single chart option. In this representation, all series are drawn in the same plot area, but each series is assigned a unique color to allow comparisons between the different groups.
You can also view a scatter plot with multiple series as a grid chart (also known as small multiples) by selecting the Grid option. This option displays a matrix of smaller charts, where each mini chart shows data for a single series only. Grid charts are helpful for comparing trends and patterns between different subgroups in your data. You can customize the dimensions of a grid chart layout by setting the Mini charts per row numeric input. For instance, setting Mini charts per row to 3 will display a maximum of three charts per row; the total number of rows in the grid is determined by the number of series in your chart. Checking the Show preview chart check box allows you to explore each mini chart in greater detail by selecting one to view in the larger preview chart.
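The splitting and grid layout described above can be sketched in a few lines of Python. This is an illustration only, not the chart engine's implementation: the sample records and the helper names split_by_category and grid_rows are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical records: (crime_type, x_value, y_value)
records = [
    ("Theft", 1.0, 2.0),
    ("Vandalism", 2.0, 1.5),
    ("Arson", 3.0, 4.0),
    ("Theft", 4.0, 3.5),
]


def split_by_category(rows):
    """Group (x, y) points into one series per unique category value."""
    series = defaultdict(list)
    for category, x, y in rows:
        series[category].append((x, y))
    return dict(series)


def grid_rows(series_count, mini_charts_per_row):
    """Rows needed when each grid row holds at most mini_charts_per_row charts."""
    return math.ceil(series_count / mini_charts_per_row)


series = split_by_category(records)
rows_needed = grid_rows(len(series), 3)  # three series fit in a single row
```

With seven series and three mini charts per row, grid_rows(7, 3) gives three rows, with the last row holding a single chart.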
Variables
Scatter plots are made up of two numeric variables, one for the x-axis and one for the y-axis. Additionally, a third numeric variable can be specified to proportionally size each point in the plot.
Statistics
A linear regression equation is calculated for each scatter plot, and the associated trend line and R² value are displayed on the chart. The trend line models the linear relationship between x and y, and the R² value quantifies how well the data fits the model. Both are meaningful only when the relationship between the variables is approximately linear. To turn off the trend line, uncheck the Show linear trend check box in the Chart Properties pane, or toggle its visibility by clicking its item in the legend. To change the color of the trend line, click the trend line color swatch in the Chart Properties pane and choose a new color.
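To make the trend line and R² value concrete, the following sketch fits a line by ordinary least squares and computes R² as one minus the ratio of residual to total variation. Ordinary least squares is an assumption here, since the text does not say which fitting method the chart uses. The function name linear_trend and the sample data are hypothetical.

```python
def linear_trend(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared). Assumes at least two points,
    and that neither xs nor ys is constant (otherwise a division by zero occurs).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross products around the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R² = 1 - (residual variation / total variation)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical sample data with a loose upward trend
slope, intercept, r_squared = linear_trend([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

An R² close to 1 means the points lie close to the line. A low R² can mean either a weak relationship or a relationship that is not linear.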
Correlation
When small x-values correspond to small y-values, and large x-values correspond to large y-values (line sloping up), this indicates a positive correlation. When small x-values correspond to large y-values, and large x-values correspond to small y-values (line sloping down), this indicates a negative correlation.
Note:
A correlation between x and y does not imply that x causes y.
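The direction of a correlation can be checked numerically. The sketch below uses the Pearson correlation coefficient, which is one common measure and an assumption here, since the text does not name a specific statistic. The names pearson_r and direction are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, in the range -1 to 1.

    Assumes at least two points and that neither variable is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def direction(r):
    """Label the sign of a correlation coefficient."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # y grows as x grows
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # y shrinks as x grows
```

For a simple linear fit with one x variable, the R² value equals the square of this coefficient, so R² alone does not tell you whether the line slopes up or down.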
Symbol
Several options control the chart symbolization and related settings.
Size
Scatter plot points can be uniform in size or sized proportionally by a numeric attribute. Sizing scatter plot points proportionally based on a third numeric variable adds another dimension to the visualization, creating a bubble plot.
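As a rough illustration of proportional sizing, the sketch below maps a third variable linearly onto a range of symbol sizes. The linear mapping, the default size range, and the function name scale_sizes are all assumptions, and the software may scale symbols differently.

```python
def scale_sizes(values, min_size=4.0, max_size=40.0):
    """Map each value linearly onto a symbol size between min_size and max_size."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so every point gets the same mid-range size
        return [(min_size + max_size) / 2] * len(values)
    span = max_size - min_size
    return [min_size + (v - lo) / (hi - lo) * span for v in values]


# Hypothetical third variable, for example population per record
sizes = scale_sizes([10, 20, 30])
```

The smallest value receives the minimum size and the largest value the maximum size, with everything else placed in between.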
Color
Scatter plot points can be visualized using a single color or with the colors specified in the layer's symbology. By default, scatter plots use layer colors and inherit their outline and fill colors from the source layer symbology. By symbolizing a layer with a different attribute than either of the scatter plot variables, an additional dimension can be shown on the scatter plot visualization.
Axes
Several options control the axes and related settings.
Axis bounds
Default minimum and maximum axis bounds are set based on the range of data values represented on the axis. These values can be customized by typing a new axis bound value. Clicking the reset button reverts the axis bound to its default value.
Log axis
By default, scatter plot axes are displayed on a linear scale. One or both axes can be displayed on a logarithmic scale by checking the Log axis check box in the Axes section of the Chart Properties pane.
Logarithmic scales are useful when visualizing data with a large positive skew, where most data points have small values and a few have very large values. Changing the scale of the axis does not change the values of the data, only the way they are displayed.
Linear scales are based on addition, and logarithmic scales are based on multiplication.
On a linear scale, each increment on the axis represents the same distance in value. For example, in the axis diagram below, each increment on the axis increases by adding 10.
On a logarithmic scale, each increment on the axis represents an increase by a constant factor rather than a constant amount. In the axis diagram below, each increment on the axis increases by multiplying by 10.
Note:
Logarithmic scales cannot display negative values or zero. If you apply a logarithmic scale to the axis of a variable that contains negative values or zero, those values will not appear on the chart.
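The difference between the two scales, and the reason zero and negative values drop out, can be shown with a short sketch. Base 10 matches the multiply-by-10 example above. The tick ranges and the function name log_axis_positions are hypothetical.

```python
import math

# Linear scale: each tick adds a constant amount (here, 10)
linear_ticks = [10 * i for i in range(5)]  # 0, 10, 20, 30, 40

# Logarithmic scale: each tick multiplies by a constant factor (here, 10)
log_ticks = [10 ** i for i in range(5)]    # 1, 10, 100, 1000, 10000


def log_axis_positions(values):
    """Pair each plottable value with its base-10 log position on the axis.

    log10 is undefined for zero and negative numbers, so those values are
    skipped, mirroring how they do not appear on a log axis.
    """
    return [(v, math.log10(v)) for v in values if v > 0]


positions = log_axis_positions([-5, 0, 1, 100])
```

Only the values 1 and 100 are returned. The original data values are unchanged, and only their positions along the axis differ.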
Adaptive axis bounds
When a multiseries scatter plot is displayed with the Grid option, the axis bounds can be configured with the following options:
- Fixed—Applies the global minimum and maximum bounds to all mini charts.
- Adaptive—Adjusts to the local minimum and maximum bounds for each mini chart.
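The two options above differ only in which values the minimum and maximum are taken over. A minimal sketch for a single axis, using hypothetical series data and hypothetical function names:

```python
# Hypothetical axis values per series
series_values = {
    "Theft": [3, 18, 7],
    "Vandalism": [40, 55],
    "Arson": [1, 2],
}


def fixed_bounds(series):
    """Give every mini chart the global (min, max) across all series."""
    all_values = [v for values in series.values() for v in values]
    bounds = (min(all_values), max(all_values))
    return {name: bounds for name in series}


def adaptive_bounds(series):
    """Give each mini chart the (min, max) of its own series only."""
    return {name: (min(values), max(values)) for name, values in series.items()}


fixed = fixed_bounds(series_values)
adaptive = adaptive_bounds(series_values)
```

Fixed bounds keep the mini charts directly comparable, since every chart shares the same axis range. Adaptive bounds make the pattern within each series easier to see when the series cover very different ranges.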
Number format
You can format the way an axis will display numeric values by specifying a number format category or by defining a custom format string. For example, $#,### can be used as a custom format string to display currency values.
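For readers more familiar with Python, the effect of the $#,### format string on whole numbers is roughly comparable to the following. This is an approximation written with Python's own format syntax, not the chart's formatter, and the two may differ in edge cases such as zero or rounding of fractional values. The function name format_currency is hypothetical.

```python
def format_currency(value):
    """Prefix a dollar sign and group thousands with commas, no decimals."""
    return f"${value:,.0f}"


label_large = format_currency(1234567)
label_small = format_currency(950)
```

Here label_large is "$1,234,567" and label_small is "$950".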
Appearance
Several options control the chart appearance and related settings.
Titles and description
Charts and axes are given default titles based on the variable names and chart type. These can be edited on the General tab in the Chart Properties pane. You can also provide a chart Description, which is a block of text that appears at the bottom of the chart window.
Guides
Guide lines or ranges can be added to charts as a reference or as a way to highlight significant values. To add a new guide, browse to the Guides tab in the Chart Properties pane, choose whether you want to draw a vertical or horizontal guide, and click Add guide. To draw a line, enter a Value where you want the line drawn. To create a range, also enter a "to" value that marks the other end of the range. You can optionally add text to your guide by specifying a Label.
Example
Create a scatter plot to visualize the relationship between diabetes and hypertension among Medicare beneficiaries. Select features in the chart to see where they fall on the map.
- X-Axis—Diabetes rate
- Y-Axis—Hypertension rate