Quantile-quantile (QQ) plots are an exploratory tool used to assess the similarity between the distribution of one numeric variable and a normal distribution, or between the distributions of two numeric variables.
There are two types of QQ plots: normal QQ plots and general QQ plots.
- Normal QQ plots are constructed by plotting the quantiles of a numeric variable against the quantiles of a normal distribution.
- General QQ plots plot the quantiles of one numeric variable against the quantiles of a second numeric variable.
If the two compared distributions are identical, the plotted points fall on a straight 45-degree line. The farther the plotted points deviate from that line, the less similar the compared distributions are.
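As a rough illustration of how the points of a normal QQ plot can be derived, the following Python sketch pairs each sorted observation with the matching quantile of a normal distribution fitted to the sample. The function name `normal_qq_points`, the sample values, and the plotting positions `(i - 0.5) / n` are assumptions made for this example; charting software may use a different convention.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot.

    Hypothetical helper for illustration only.
    """
    observed = sorted(values)
    n = len(observed)
    # Fit a normal distribution using the sample mean and standard deviation,
    # so that normally distributed data falls near the 45-degree line y = x.
    fitted = NormalDist(mean(observed), stdev(observed))
    # Plotting positions (i - 0.5) / n are one common convention (an assumption).
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Made-up sample values, used only to exercise the function
sample = [2.1, 2.9, 3.4, 3.8, 4.0, 4.3, 4.7, 5.2, 5.9, 7.1]
points = normal_qq_points(sample)

# Largest vertical distance of any point from the 45-degree reference line
max_deviation = max(abs(obs - theo) for theo, obs in points)
```

A small `max_deviation` relative to the spread of the data suggests the points stay close to the reference line; a large one suggests the sample departs from normality.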
Variables
Normal QQ plots require one numeric variable which will be plotted against a normal distribution. General QQ plots require two numeric variables which will be plotted against each other.
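A general QQ plot can be sketched in a similar way: both variables are sorted, and their quantiles are evaluated at the same set of probabilities. In the Python sketch below, the names `quantile` and `general_qq_points`, the linear interpolation rule, and the choice of probabilities are all assumptions for illustration, not a description of any particular charting tool.

```python
def quantile(sorted_values, p):
    """Linearly interpolated quantile of pre-sorted data for p in [0, 1]."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def general_qq_points(x_values, y_values):
    """Return (x quantile, y quantile) pairs for two numeric variables.

    Hypothetical helper for illustration only.
    """
    xs, ys = sorted(x_values), sorted(y_values)
    # Use as many quantiles as the smaller variable has observations
    n = min(len(xs), len(ys))
    probabilities = [(i + 0.5) / n for i in range(n)]
    return [(quantile(xs, p), quantile(ys, p)) for p in probabilities]


# Made-up values: b is a rescaled and shifted copy of a
a = [1.0, 2.0, 4.0, 7.0, 11.0]
b = [2 * v + 1 for v in a]

same = general_qq_points(a, a)      # identical distributions: points on y = x
rescaled = general_qq_points(a, b)  # same shape, different scale: straight line, not 45 degrees
```

Comparing a variable with itself places every point on the 45-degree line, while a rescaled copy still produces a straight line but with a different slope and intercept.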
Transformation
Some analytical methods require that data be normally distributed. When the data is skewed (the distribution is lopsided), you might want to transform the data to bring it closer to a normal distribution. Normal QQ plots allow you to explore the effects of logarithmic and square root transformations on the distribution of your data by comparing the transformed values to a normal distribution.
Logarithmic transformation
The logarithmic transformation is often used when the data has a positively skewed distribution with a few very large values. When such values are present in your dataset, the log transformation helps make the variance more constant and brings the distribution closer to normal.
Note:
Logarithmic transformations can only be applied if all of the variable's values are greater than zero. Any zero or negative values will result in an error.
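The effect of a logarithmic transformation on a right-skewed variable, along with the greater-than-zero requirement, can be sketched in Python as follows. The natural logarithm, the helper names `log_transform` and `skewness`, and the sample values are assumptions chosen for this example.

```python
import math


def log_transform(values):
    """Natural-log transform (assumed base); rejects zero and negative values."""
    if any(v <= 0 for v in values):
        raise ValueError("log transformation requires all values to be greater than zero")
    return [math.log(v) for v in values]


def skewness(values):
    """Population skewness: the third standardized moment."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up, positively skewed values with a few very large observations
skewed = [1, 2, 2, 3, 3, 4, 5, 8, 20, 150]

before = skewness(skewed)
after = skewness(log_transform(skewed))
# For this sample, the skewness after the transformation is lower than before.
```

Calling `log_transform` on a list that contains zero or a negative number raises a `ValueError`, which mirrors the restriction described in the note above.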
Square root transformation
A square root transformation is similar to a logarithmic transformation in that it reduces the right skewness of a dataset. Unlike logarithmic transformations, square root transformations can be applied to values of zero.
Note:
Square root transformations can only be applied if all the variable's values are greater than or equal to zero. Any negative values will result in an error.
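A minimal Python sketch of the square root transformation shows that zero is accepted while negative values are rejected. The function name `sqrt_transform` and the sample values are hypothetical, chosen only for this example.

```python
import math


def sqrt_transform(values):
    """Square root transform; allows zero but rejects negative values."""
    if any(v < 0 for v in values):
        raise ValueError("square root transformation requires all values to be zero or greater")
    return [math.sqrt(v) for v in values]


# Made-up values that include zero
counts = [0, 1, 4, 9, 100]
transformed = sqrt_transform(counts)  # [0.0, 1.0, 2.0, 3.0, 10.0]
```

The largest value shrinks from 100 to 10 while the small values change little, which is how the transformation pulls in a long right tail.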
Axes
Axis bounds
Default minimum and maximum axis bounds are set based on the range of data values represented on the axis. You can customize these values by typing a new axis bound value. Clicking the reset icon reverts the axis bound to its default value.
Number format
You can format the way an axis will display numeric values by specifying a number format category or by defining a custom format string.
Appearance
Titles and description
Charts and axes are given default titles based on the variable names and chart type. These can be edited on the General tab in the Chart Properties pane. You can also provide a chart Description, which is a block of text that appears at the bottom of the chart window.
Color
QQ plots inherit their outline and fill colors from the source layer symbology. By symbolizing a layer with a different attribute than either of the QQ plot variables, a third variable can be shown on the QQ plot visualization.
Guides
Guide lines or ranges can be added to charts as a reference or as a way to highlight significant values. To add a new guide, on the Guides tab in the Chart Properties pane, click Add guide. To draw a line, enter a Value where you want the line drawn. To create a range, also enter a "to" value. You can optionally add text to your guide by specifying a Label.
Example
Create a QQ plot to evaluate whether particulate matter samples in California are normally distributed.
- Compare the distribution of: Particulate Matter
- With transformation: None
- To: <Normal Distribution>