Quantile-quantile (QQ) plots are an exploratory tool used to assess the similarity between the distribution of one numeric variable and a normal distribution, or between the distributions of two numeric variables.
There are two types of QQ plots: normal QQ plots and general QQ plots.
- Normal QQ plots are constructed by plotting the quantiles of a numeric variable against the quantiles of a normal distribution.
- General QQ plots plot the quantiles of one numeric variable against the quantiles of a second numeric variable.
If the distributions of the compared quantiles are identical, the plotted points will form a straight 45-degree line. The farther the plotted points deviate from a straight line, the less similar the compared distributions.
Normal QQ plots require one numeric variable, which is plotted against a normal distribution. General QQ plots require two numeric variables, which are plotted against each other.
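The construction of both plot types can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the charting tool's own implementation. The function names `normal_qq_points` and `general_qq_points`, the plotting positions `(i - 0.5) / n`, and the choice to standardize the data before comparing it to a standard normal distribution are all assumptions made for this example.

```python
from statistics import NormalDist, mean, quantiles, stdev


def normal_qq_points(values):
    """Pair standard normal quantiles with standardized sample quantiles."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        raise ValueError("values must not all be identical")
    # Assumption: standardize the data so that a normally distributed
    # sample falls near the 45-degree line y = x.
    observed = sorted((v - mu) / sigma for v in values)
    n = len(observed)
    # Assumption: plotting positions (i - 0.5) / n, which stay inside (0, 1).
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


def general_qq_points(first, second, n_quantiles=20):
    """Pair matching quantiles of two samples, which may differ in size."""
    # statistics.quantiles returns n_quantiles - 1 cut points per sample.
    qa = quantiles(first, n=n_quantiles, method="inclusive")
    qb = quantiles(second, n=n_quantiles, method="inclusive")
    return list(zip(qa, qb))
```

Each function returns a list of (x, y) pairs that can be passed to any plotting library. The closer the pairs lie to the line y = x, the more similar the two distributions being compared.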
Some analytical methods require that data be normally distributed. When the data is skewed (the distribution is lopsided), you might want to transform the data to make it normal. Normal QQ plots allow you to explore the effects of logarithmic and square root transformations on the distribution of your data while comparing them to a normal distribution.
Logarithmic transformation
The logarithmic transformation is often used when the data has a positively skewed distribution with a few very large values. When such values are present in your dataset, the log transformation helps make the variances more constant and brings the data closer to a normal distribution.
Logarithmic transformations can only be applied if all of the variable's values are greater than zero. Any zero or negative values will result in an error.
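As a rough illustration of both points (the reduction in positive skew and the requirement that values be greater than zero), the following sketch applies a natural-log transformation and compares skewness before and after. The function names, the use of the natural logarithm, the population skewness formula, and the sample values are assumptions chosen for this example; the charting tool may use a different base or check.

```python
import math


def log_transform(values):
    """Natural-log transform; rejects zero and negative inputs."""
    if any(v <= 0 for v in values):
        raise ValueError("log transformation requires all values > 0")
    return [math.log(v) for v in values]


def skewness(values):
    """Population (Fisher-Pearson) skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up, positively skewed sample with one very large value.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
raw_skew = skewness(sample)
log_skew = skewness(log_transform(sample))
```

For this sample, `log_skew` is smaller than `raw_skew`, which reflects the large value being pulled toward the rest of the data.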
Square root transformation
A square root transformation is similar to a logarithmic transformation in that it reduces the right skewness of a dataset. Unlike logarithmic transformations, square root transformations can be applied to values of zero.
Square root transformations can only be applied if all the variable's values are greater than or equal to zero. Any negative values will result in an error.
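The same kind of input check can be sketched for the square root transformation. The function name `sqrt_transform` and the choice to raise `ValueError` are assumptions for this example, not the tool's actual behavior.

```python
import math


def sqrt_transform(values):
    """Square root transform; zero is allowed, negative values are not."""
    if any(v < 0 for v in values):
        raise ValueError("square root transformation requires all values >= 0")
    return [math.sqrt(v) for v in values]


# Zero passes through unchanged; larger values are compressed.
transformed = sqrt_transform([0.0, 1.0, 4.0, 9.0])  # [0.0, 1.0, 2.0, 3.0]
```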
Default minimum and maximum axis bounds are set based on the range of data values represented on the axis. These values can be customized by typing a new axis bound value. Clicking the reset icon reverts the axis bound to its default value.
You can format the way an axis will display numeric values by specifying a number format category or by defining a custom format string.
Titles and description
Charts and axes are given default titles based on the variable names and chart type. These can be edited on the General tab in the Chart Properties pane. You can also provide a chart Description, which is a block of text that appears at the bottom of the chart window.
When a chart window is active, a Chart Format context ribbon becomes available, allowing visual formatting of the chart. Chart formatting options include the following:
- The size, color, and style of the font used for axis titles, axis labels, description text, and legend text.
- The color, width, and line type for grid and axis lines.
- The background color of the chart.
QQ plots inherit their outline and fill colors from the source layer symbology. By symbolizing a layer with a different attribute than either of the QQ plot variables, a third variable can be shown on the QQ plot visualization.
Create a QQ plot to evaluate whether particulate matter samples in California are normally distributed.
- Compare the distribution of: Particulate Matter
- With transformation: None
- To: <Normal Distribution>