Thursday, January 08, 2015

Q-Q plot interpretation

The SAS documentation gives references and a few simple rules, and the R community offers more examples built around those rules. I think a better way to remember these associations is to understand the mechanism. On that point, Wikipedia does a better job:
"A simple case is where one has two data sets of the same size. In that case, to make the Q–Q plot, one orders each set in increasing order, then pairs off and plots the corresponding values."
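So it is essentially a scatter plot of the sorted observed values against the sorted values simulated from a reference distribution (or against that distribution's theoretical quantiles). Still, it is not always easy to interpret, especially with a small dataset.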
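
To make the mechanism concrete, here is a minimal Python sketch of the equal-size case from the Wikipedia quote: sort both samples, then pair them off. The function name qq_pairs and the simulated normal data are my own illustration, not taken from SAS, R, or Wikipedia.

```python
import random


def qq_pairs(sample_a, sample_b):
    """Pair the i-th smallest value of one sample with the i-th smallest of the other.

    Hypothetical helper for illustration; it only handles the simple
    equal-size case and does no interpolation.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("this simple case assumes two samples of the same size")
    return list(zip(sorted(sample_a), sorted(sample_b)))


# Made-up data: an "observed" sample and a sample simulated from N(0, 1).
random.seed(0)
observed = [random.gauss(0, 1) for _ in range(20)]
simulated = [random.gauss(0, 1) for _ in range(20)]

# Each pair is one point of the Q-Q plot: (observed quantile, simulated quantile).
pairs = qq_pairs(observed, simulated)
```

Plotting these pairs as a scatter plot gives the Q-Q plot. If both samples come from the same distribution, the points should fall roughly along the line y = x, though with only 20 points the scatter around that line can be large.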

Besides, the qqPlot function from the R package 'car' includes confidence intervals (CIs) for the observed quantiles by default, and the pQQ function from the Haplin package provides a more specialized Q-Q plot for log10(p-values).
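
For the p-value case, the same mechanism applies with a uniform reference distribution. Below is a rough Python sketch of how the points of such a plot can be computed. It is not Haplin's implementation: the function name pvalue_qq_points, the -log10 scale, and the (i - 0.5)/n plotting positions are my own assumptions, and no confidence band is computed.

```python
import math


def pvalue_qq_points(pvalues):
    """Return (expected, observed) points on the -log10 scale for a p-value Q-Q plot.

    Hypothetical helper for illustration. Assumes every p-value is in (0, 1]
    and uses (i - 0.5) / n as the expected uniform quantiles (an assumed
    plotting position; other conventions exist).
    """
    n = len(pvalues)
    # Observed values, transformed and sorted in increasing order.
    observed = sorted(-math.log10(p) for p in pvalues)
    # Expected values under a uniform null, transformed and sorted the same way.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Made-up p-values, purely for illustration.
points = pvalue_qq_points([0.5, 0.05, 0.005])
```

Points well above the line y = x at the right-hand end correspond to p-values smaller than a uniform null would predict.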