Figure 1. Specific capacities of wells in four rock types
Figure 2. Natural logarithms of specific capacities of wells in four rock types
As can be seen, the boxplots of the logarithms are all about the same height (similar variability) and resemble a normal distribution: the top and bottom portions of each box are about the same size, and there are few outliers. Boxplots of the specific capacities themselves in Figure 1 do not have these characteristics, and these data do not appear to follow normal distributions. Common parametric tests such as analysis of variance (ANOVA) require that the data follow a normal distribution and that each group have the same variance. Otherwise, the tests have low power, that is, a low ability to detect differences that are present.
ANOVA tests differences between group means. On the original Figure 1 data the ANOVA p-value is 0.08, so group means would not be considered different at the conventional 0.05 level. Is the non-normality causing a loss of power, pushing up the p-value, even with 50 observations in each group? ANOVA on the logarithms of Figure 2 gives a p-value of 0.007. However, this test is not a test of differences in the mean specific capacity! It tests whether the mean of the logarithms differs between groups. The mean of the logarithms, when retransformed to original units, is called the geometric mean.
The geometric mean is one way to estimate the median, not the mean, of the data in original units. By computing the test in log units, we are testing the difference between geometric means, which amounts to testing the difference in medians of the groups rather than their means. A Kruskal-Wallis (nonparametric) test of group medians has a similar p-value of 0.009, another indication that medians are being tested by the ANOVA on logs.
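To make the geometric mean concrete, here is a minimal Python sketch. It uses synthetic, right-skewed (lognormal) values, not the well data shown in the figures, and the function name `geometric_mean` and the distribution parameters are chosen only for illustration. It computes the mean of the natural logs, retransforms it, and prints it next to the median and the ordinary mean so the three can be compared.

```python
import math
import random
import statistics


def geometric_mean(values):
    """Retransform the mean of the natural logs back to original units."""
    logs = [math.log(v) for v in values]
    return math.exp(statistics.fmean(logs))


# Assumption: synthetic lognormal data, NOT the specific capacities in the figures.
rng = random.Random(42)
sample = [rng.lognormvariate(1.0, 1.2) for _ in range(5000)]

gm = geometric_mean(sample)
med = statistics.median(sample)
avg = statistics.fmean(sample)

print(f"geometric mean = {gm:.2f}")
print(f"median         = {med:.2f}")
print(f"mean           = {avg:.2f}")
```

For skewed data like these, the geometric mean should land close to the median, while the ordinary mean is pulled upward by the large values in the right tail.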
If you transform data you must transform your idea of what is being tested with a parametric test. Means are ‘unit-specific’, and whether performing hypothesis tests, regression, or confidence intervals, what is being targeted changes once logarithms are used. Often what we actually want is a test of medians (“is one group different than the others?”). But if we specifically want to test means, a transformation defeats that purpose. There are newer methods than analysis of variance to test differences in means without assuming a normal distribution. These are called permutation tests. The permutation p-value (using the untransformed data) is 0.04, and it is valid irrespective of the data’s shape, indicating that group means do differ for these data.
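The idea behind a permutation test of means can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (not the specific capacities above), the names `f_statistic` and `permutation_anova` are hypothetical, and the one-way ANOVA F statistic is assumed as the measure of how far apart the group means are. The group labels are shuffled many times, and the p-value is the fraction of shuffles whose F statistic is at least as large as the one observed.

```python
import random
import statistics


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        m = statistics.fmean(g)
        ss_between += len(g) * (m - grand_mean) ** 2
        ss_within += sum((v - m) ** 2 for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_anova(groups, n_perm=2000, seed=0):
    """Permutation p-value for a difference in group means (untransformed data)."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign observations to groups at random
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            exceed += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (exceed + 1) / (n_perm + 1)


# Assumption: four synthetic skewed groups of 50 values, one shifted upward.
rng = random.Random(1)
groups = [[rng.lognormvariate(mu, 1.0) for _ in range(50)]
          for mu in (1.0, 1.0, 1.0, 1.6)]

p_value = permutation_anova(groups)
print(f"permutation p-value = {p_value:.4f}")
```

Because the reference distribution is built from the data themselves, no normal distribution has to be assumed, and the test still targets means in original units.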
The difference in p-value between the permutation test (0.04) and classical ANOVA (0.08) in original units reflects the loss of power for classical ANOVA. As here, a better test can see something the older tests cannot. If you'd like to learn more about permutation tests, we offer both webinars and in-person courses on how they work and how they can help your data analysis come into the 21st century. See http://practicalstats.com/training for more information.