Selecting a Post Hoc test
Note: The editor is unsure of the source of this material. A citation would be greatly appreciated!
“Once you have determined that differences exist among the means, post hoc range tests and pairwise multiple comparisons can determine which means differ. Range tests identify homogeneous subsets of means that are not different from each other. Pairwise multiple comparisons test the difference between each pair of means, and yield a matrix where asterisks indicate significantly different group means at an alpha level of 0.05” (SPSS, Inc.).
Post Hoc tests that assume equal variance
Multiple Comparison Tests AND Range Tests | Range Tests Only | Multiple Comparison Tests Only |
---|---|---|
Tukey’s HSD (honestly significant difference) test | Tukey’s b (AKA, Tukey’s WSD (Wholly Significant Difference)) | Bonferroni (don't use with 5 groups or greater) |
Hochberg’s GT2 | S-N-K (Student-Newman-Keuls) | Sidak |
Gabriel | Duncan | Dunnett (compares a control group to the other groups without comparing the other groups to each other) |
Scheffe (confidence intervals that are fairly wide) | R-E-G-W F (Ryan-Einot-Gabriel-Welsch F test) | LSD (least significant difference) |
| | R-E-G-W Q (Ryan-Einot-Gabriel-Welsch range test) | |
| | Waller-Duncan | |
Post Hoc tests that do not assume equal variances
Tamhane’s T2
Dunnett’s T3
Games-Howell
Dunnett’s C
About the more popular Post Hoc tests
Fisher's LSD (Least Significant Difference)
This test is the most liberal of all Post Hoc tests and its critical t for significance is not affected by the number of groups. This test is appropriate when you have 3 means to compare. It is not appropriate for additional means.
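To see why an unadjusted critical value becomes a problem as groups are added, the following minimal sketch estimates the chance of at least one Type I error across all pairwise comparisons. It assumes the comparisons are independent, which pairwise comparisons that share groups are not, and it ignores any protection from first requiring a significant overall ANOVA, so the numbers are only a rough guide. The function name `unadjusted_familywise_rate` is made up for this illustration.

```python
from math import comb


def unadjusted_familywise_rate(k, alpha=0.05):
    """Approximate P(at least one Type I error) over all pairwise tests.

    Assumption: the c = k(k-1)/2 comparisons are treated as independent,
    each tested at the same unadjusted alpha. This is a simplification.
    """
    c = comb(k, 2)  # number of pairwise comparisons among k means
    return 1 - (1 - alpha) ** c


# Show how the approximate familywise rate grows with the number of groups.
for k in (3, 4, 5, 6):
    print(k, comb(k, 2), round(unadjusted_familywise_rate(k), 3))
```

Under these assumptions the rate climbs quickly as the number of groups grows, which is consistent with the advice to limit this test to 3 means.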
Bonferroni (AKA, Dunn’s Bonferroni)
This test does not require the overall ANOVA to be significant. It is appropriate when the number of comparisons (c = k(k-1)/2) exceeds the number of degrees of freedom (df) between groups (df = k-1). This test is very conservative and its power quickly declines as c increases. A good rule of thumb is that the number of comparisons (c) be no larger than the degrees of freedom (df).
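As a worked illustration of the quantities above, this short sketch computes the number of pairwise comparisons c = k(k-1)/2, the between-groups degrees of freedom df = k-1, and the per-comparison alpha that the standard Bonferroni correction uses (alpha divided by c). The function name `bonferroni_plan` and the dictionary keys are hypothetical, chosen only for this example.

```python
def bonferroni_plan(k, alpha=0.05):
    """Summarize a Bonferroni-corrected set of all pairwise comparisons.

    k     : number of group means (must be at least 2)
    alpha : desired familywise alpha level
    """
    if k < 2:
        raise ValueError("need at least two groups to compare")
    c = k * (k - 1) // 2      # all pairwise comparisons
    df_between = k - 1        # between-groups degrees of freedom
    return {
        "comparisons": c,
        "df_between": df_between,
        "per_comparison_alpha": alpha / c,  # Bonferroni-adjusted alpha
    }


# The per-comparison alpha shrinks as groups are added.
for k in (3, 4, 5, 6):
    print(k, bonferroni_plan(k))
```

With 5 groups, for example, there are 10 pairwise comparisons, so each one is tested at 0.05 / 10 = 0.005, which shows how quickly the test loses power as c increases.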
Newman-Keuls
If there is more than one true null hypothesis in a set of means, this test does not hold the familywise error rate at the nominal alpha level (the actual rate can be higher). It is appropriate to use this test when the number of comparisons exceeds the number of degrees of freedom (df) between groups (df = k-1) and one does not wish to be as conservative as the Bonferroni.
Tukey's HSD (Honestly Significant Difference)
This test is perhaps the most popular post hoc test. It reduces Type I error at the expense of Power. It is appropriate to use this test when one desires all the possible comparisons between a large set of means (6 or more means).
Tukey's b (AKA, Tukey’s WSD (Wholly Significant Difference))
This test strikes a balance between the Newman-Keuls and Tukey's more conservative HSD regarding Type I error and Power. Tukey's b is appropriate to use when one is making more than k-1 comparisons, yet fewer than (k(k-1))/2 comparisons, and needs more control of Type I error than Newman-Keuls.
Scheffe
This test is the most conservative of all post hoc tests. Compared to Tukey's HSD, Scheffe has less Power when making pairwise (simple) comparisons, but more Power when making complex comparisons. It is appropriate to use the Scheffe test only when making many post hoc complex comparisons (e.g., more than k-1).