SYMP 9-5 - When and how should we adjust for multiple comparisons in hierarchical Bayesian models?

Tuesday, August 9, 2016: 3:40 PM
Grand Floridian Blrm D, Ft Lauderdale Convention Center
Kiona Ogle1,2,3, Drew M. P. Peltier2,3, Michael Fell1,4, Jessica S. Guo2,3, Heather A. Kropp5 and Jarrett Barber1,6, (1)School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ, (2)Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, (3)Center for Ecosystem Science & Society, Northern Arizona University, Flagstaff, AZ, (4)School of Life Sciences, Arizona State University, Tempe, AZ, (5)Department of Geography, Colgate University, Hamilton, NY, (6)Department of Mathematics & Statistics, Northern Arizona University, Flagstaff, AZ
Background/Question/Methods

Hierarchical Bayesian (HB) analysis of ecological data has become increasingly popular in recent decades. Ecologists often use HB models to draw inferences involving multiple comparisons, yet we lack a clear procedure for addressing multiplicity. For example, we often estimate group-level parameters that vary by species, experimental treatment level, habitat type, etc. We would conclude a non-zero pair-wise difference, separately for each pair in the group, whenever the corresponding 95% credible interval excludes zero. Following classical procedures, we might adjust our rejection procedure (e.g., via a Bonferroni correction) to control the family-wise error rate, where the family in our example comprises all pair-wise differences. However, recent papers propose that such adjustments may be unnecessary in HB models because of partial pooling. Intuition suggests that rejection rates (i.e., Type I error or power) will vary inversely with the degree of pooling, which causes group-level parameters to become more alike. To test this intuition, we conducted a simulation experiment with factors of sample size, group size, balance, and the ratio of within-group to between-group variance, yielding a total of 256 factor-level combinations. We summarize our results using our index of partial pooling (PPI = 0: no pooling; PPI = 1: complete pooling).
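To make the decision rule concrete, the following Python sketch (our addition; the abstract contains no code) flags pair-wise differences whose posterior credible interval excludes zero, computes a Bonferroni-style comparison-wise level over the family of all pairs, and evaluates an illustrative shrinkage-based pooling index. The array name draws and the PPI formula are assumptions for illustration only; the authors' exact PPI definition is not given in the abstract.

    import numpy as np
    from itertools import combinations

    def pairwise_rejections(draws, level=0.95):
        """Flag pairs whose credible interval for the difference excludes zero.

        draws: (n_samples, n_groups) array of posterior samples of the
        group-level parameters (e.g., exported from JAGS or Stan).
        """
        alpha = 1.0 - level
        lo, hi = 100 * alpha / 2, 100 * (1 - alpha / 2)
        rejected = []
        for j, k in combinations(range(draws.shape[1]), 2):
            diff = draws[:, j] - draws[:, k]
            lower, upper = np.percentile(diff, [lo, hi])
            if lower > 0 or upper < 0:  # interval excludes zero
                rejected.append((j, k))
        return rejected

    def bonferroni_level(level, n_groups):
        """Comparison-wise credible level adjusted for all pair-wise differences."""
        m = n_groups * (n_groups - 1) // 2  # size of the family
        return 1.0 - (1.0 - level) / m

    def pooling_index(post_means, unpooled_means):
        """Illustrative PPI: 0 = no pooling, 1 = complete pooling (assumed form).

        Measures how much the HB posterior means have shrunk together
        relative to the unpooled per-group estimates.
        """
        return 1.0 - np.var(post_means) / np.var(unpooled_means)

For example, pairwise_rejections(draws, level=bonferroni_level(0.95, draws.shape[1])) implements the classical family-wise adjustment that the abstract contrasts with the unadjusted comparison-wise rule.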

Results/Conclusions

Results confirm the intuition that rejection rates decrease (non-linearly) as PPI increases. Power (the true rejection rate) was explained by PPI alone (R² = 0.98), whereas the Type I error rate (the false rejection rate) was explained by both PPI and the degree of imbalance (R² = 0.95). Datasets with severe imbalance, or with severe missingness for certain group levels, are likely to produce family-wise Type I error rates that greatly exceed the nominal 5% comparison-wise level, especially when PPI < 0.4. Conversely, balanced sample sizes generally led to Type I error rates below 5%, regardless of PPI. These results indicate that HB models fit to balanced designs should yield Type I error rates that require no adjustment, independent of the number of comparisons. Under imbalance or notable missingness, however, HB models require adjustments to the comparison-wise rejection criteria in order to control family-wise error rates. We propose a new adjustment procedure that accounts for pooling (via PPI) and the degree of imbalance.
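The abstract does not specify the form of the proposed procedure, so the sketch below is purely hypothetical: it interpolates the comparison-wise credible level between a full Bonferroni correction at PPI = 0 and no correction at PPI = 1, inflating the effective family size with a hypothetical imbalance index in [0, 1]. It illustrates the kind of rule the results motivate, not the authors' method.

    def adjusted_level(level, n_groups, ppi, imbalance=0.0):
        """Hypothetical pooling-aware comparison-wise credible level.

        ppi: partial pooling index in [0, 1]; imbalance: assumed design
        imbalance index in [0, 1] (0 = balanced). Recovers Bonferroni at
        ppi = 0 with a balanced design, and no adjustment at ppi = 1.
        """
        m = n_groups * (n_groups - 1) // 2  # family of all pair-wise differences
        m_eff = max(1.0, m ** ((1.0 - ppi) * (1.0 + imbalance)))
        return 1.0 - (1.0 - level) / m_eff

Under this illustrative rule, with 8 groups (m = 28), PPI = 0.4, and a balanced design, m_eff ≈ 7.4, so each comparison would use roughly a 99.3% credible interval rather than the full Bonferroni 99.82% interval.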