Ecologists often face a quandary when choosing how to analyze their data. Should I transform data and use a traditional analysis, or use a modern computationally intensive analysis? The traditional analysis is clearly an approximation; what is not commonly understood is that many modern analyses are also approximate. Hence, the choice of analysis is not clear. I consider various ways to analyze weed counts in a large multi-location evaluation of weed management strategies. Farmers’ fields, the main plots, were each classified into one of seven crop-rotation treatments, then divided in half. One of two weed management strategies was randomly assigned to each half-field. We consider the analysis of the number of weeds present at a specific point in time. The weed counts are overdispersed relative to a Poisson distribution.
The traditional analysis is to log transform the counts, then assume a normal distribution for the random effects. This leads to the standard analysis of a split-plot study. A modern alternative is to use a generalized linear mixed model (GLMM) assuming an overdispersed Poisson distribution and an additional random component to account for field-to-field variation. The GLMM can be specified in different ways.
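To make the data structure concrete, the following is a minimal sketch of how one such data set could be simulated. It is an illustration under stated assumptions, not the study's actual simulation: the number of fields per rotation, the parameter values, the lognormal form of the overdispersion, and the use of log(count + 1) to handle zeros are all hypothetical choices.

```python
import math
import random
import statistics


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for modest means)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_split_plot(n_rotations=7, fields_per_rotation=4, mu=1.5,
                        sd_field=0.5, sd_overdisp=1.0, seed=1):
    """Simulate counts with no treatment effects (all parameter values are assumed).

    Each field gets a normal random effect on the log scale; each half-field
    gets an extra normal deviate that produces overdispersion relative to
    a Poisson distribution.
    """
    rng = random.Random(seed)
    rows = []
    for rotation in range(n_rotations):
        for field in range(fields_per_rotation):
            field_effect = rng.gauss(0.0, sd_field)      # field-to-field variation
            for management in (0, 1):                    # the two half-fields
                extra = rng.gauss(0.0, sd_overdisp)      # overdispersion term
                lam = math.exp(mu + field_effect + extra)
                rows.append((rotation, field, management, poisson(lam, rng)))
    return rows


rows = simulate_split_plot()
counts = [r[3] for r in rows]
mean_count = statistics.mean(counts)
var_count = statistics.variance(counts)

# Traditional route: transform, then analyze as a normal-theory split plot.
# Adding 1 before taking logs is an assumption made here to handle zero counts.
log_counts = [math.log(c + 1) for c in counts]

print(f"mean = {mean_count:.2f}, variance = {var_count:.2f}")
```

Under a Poisson distribution the variance would equal the mean; in this sketch the sample variance is much larger, which is the overdispersion the two analyses must accommodate.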
Results/Conclusions
We find that the split-plot analysis of transformed data has empirical rejection rates close to the nominal 5% rate for all tests. The most common overdispersed Poisson analysis gives conservative tests of the main-plot effects (empirical rejection rate < 5%) and very liberal tests of the split-plot main and interaction effects. When the overdispersion is moderate (variance associated with overdispersion > 1), the empirical rejection rate can exceed 60% for a nominal 5% test. An alternate specification of the model has an acceptable type I error rate.
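An empirical rejection rate is the fraction of data sets simulated under the null hypothesis for which a test rejects at the nominal level. The sketch below illustrates the idea for a deliberately simplified case: it ignores the crop-rotation factor and tests only the weed management effect with a paired t-test on log(count + 1) differences within fields. The field count, parameter values, and the simplified test are assumptions for illustration, not the analyses compared in this study.

```python
import math
import random
import statistics

T_CRIT_27 = 2.0518  # two-sided 5% critical value of Student's t with 27 df


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for modest means)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_differences(n_fields, mu, sd_field, sd_overdisp, rng):
    """Within-field differences of log(count + 1) when management has no effect."""
    diffs = []
    for _ in range(n_fields):
        field_effect = rng.gauss(0.0, sd_field)
        logs = []
        for _half in range(2):
            lam = math.exp(mu + field_effect + rng.gauss(0.0, sd_overdisp))
            logs.append(math.log(poisson(lam, rng) + 1))
        diffs.append(logs[0] - logs[1])
    return diffs


def rejects(diffs, t_crit):
    """Paired t-test of mean difference = 0; True if the null is rejected."""
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if se == 0.0:
        return False
    return abs(statistics.mean(diffs) / se) > t_crit


def empirical_rejection_rate(n_sims=2000, n_fields=28, mu=1.0,
                             sd_field=0.5, sd_overdisp=1.0, seed=2024):
    """Fraction of null data sets rejected; n_fields=28 matches T_CRIT_27."""
    rng = random.Random(seed)
    n_reject = sum(
        rejects(simulate_differences(n_fields, mu, sd_field, sd_overdisp, rng),
                T_CRIT_27)
        for _ in range(n_sims)
    )
    return n_reject / n_sims


rate = empirical_rejection_rate()
print(f"empirical rejection rate: {rate:.3f}")
```

A rate near 0.05 indicates a well-calibrated test; rates well below or well above 0.05 correspond to the conservative and liberal behavior described above.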
Modern analyses are not always better than traditional analyses. For overdispersed split-plot count data, traditional analyses are similar to some modern analyses. Other modern analyses are badly behaved and should not be used.