Friday, August 6, 2010 - 11:10 AM

COS 117-10: Selecting ordinations with information criteria

Steven C. Walker, University of Toronto and Donald A. Jackson, University of Toronto.

Background/Question/Methods

The interrelatedness of ecological phenomena often forces us to consider many variables simultaneously. But multivariate data are difficult to visualize and therefore difficult to understand and learn from. Ecologists use ordinations to reduce the dimensionality of multivariate data by exploiting associations between variables. Ordination analysis is an indispensable tool for visualizing otherwise impossibly complex data, but ordination diagrams can also mislead us. Problems arise because ordinations are not plots of the raw data themselves but rather summaries of them, and so it is always possible that important information is left out of the summary. Ordination procedures differ in the types of patterns that they summarize. It is therefore important to select an ordination procedure appropriate for the data and question at hand. Here we embed ordination-based summaries within statistical models that relate the summaries to the raw data. We then use information criteria, similar to Akaike's information criterion (AIC), to select among ordinations. This model-based approach not only yields a valuable tool for selecting appropriate ordinations, but also clarifies the relationship between the summarizing ordination and the data being summarized. We studied our proposed methodology from philosophical, operational, theoretical and (simulation) experimental perspectives.
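
As a rough illustration of the workflow described above (embed each ordination in a statistical model, then rank the models with an information criterion), the following sketch may help. It rests on assumptions that are not in the abstract: Gaussian probabilistic principal component analysis stands in for the ordination model, ordinary AIC stands in for the authors' criterion, and the data are simulated. The function name `ppca_aic` and all variable names are hypothetical.

```python
import numpy as np


def ppca_aic(X, q):
    """AIC of a probabilistic PCA model with q retained axes (0 <= q < p).

    Assumption: Gaussian probabilistic PCA stands in for the ordination
    model here; the abstract does not say which models were actually fit.
    Requires more sites than variables so that all eigenvalues are positive.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                                # ML covariance estimate
    eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]   # descending order
    sigma2 = eigvals[q:].mean()                      # ML noise variance
    # Maximized Gaussian log-likelihood of the q-axis model
    loglik = -0.5 * n * (
        p * np.log(2 * np.pi)
        + np.log(eigvals[:q]).sum()
        + (p - q) * np.log(sigma2)
        + p
    )
    # Free parameters: means + loadings (up to rotation) + noise variance
    k = p + p * q - q * (q - 1) // 2 + 1
    return -2.0 * loglik + 2 * k


# Simulated (not real) data: 200 sites, 6 variables, 2 underlying gradients
rng = np.random.default_rng(0)
n_sites, n_vars = 200, 6
loadings = np.array([[2.0, 2.0, 2.0, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 2.0, 2.0, 2.0]])
scores = rng.normal(size=(n_sites, 2))
X = scores @ loadings + rng.normal(scale=0.5, size=(n_sites, n_vars))

# Candidate set: ordinations retaining 0, 1, ..., 5 axes
aic = {q: ppca_aic(X, q) for q in range(n_vars)}
ranking = sorted(aic, key=aic.get)                   # best (lowest AIC) first
for q in ranking:
    print(f"axes={q}  AIC={aic[q]:.1f}  delta={aic[q] - aic[ranking[0]]:.1f}")
```

In this simulation the models with fewer than two axes should rank far below the rest. Differences among the models with two or more axes are expected to be small, so the top-ranked candidate may occasionally retain an extra axis.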

Results/Conclusions

Ordinations are highly abstract objects, and we will argue that the concept of a 'true' ordination is therefore difficult to justify. One of the philosophical arguments in favour of AIC is that it is appropriate whenever all candidate models are untrue, although some may be useful. Hence AIC-like information criteria might hold certain philosophical advantages over current approaches to ordination selection. We will illustrate the utility of our proposed methodology, with real multivariate data from lake ecosystems, by using it to address two common ordination-selection problems. These examples will demonstrate the attractive operational simplicity of our methodology: generate a candidate set of ordination models and rank the candidates with information criteria. We will prove that our information criterion is an asymptotically unbiased estimate of expected Kullback-Leibler information, which is how Akaike motivated his criterion. We also assessed this bias in more realistically sized samples with a simulation experiment. Our theoretical result on bias held almost exactly for these experiments, and type I error rates were all less than 0.05. We conclude that our methodology provides a useful alternative approach to ordination selection, and expands the range of ecological models to which information criteria can be applied.
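
For orientation, the quantities behind Akaike's motivation can be written as follows. This is the standard textbook form of Kullback-Leibler information and of AIC, not the authors' own criterion, whose exact form is not given in this abstract. The symbols are assumptions introduced here: f is the unknown data-generating distribution, g is a candidate model with parameters \theta, \hat\theta is the maximum likelihood estimate from data y, and k is the number of estimated parameters.

```latex
\[
  I(f, g_\theta) = \int f(x)\,\log\frac{f(x)}{g(x \mid \theta)}\,dx ,
  \qquad
  \mathrm{AIC} = -2\log L(\hat\theta \mid y) + 2k .
\]
% Under the usual regularity conditions, \log L(\hat\theta \mid y) - k is an
% asymptotically unbiased estimate of
% E_y E_x[\log g(x \mid \hat\theta(y))],
% the part of expected Kullback-Leibler information that differs among models.
```

Because the term involving only f is the same for every candidate, ranking models by AIC amounts to ranking them by estimated expected Kullback-Leibler information.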