Monday, August 2, 2010 - 4:20 PM

SYMP 1-9: Statistics needed to study global change: Which ones to use?

Marie-Josee Fortin, University of Toronto and Josie Hughes, University of Toronto.

Background/Question/Methods

Global change studies must deal with data that result from many confounding processes acting over different spatio-temporal scales.  In addition to characterizing potentially interesting patterns, ecologists seek to determine the relative importance of the processes that cause the observed patterns.  These causal processes can be either (1) additive, where the resulting structure is the sum of the patterns generated by each process, or (2) non-additive, where processes interact to create complex data structure.  In both cases, ordinary least squares (OLS) regression cannot be used because the data are not independent.  Given such different data types, the question is which statistical method to use.
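
As an illustration of the non-independence problem, the sketch below computes Moran's I, a common index of spatial autocorrelation, on a set of regression residuals.  Moran's I is our choice for this example and is not named in the abstract; the function names, the equally spaced transect, and the residual values are hypothetical.  Values near 0 are consistent with independent residuals, while strongly positive or negative values suggest that the independence assumption is violated.

```python
def transect_weights(n):
    """Binary neighbour weights for n equally spaced points on a line.

    Assumption: each point is a neighbour only of the points directly
    before and after it. Real studies would use a weight matrix that
    reflects the sampling design.
    """
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights


def morans_i(values, weights):
    """Moran's I for a list of values and a {(i, j): w_ij} weight dict."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(weights.values())
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    total = sum(d * d for d in dev)
    return (n / w_sum) * (cross / total)


# Hypothetical residuals along a 10-point transect.
weights = transect_weights(10)
trending_residuals = [float(i) for i in range(10)]   # smooth spatial trend
alternating_residuals = [1.0, -1.0] * 5              # neighbours always differ

i_trend = morans_i(trending_residuals, weights)      # positive autocorrelation
i_alt = morans_i(alternating_residuals, weights)     # negative autocorrelation
print(i_trend, i_alt)
```

A formal test would compare the observed index with its distribution under permutation of the residuals, which is omitted here for brevity.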

The answer depends on the goals of the analysis and on the data.  A range of multiscale approaches (wavelets, eigenfunction spatial analyses, nested kriging) can be useful for characterizing patterns across spatial scales.  A posteriori procedures are then required to determine which spatial scales reflect which process(es).  When the goal of the analysis is to infer process, regression is a common approach.  Key challenges are to select adequate model forms and to account properly for potentially confounding factors and alternative explanations.  All regression models are premised on assumptions about the distribution of the data, relationships among observations, and relationships with predictor variables (e.g. linear, additive, stationary).  A wide variety of regression modeling options is available to relax these assumptions: alternative distributions (GLM, GLMM, ZIP, Bayesian techniques); spatial and/or temporal autocorrelation (autocovariate regression, spatial eigenvector mapping, regression kriging, mixed and hierarchical models, GEE); non-linear relationships (GAM, regression trees, parametric non-linear modeling); non-stationarity (GWR, delineation of sub-regions); and observation error (state-space models).

Results/Conclusions

Here, we propose a series of analytical steps to help clarify which methods are most appropriate in a particular situation: (1) understand the assumptions of available methods; (2) assess whether those assumptions are reasonable; (3) add complexity only when necessary; and (4) compare alternative possibilities.  We note that some key challenges remain before the full range of modeling approaches can be widely adopted by ecologists.  Some models are difficult or impossible to fit using available software and numerical methods.  Even when computer code is available, informative literature may be lacking.  Nonetheless, a set of good practices can help ecologists navigate the options.
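
Steps (3) and (4) could be sketched, under strong simplifying assumptions, as a comparison of a simpler and a more complex model by an information criterion.  The example below uses AIC with independent Gaussian errors, which is our choice for illustration and not a method prescribed in the abstract; the data and function names are hypothetical.  Because it assumes independent errors, the comparison is only meaningful once step (2) has been checked.

```python
import math


def fit_mean(y):
    """Intercept-only model: returns (residual sum of squares, n_params)."""
    mean = sum(y) / len(y)
    rss = sum((v - mean) ** 2 for v in y)
    return rss, 2  # mean + error variance


def fit_line(x, y):
    """Simple linear regression: returns (residual sum of squares, n_params)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return rss, 3  # intercept + slope + error variance


def gaussian_aic(rss, k, n):
    """AIC for a Gaussian model, dropping constants shared by all models."""
    return n * math.log(rss / n) + 2 * k


# Hypothetical data: a linear trend plus small fixed perturbations.
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.0, 0.1]
y = [2.0 + 0.5 * xi + e for xi, e in zip(x, noise)]

aic_mean = gaussian_aic(*fit_mean(y), len(y))
aic_line = gaussian_aic(*fit_line(x, y), len(y))

# The extra parameter is retained only if it lowers AIC.
best = "linear" if aic_line < aic_mean else "intercept-only"
print(best, aic_mean, aic_line)
```

The same pattern extends to richer candidates (for example, models with autocorrelated errors), provided that all models are fitted to the same response data.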