COS 145-4 - Interpretable, accurate predictions of species distributions and community composition: Making the most of prior information

Thursday, August 9, 2012: 9:00 AM
C120, Oregon Convention Center
David J. Harris, Population Biology, UC Davis, Davis, CA
Background/Question/Methods

Species Distribution Models (statistical models of where species live) can inform many areas of ecology, from niche theory to conservation planning. However, most techniques "know" very little about ecology, which limits their power and utility for addressing ecological questions. For example, most methods cannot use information about one species' distribution to inform predictions about its congeners or members of the same guild, and must "start from scratch" for each new species.

By avoiding this redundancy, community ("multiresponse") models can sometimes outperform single-species models that have orders of magnitude more parameters, but much room for improvement remains. In this talk, I focus on one of the more successful community models, known as MARS-COMM or mars.glm, which works by identifying important environmental gradients and ecotones that affect many species in the community simultaneously. Although this approach can achieve state-of-the-art performance, it can also overfit badly. I show how weak assumptions about the strength of environmental gradients can avoid this problem, resulting in a model that can accurately predict species' locations and identify important biogeographic features that affect community composition.
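As a rough illustration of the general idea only, and not the implementation used in this talk, the sketch below assumes that shared gradients and ecotones are represented by MARS-style hinge functions at a few fixed knots, and that a "weak assumption" about gradient strength takes the form of a zero-centered Gaussian prior on each species' slopes. The function names, knot locations, prior widths, and simulated data are all hypothetical.

```python
import numpy as np

def hinge_features(x, knots):
    """Hinge pairs max(0, x - k) and max(0, k - x) for each knot k.

    Every species shares these features, so a knot acts like an ecotone
    that can affect the whole community at once. (Illustrative assumption.)
    """
    x = np.asarray(x, dtype=float)[:, None]
    knots = np.asarray(knots, dtype=float)[None, :]
    return np.hstack([np.maximum(0.0, x - knots), np.maximum(0.0, knots - x)])

def fit_community(X, Y, prior_sd=1.0, lr=0.1, steps=2000):
    """Fit one logistic response per species on the shared features.

    Plain gradient descent on the negative log-posterior; the Gaussian
    prior (standard deviation prior_sd) shrinks slopes toward zero.
    Intercepts are left unpenalized.
    """
    n, p = X.shape
    W = np.zeros((p, Y.shape[1]))   # slopes: features x species
    b = np.zeros(Y.shape[1])        # one intercept per species
    for _ in range(steps):
        P = 1.0 / (1.0 + np.exp(-(X @ W + b)))
        W -= lr * (X.T @ (P - Y) / n + W / (prior_sd ** 2 * n))
        b -= lr * (P - Y).mean(axis=0)
    return W, b

# Simulated presence-absence data (hypothetical, not the bird data):
# five species responding with different strengths to one ecotone at x = 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
slopes = np.array([4.0, 2.0, 0.0, -2.0, -4.0])
logits = np.maximum(0.0, x - 0.5)[:, None] * slopes[None, :]
Y = (rng.random((200, 5)) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

X = hinge_features(x, knots=[0.25, 0.5, 0.75])
W_tight, b_tight = fit_community(X, Y, prior_sd=0.1)   # strong shrinkage
W_loose, b_loose = fit_community(X, Y, prior_sd=10.0)  # nearly unpenalized
```

In this toy setup, a narrower prior pulls the fitted slopes toward zero, which is one simple way to limit overfitting when many species share the same basis functions. How the prior is actually specified in the model presented here is not stated in this abstract.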

Results/Conclusions

Using presence-absence data for North American birds, I show that this better-informed approach outperforms both MARS-COMM and Boosted Regression Trees, a powerful single-species method. It is also highly interpretable: the entire model can be reduced to a small number of line graphs that clearly show species' responses to different environmental factors. The method also produces confidence intervals for both the model's structure and its predictions, allowing statistical evaluation of ecological hypotheses that could not be addressed with other approaches.