OOS 52-9
Mechanism-driven statistical models of avian community assembly
A major goal of community ecology is to understand the processes (such as environmental filtering and species interactions) that determine where species can occur and which pairs of species tend to co-occur. Accounting for these processes at the scale of whole assemblages with thousands of species is difficult, for at least two reasons. First, species can respond to environmental filters that are not described in our data sets, resulting in systematic deviations from ecologists' expectations. Second, ecologists need to sift through tens of thousands of species pairs to identify the pairs of species that interact most strongly. Here, I present a new modeling framework that addresses both of these problems using a combination of machine learning techniques and mechanism-driven statistical modeling. The framework uses latent random variables to account for unobserved environmental variation and regularized Markov random fields to account for direct interactions between pairs of species. These two innovations allow ecologists to make substantially better predictions about species composition than would be possible with traditional species distribution models. They also allow ecologists to address "inverse prediction" problems, making predictions about the environmental factors that led to a given species assemblage (rather than making predictions about assemblages from the environment). Including mechanistic components in these models opens up new research avenues into how species' traits and evolutionary history shape species' responses to the environment and to one another.
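To make the two components concrete, the following is a minimal sketch of how a latent variable and a pairwise interaction term can enter a species' occurrence probability. It assumes a binary (presence/absence) pairwise Markov random field with a logistic conditional distribution; it is not the mistnet implementation, and every name and number in it (`conditional_occurrence`, `latent_loadings`, `pairwise`, the coefficient values) is hypothetical. Regularization of the pairwise terms is not shown.

```python
import math


def sigmoid(t):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def conditional_occurrence(i, present, x, z,
                           intercepts, env_coefs, latent_loadings, pairwise):
    """P(species i is present | the other species, environment x, latent z).

    In a binary pairwise Markov random field, this conditional probability
    is a logistic function of the species' own terms plus the pairwise
    terms for every other species that is present.
    """
    # Measured environment (x) and unmeasured environment (z) both shift
    # the species' baseline suitability.
    field = intercepts[i] + env_coefs[i] * x + latent_loadings[i] * z
    # Direct interactions: negative entries act like competition,
    # positive entries like facilitation.
    for j, y_j in enumerate(present):
        if j != i:
            field += pairwise[i][j] * y_j
    return sigmoid(field)


# Two hypothetical species with a negative (competition-like) interaction.
intercepts = [0.0, 0.0]
env_coefs = [1.0, 1.0]
latent_loadings = [0.5, -0.5]
pairwise = [[0.0, -2.0],
            [-2.0, 0.0]]

p_alone = conditional_occurrence(0, [0, 0], 0.3, 0.0,
                                 intercepts, env_coefs, latent_loadings, pairwise)
p_with_rival = conditional_occurrence(0, [0, 1], 0.3, 0.0,
                                      intercepts, env_coefs, latent_loadings, pairwise)
print(p_alone, p_with_rival)
```

In this toy setting, the presence of the second species lowers the first species' occurrence probability at an otherwise identical site, which is the kind of direct pairwise effect the interaction terms are meant to capture.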
Results/Conclusions
I implemented this modeling framework in an R package called mistnet and evaluated its performance against several statistical and machine learning alternatives, using data from the North American Breeding Bird Survey. Jointly modeling all the species in the assemblage substantially improved predictive accuracy for most species (especially rare species, which are often of the greatest conservation concern). More importantly, predictive accuracy at the assemblage level improved by several orders of magnitude relative to combinations of separate single-species models. The major driver of this improvement was mistnet's ability to identify unmeasured sources of environmental heterogeneity that were causing species turnover, especially sources associated with land cover. The model also found substantial phylogenetic inertia in some clades (such as the waterfowl), which further improved its ability to make predictions about these species' environmental tolerances based on the habitat associations of related taxa. It also found strong evidence of pairwise associations between species, which could indicate competitive interactions or facilitation.
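The gap between assemblage-level and single-species accuracy can be illustrated with a toy calculation. The sketch below assumes a single standard-normal latent variable shared by all species and a logistic response; it compares the Monte Carlo estimate of the probability of a whole observed assemblage with the product of each species' marginal probability, which is what combining separate single-species models amounts to. The function name, loadings, and species count are hypothetical and unrelated to the Breeding Bird Survey analysis.

```python
import math
import random


def sigmoid(t):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def assemblage_likelihoods(observed, loadings, n_samples=5000, seed=0):
    """Return (joint, independent) likelihoods of one observed assemblage.

    joint:       average over latent samples of the product across species,
                 so species share the same draw of the unmeasured variable.
    independent: product across species of each species' averaged
                 probability, which ignores the shared variable.
    """
    rng = random.Random(seed)
    n_species = len(observed)
    joint_total = 0.0
    marginal_totals = [0.0] * n_species
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)  # one draw of the unmeasured environment
        site_prob = 1.0
        for i in range(n_species):
            p = sigmoid(loadings[i] * z)
            p_obs = p if observed[i] else 1.0 - p
            site_prob *= p_obs
            marginal_totals[i] += p_obs
        joint_total += site_prob
    joint = joint_total / n_samples
    independent = math.prod(m / n_samples for m in marginal_totals)
    return joint, independent


# Twenty hypothetical species that all favor the same unmeasured condition,
# observed together at one site.
observed = [1] * 20
loadings = [3.0] * 20
joint, independent = assemblage_likelihoods(observed, loadings)
print(joint, independent)
```

Because the twenty species respond to the same latent draw, sites where all of them occur together are far more probable under the joint calculation than under the product of marginals, so the two likelihoods differ by orders of magnitude even in this small example.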
Slides will be made available from http://dx.doi.org/10.6084/m9.figshare.946309