COS 88-2
Exploring biodiversity data using a new multivariate statistical method

Wednesday, August 13, 2014: 1:50 PM
Beavis, Sheraton Hotel
Denis Valle, School of Forest Resources and Conservation, University of Florida, Gainesville, FL
Benjamin Baiser, Wildlife Ecology and Conservation, University of Florida, Gainesville, FL
Christopher W. Woodall, Northern Research Station, USDA Forest Service, Saint Paul, MN
Robin L. Chazdon, Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT
Background/Question/Methods

Biodiversity data are notoriously hard to interpret because of their high-dimensional nature. Multiple approaches have been proposed to reduce the dimensionality of these data, helping researchers find patterns and trends in how species composition shifts in space (e.g., along environmental gradients) and time (e.g., as a function of anthropogenic disturbances). However, a common problem with several existing multivariate methods is that their outputs are themselves hard to interpret biologically and typically fail to match the conceptual models that environmental scientists hold. Here we propose a novel multivariate statistical method whose results are directly interpretable: it estimates the species composition of each community and the community composition of each sampling unit. Furthermore, because it is based on a fully probabilistic generative model, it naturally accommodates missing data and allows for coherent estimates of uncertainty. We compare the inference provided by this model to that provided by standard multivariate methods using simulated data. We then illustrate our method with two case studies, one based on Forest Inventory and Analysis (FIA) data for the eastern United States and the other based on data from a secondary forest chronosequence in Costa Rica.
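
The abstract does not give the form of the generative model, so the following is only a minimal sketch of the general idea, assuming a mixed-membership structure in which each sampling unit is a mixture of communities and each community is a distribution over species. The Dirichlet and multinomial choices, the function name `simulate_mixed_membership`, and all parameter names and default values are illustrative assumptions, not the authors' specification.

```python
import numpy as np


def simulate_mixed_membership(n_units=50, n_species=20, n_communities=3,
                              n_individuals=100, seed=0):
    """Simulate species counts from a hypothetical mixed-membership model.

    Assumed structure (not taken from the abstract):
      phi[k, s]   = relative abundance of species s in community k
      theta[l, k] = proportion of community k in sampling unit l
    """
    rng = np.random.default_rng(seed)

    # Species composition of each community (each row sums to 1).
    phi = rng.dirichlet(np.full(n_species, 0.1), size=n_communities)

    # Community composition of each sampling unit (each row sums to 1).
    theta = rng.dirichlet(np.full(n_communities, 0.5), size=n_units)

    # Expected species probabilities in each unit are a mixture of communities.
    probs = theta @ phi
    probs /= probs.sum(axis=1, keepdims=True)  # guard against rounding drift

    # Observed counts: n_individuals drawn per sampling unit.
    counts = np.vstack([rng.multinomial(n_individuals, p) for p in probs])
    return phi, theta, counts


phi, theta, counts = simulate_mixed_membership()
print(counts.shape)  # (sampling units, species)
```

In this sketch, `phi` and `theta` play the roles of the two directly interpretable outputs described above; fitting such a model means recovering both matrices from `counts` alone.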

Results/Conclusions

Using simulated data, we find that gradual changes in species composition along a gradient are well represented by our model, even with missing data, but not by current multivariate methods. For the FIA data, we find that several tree communities follow rough latitudinal bands, similar to standard depictions of forest types. Yet our analysis further reveals that some communities, while dominant in a more restricted region, are present over a much larger spatial extent. Our method also reveals striking spatial patterns in within-plot heterogeneity, which indicate that some areas are not dominated by any single tree community. For the Costa Rica chronosequence, we find that the proportion of trees from the old-growth community increases with time since abandonment and is greater in smaller tree size classes, indicating a convergent trend in species composition. However, our analysis also reveals that differences among plots can sometimes be larger than the effect of time since abandonment.

The proposed method is likely to provide novel insights regarding how communities change along environmental gradients and respond to anthropogenic effects. We believe this method will soon become indispensable in the toolkit of environmental scientists dealing with biodiversity data.