Nonparametric Bayesian functional equivalence models for community data
Large community datasets are frequently sparse and contain many species. Reducing the number of species used in a community analysis can make model estimation and interpretation much easier. However, choosing this reduction is non-trivial: important species can be left out of the analysis or combined inappropriately. We introduce nonparametric Bayesian models that simultaneously learn the groups of functionally equivalent species and the corresponding parameter values for each group when describing an ecosystem function. These models also allow for multiple membership, in which each pair of species has an associated probability of sharing the same functional group.
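As a minimal sketch of how such pairwise membership probabilities and a modal number of groups might be summarized, the Python example below assumes that model fitting yields posterior draws of group labels, one label per species per draw. The function names `coclustering_matrix` and `map_num_groups` and the toy draws are hypothetical and are not taken from the models described here.

```python
import numpy as np

def coclustering_matrix(label_draws):
    """Fraction of posterior draws in which each pair of species shares a group."""
    draws = np.asarray(label_draws)                 # shape: (n_draws, n_species)
    same = draws[:, :, None] == draws[:, None, :]   # pairwise label agreement per draw
    return same.mean(axis=0)                        # average agreement over draws

def map_num_groups(label_draws):
    """Most frequent number of distinct groups across draws (posterior mode)."""
    draws = np.asarray(label_draws)
    counts = np.array([np.unique(row).size for row in draws])
    values, freq = np.unique(counts, return_counts=True)
    return int(values[np.argmax(freq)])

# Hypothetical posterior draws for 4 species (label values are arbitrary per draw).
label_draws = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]

P = coclustering_matrix(label_draws)
k_map = map_num_groups(label_draws)
```

Because the summary compares labels only within a draw, it is unaffected by label switching between draws. A matrix such as `P` is the kind of quantity that can be displayed as a heat map of species associations.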
We illustrate these models on communities of methane-consuming soil bacteria collected across the North American Great Plains as part of the Great Plains Methane project. The dataset contains 33 operational taxonomic units (OTUs), the microbial analog of species. For describing soil methane flux, the maximum a posteriori (MAP) estimate of the number of groups of functionally equivalent OTUs is 2. A heat map shows that 4 clades are well associated with one group and that a set of 5 OTUs is poorly associated with any group.
These models are well suited to a variety of community data types. They also have the added benefit that they can select a single functional group. This is particularly useful for relative abundance community data, where the collapsed community becomes the intercept of the model.
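The intercept remark can be illustrated with a short Python sketch. It assumes, purely for illustration, that a group's covariate is the summed relative abundance of its member species; the helper `collapse_by_group` and the toy counts are hypothetical. Under that assumption, placing every species in one group gives a covariate equal to 1 for every sample, which behaves as an intercept column.

```python
import numpy as np

def collapse_by_group(abundance, labels):
    """Sum species columns within each group; returns (n_samples, n_groups)."""
    abundance = np.asarray(abundance, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    return np.stack(
        [abundance[:, labels == g].sum(axis=1) for g in groups], axis=1
    )

# Hypothetical counts for 3 samples (rows) and 3 species (columns).
counts = np.array([[5, 3, 2],
                   [1, 1, 8],
                   [4, 4, 2]])
rel = counts / counts.sum(axis=1, keepdims=True)   # each row sums to 1

two_groups = collapse_by_group(rel, [0, 0, 1])     # species 1 and 2 share a group
one_group = collapse_by_group(rel, [0, 0, 0])      # all species in a single group
```

With two groups the collapsed covariates still vary across samples, whereas the single-group case yields a constant column of ones.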