Wednesday, August 4, 2010 - 2:50 PM

OOS 31-5: Using phylogenetic β diversity measures to understand the factors structuring microbial diversity over large and fine scales

Catherine Lozupone and Rob Knight. University of Colorado

Background/Question/Methods   UniFrac is a phylogenetic β diversity measure that microbial biologists have widely applied to understand systems relevant to disease, applied environmental biology, and basic ecology. Microbial ecologists have been especially open to this class of diversity measure for two reasons: species delineations are particularly troublesome for microbes, and communities are most often evaluated using 16S ribosomal RNA genes sequenced directly from the environment, so the phylogenetic relationships among the source organisms can be readily estimated. UniFrac has a user-friendly web interface (http://bmf2.colorado.edu/fastunifrac) that implements Monte Carlo-based significance testing, hierarchical clustering of samples coupled with resampling techniques to determine the robustness of cluster nodes, and principal coordinates analyses whose resulting plots can be visualized in 3D and dynamically colored based on sample metadata. Recent advances in sequencing technology have created exciting opportunities to understand microbial diversity at a deeper level, but also challenges in interpreting datasets generated on a much larger scale. We have addressed these challenges by developing an implementation of UniFrac that can handle very large trees, and by testing the robustness of the results to the approximate methods of phylogenetic tree estimation that are required as sample sizes grow. I will illustrate the utility of UniFrac for determining the main factors that structure microbial diversity over large and fine scales.
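
As a rough illustration of the idea behind the measure (not the FastUniFrac implementation), the unweighted UniFrac distance between two samples can be sketched as the fraction of branch length in the tree that leads to sequences from only one of the two samples. The toy four-tip tree, the flat branch representation, and the function name `unweighted_unifrac` below are assumptions made for illustration only.

```python
from typing import FrozenSet, List, Set, Tuple

# A branch is (length, set of tip names descending from that branch).
# This flat representation is a simplification chosen for the sketch.
Branch = Tuple[float, FrozenSet[str]]

# Hypothetical toy tree: ((t1:1, t2:1):1, (t3:1, t4:1):1)
TOY_BRANCHES: List[Branch] = [
    (1.0, frozenset({"t1"})),
    (1.0, frozenset({"t2"})),
    (1.0, frozenset({"t3"})),
    (1.0, frozenset({"t4"})),
    (1.0, frozenset({"t1", "t2"})),  # internal branch above t1 and t2
    (1.0, frozenset({"t3", "t4"})),  # internal branch above t3 and t4
]


def unweighted_unifrac(
    branches: List[Branch], sample_a: Set[str], sample_b: Set[str]
) -> float:
    """Fraction of observed branch length that leads to only one sample."""
    unique = 0.0  # branch length with descendants in exactly one sample
    total = 0.0   # branch length with descendants in either sample
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:
            total += length
            if in_a != in_b:
                unique += length
    if total == 0.0:
        raise ValueError("neither sample contains any tip of the tree")
    return unique / total
```

On this toy tree, two samples drawn from separate clades (`{"t1", "t2"}` versus `{"t3", "t4"}`) share no branch length and are at distance 1, while two identical samples are at distance 0.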

Results/Conclusions   Applying UniFrac to almost 100,000 sequences compiled from 181 studies of diverse microbial assemblages deposited in GenBank has revealed that the bacteria inhabiting the vertebrate gut are particularly distinct from free-living communities, and that the distribution of bacterial diversity in free-living assemblages is largely governed by salinity and substrate type (i.e., whether the samples were from soil/sediment or water). Applying UniFrac in controlled “within habitat” studies has explained finer-scale patterns of variation: for example, soil bacteria are particularly sensitive to pH, and the lineages of bacteria found in human and mouse intestines are most similar among members of the same family. With phylogenetic information becoming increasingly available for many taxonomic groups, UniFrac is becoming more broadly applicable for studying diversity in many ecological systems.