In the past decade, our capacity to assess the taxonomic, functional, and genetic dimensions of microbial biodiversity has exploded with the increasing availability of molecular data. Access to techniques such as environmental microarrays, metagenomics, transcriptomics, and proteomics has broadened datasets beyond primarily taxonomic markers (e.g. 16S, 18S, ITS) to include information about the genetic and functional dimensions of biodiversity. As these methods become widely available and less expensive, massive amounts of data are being collected with far deeper sampling. We address how these new data can be used to quantitatively compare biodiversity patterns across the domains of life, between ecosystems, and among the taxonomic, genetic, and functional dimensions of biodiversity.
The meta-analyses we present use diverse datasets previously generated by the authors and include bacterial, archaeal, fungal, and viral molecular data from a variety of ecosystems. Data were collected at various temporal and spatial scales from the terrestrial subsurface, cyanobacterial mats, a hypersaline lake, acid mine drainage, and surface soil using a range of molecular methods and study designs. We analyzed and compared the microbial diversity in these heterogeneous samples using a combination of traditional diversity indices, phylogenetic indices, diversity profiles and gene networks.
Results/Conclusions
We found that metrics incorporating multiple types of information (similarity, evolutionary history, abundance) are best suited for comparing disparate data and for providing meaningful ecological information independent of the goals of particular studies. For example, incorporating similarity (e.g. molecular homology) to build gene networks and diversity profiles avoids the question of which similarity threshold should be used to delineate species or other taxonomic levels, and facilitates the incorporation of evolutionary history into analyses. These methods have also provided a more informative view of how community composition and diversity change with concurrent environmental change or disturbance. Analyzing the topology of gene networks allows us to compare different dimensions of biodiversity and to better elucidate the relationships among functional, genetic, and taxonomic diversity. Diversity profiles in particular show promise for comparing biological entities across the domains of life, provided similarity is defined on relevant scales. Given that the amount of molecular data being submitted to open-access databases is increasing exponentially, our results will help guide efforts to synthesize these large and complex datasets.
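To make the idea of a similarity-based diversity profile concrete, the following is a minimal sketch of one widely used formulation, the similarity-sensitive diversity of order q described by Leinster and Cobbold (2012). The abstract does not state which formulation was used, so this choice is an assumption, as are the function names and the example abundances and similarity values, which are purely illustrative. The profile is the diversity value plotted against q: low q weights rare types heavily, high q emphasizes dominant ones, and the similarity matrix replaces any hard threshold for delineating taxa.

```python
import math


def similarity_diversity(abundances, similarity, q):
    """Similarity-sensitive diversity of order q (assumed Leinster-Cobbold form).

    abundances: non-negative counts or proportions, one per type.
    similarity: square matrix Z with Z[i][j] in [0, 1] and Z[i][i] == 1.
    q: sensitivity to rare types (0 <= q <= infinity).
    """
    total = sum(abundances)
    p = [a / total for a in abundances]
    n = len(p)
    # "Ordinariness" of each type: expected similarity to a random individual.
    zp = [sum(similarity[i][j] * p[j] for j in range(n)) for i in range(n)]
    # Types with zero abundance do not contribute to the diversity.
    present = [(pi, zi) for pi, zi in zip(p, zp) if pi > 0]
    if q == 1:
        # Limit as q -> 1: exponential of a similarity-weighted entropy.
        return math.exp(-sum(pi * math.log(zi) for pi, zi in present))
    if math.isinf(q):
        # Limit as q -> infinity: set by the most "ordinary" type.
        return 1.0 / max(zi for _, zi in present)
    return sum(pi * zi ** (q - 1) for pi, zi in present) ** (1.0 / (1.0 - q))


def diversity_profile(abundances, similarity, qs=(0, 0.5, 1, 2, math.inf)):
    """Return (q, diversity) pairs; plotting them gives the diversity profile."""
    return [(q, similarity_diversity(abundances, similarity, q)) for q in qs]


# Hypothetical example: three types, the first two closely related.
example_counts = [50, 30, 20]
example_similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

for q, d in diversity_profile(example_counts, example_similarity):
    print(f"q = {q}: effective number of types = {d:.3f}")
```

With an identity similarity matrix (every type entirely distinct), this measure reduces to the ordinary Hill numbers, so richness is recovered at q = 0 and the inverse Simpson index at q = 2. As off-diagonal similarities increase, the effective number of types falls, which is how closely related sequences can be down-weighted without committing to a single clustering threshold.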