OOS 33-7
A Bayesian hierarchical approach for estimating microbial community compositions assessed using two sets of primers via polymerase chain reaction
Assessing microbial community composition and diversity in soils is an important task given the critical role soil microbes play in energy and nutrient cycling. The recent development of metagenomics with high-throughput sequencing makes it possible to investigate microbial communities in depth. However, it remains challenging to assess microbial community composition and diversity accurately. One important factor affecting accuracy is the choice of primers for polymerase chain reaction (PCR), which is known to bias estimates of community composition. Using multiple primer sets in PCR can capture a broader range of microbial members, but this raises another challenge: constructing a single set of relative abundances from measurements obtained with different primer sets. We introduce a Bayesian hierarchical approach for combining matched PCR products, each created with different PCR primers, that accounts for the bias each primer introduces when measuring microbial community composition. We motivate the methodology with an application from soil microbial ecology: measuring the community composition of methanotrophic bacteria in soil samples collected across the Great Plains of the United States.
Results/Conclusions
This methodology provides improved estimates of microbial community composition by pooling information about primer bias across many samples, which would be impossible without the use of two or more primers. Using the motivating application data, we will illustrate how strongly ranked relative abundances can differ between primers, with key clades missed entirely depending on the choice of primer (e.g., a de novo clade with the second-highest ranked abundance when measured by one primer is never present when measured by another), and how our methodology reconciles these differences (e.g., determining that this de novo clade should have the fourth-highest ranked abundance). This modeling framework can be incorporated into larger Bayesian hierarchical models that relate soil geochemistry to microbial community composition and, in turn, to ecosystem function. One example is a traits-based analysis for understanding how soil ammonium affects the composition of methanotroph communities across the Great Plains and how that composition influences soil methane flux.