Department of Marine Sciences, University of Georgia, Athens, GA 30602-3636, USA.
ISME J. 2011 Mar;5(3):461-72. doi: 10.1038/ismej.2010.141. Epub 2010 Sep 16.
The potential of metatranscriptomic sequencing to provide insights into the environmental factors that regulate microbial activities depends on how fully the sequence libraries capture community expression (that is, sample-sequencing depth and coverage depth), and the sensitivity with which expression differences between communities can be detected (that is, statistical power for hypothesis testing). In this study, we use an internal standard approach to make absolute (per liter) estimates of transcript numbers, a significant advantage over proportional estimates that can be biased by expression changes in unrelated genes. Coastal waters of the southeastern United States contain 1 × 10^12 bacterioplankton mRNA molecules per liter of seawater (200 mRNA molecules per bacterial cell). Even for the large bacterioplankton libraries obtained in this study (500,000 possible protein-encoding sequences in each of two libraries after discarding rRNAs and small RNAs from >1 million 454 FLX pyrosequencing reads), sample-sequencing depth was only 0.00001%. Expression levels of 82 genes diagnostic for transformations in the marine nitrogen, phosphorus and sulfur cycles ranged from below detection (<1 × 10^6 transcripts per liter) for 36 genes (for example, phosphonate metabolism gene phnH, dissimilatory nitrate reductase subunit napA) to >2.7 × 10^9 transcripts per liter (ammonia transporter amt and ammonia monooxygenase subunit amoC). Half of the categories for which expression was detected, however, had too few copy numbers for robust statistical resolution, as would be required for comparative (experimental or time-series) expression studies. By representing whole community gene abundance and expression in absolute units (per volume or mass of environment), 'omics' data can be better leveraged to improve understanding of microbially mediated processes in the ocean.
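The arithmetic behind an internal standard estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes a known number of standard mRNA molecules is added to the sample before processing, and that the ratio of standard molecules added to standard reads recovered converts read counts into molecule counts. The function names and every input value in the usage lines (standard amount, read counts, filtered volume) are hypothetical; only the 1 × 10^12 mRNA per liter and 200 mRNA per cell figures come from the abstract.

```python
def transcripts_per_liter(gene_reads, standard_reads,
                          standard_molecules_added, liters_filtered):
    """Scale a gene's read count to transcripts per liter of seawater.

    Assumes the internal standard is recovered with the same efficiency
    as natural mRNA (an assumption of this sketch).
    """
    if standard_reads <= 0:
        raise ValueError("no internal standard reads recovered")
    if liters_filtered <= 0:
        raise ValueError("filtered volume must be positive")
    # Each sequenced read stands for this many molecules in the sample.
    molecules_per_read = standard_molecules_added / standard_reads
    return gene_reads * molecules_per_read / liters_filtered


def implied_cells_per_liter(mrna_per_liter, mrna_per_cell):
    """Cell density implied by total mRNA per liter and mRNA per cell."""
    return mrna_per_liter / mrna_per_cell


# Hypothetical inputs, chosen only to illustrate the calculation.
estimate = transcripts_per_liter(
    gene_reads=10,
    standard_reads=500,
    standard_molecules_added=1e9,
    liters_filtered=2.0,
)
print(f"estimated transcripts per liter: {estimate:.2e}")

# Figures quoted in the abstract: 1e12 mRNA per liter, 200 mRNA per cell.
print(f"implied cells per liter: {implied_cells_per_liter(1e12, 200):.2e}")
```

Because the scaling factor comes from the standard rather than from the library total, an estimate for one gene does not shift when unrelated genes change expression, which is the advantage over proportional estimates that the abstract describes.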