Pound Helena L, Gann Eric R, Wilhelm Steven W
Department of Microbiology, University of Tennessee, Knoxville, TN, 37996, USA.
Limnol Oceanogr Methods. 2021 Dec;19(12):846-854. doi: 10.1002/lom3.10465. Epub 2021 Nov 8.
Harmful algal blooms are increasing in duration and severity globally, resulting in increased research interest. The use of genetic sequencing technologies has provided a wealth of opportunity to advance knowledge, but also poses a risk to that knowledge if handled incorrectly. The vast numbers of sequence processing tools and protocols provide a method to test nearly every hypothesis, but each method has inherent strengths and weaknesses. Here, we tested six methods to classify and quantify metatranscriptomic activity from a harmful algal bloom dominated by Microcystis spp. Three online tools were evaluated (Kaiju, MG-RAST, and GhostKOALA) in addition to three local tools that included a command line BLASTx approach, recruitment of reads to individual Microcystis genomes, and recruitment to a combined Microcystis composite genome generated from sequenced isolates with complete, closed genomes. Based on the analysis of each tool presented in this study, two recommendations are made that are dependent on the hypothesis to be tested. For researchers only interested in the function and physiology of Microcystis spp., read recruitments to the composite genome, referred to as "Frankenstein's Microcystis", provided the highest total estimates of transcript expression. However, for researchers interested in the entire bloom microbiome, the online GhostKOALA annotation tool, followed by subsequent read recruitments, provided functional and taxonomic characterization, in addition to transcript expression estimates. This study highlights the critical need for careful evaluation of methods before data analysis.