Ladner Jason T, Beitzel Brett, Chain Patrick S G, Davenport Matthew G, Donaldson Eric F, Frieman Matthew, Kugelman Jeffrey R, Kuhn Jens H, O'Rear Jules, Sabeti Pardis C, Wentworth David E, Wiley Michael R, Yu Guo-Yun, Sozhamannan Shanmuga, Bradburne Christopher, Palacios Gustavo
Center for Genome Sciences, United States Army Medical Research Institute of Infectious Diseases, Fort Detrick, Maryland, USA
mBio. 2014 Jun 17;5(3):e01360-14. doi: 10.1128/mBio.01360-14.
Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five "standard" categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques.