The Genome Center at Washington University, St. Louis, Missouri 63108, USA.
Brief Bioinform. 2010 Sep;11(5):484-98. doi: 10.1093/bib/bbq016. Epub 2010 Jun 2.
Massively parallel sequencing technologies continue to alter the study of human genetics. As the cost of sequencing declines, next-generation sequencing (NGS) instruments and datasets will become increasingly accessible to the wider research community. Investigators are understandably eager to harness the power of these new technologies. Sequencing human genomes on these platforms, however, presents numerous production and bioinformatics challenges. Production issues such as sample contamination, library chimaeras and variable run quality have become increasingly problematic in the transition from technology development lab to production floor. Analysis of NGS data also remains challenging, particularly given the short read lengths (35-250 bp) and the sheer volume of data. The development of streamlined, highly automated pipelines for data analysis is critical for the transition from technology adoption to accelerated research and publication. This review aims to describe the state of current NGS technologies, as well as the strategies that enable NGS users to characterize the full spectrum of DNA sequence variation in humans.