Addressing challenges in the production and analysis of Illumina sequencing data.

Affiliation

Max Planck Institute for Evolutionary Anthropology, Department of Evolutionary Genetics, Deutscher Platz 6, 04103 Leipzig, Germany.

Publication information

BMC Genomics. 2011 Jul 29;12:382. doi: 10.1186/1471-2164-12-382.

Abstract

Advances in DNA sequencing technologies have made it possible to generate large amounts of sequence data very rapidly and at substantially lower cost than capillary sequencing. These new technologies have specific characteristics and limitations that must either be considered during project design or addressed during data analysis. Specialist skills, at both the laboratory and the computational stages of project design and analysis, are crucial to the generation of high-quality data from these new platforms. The Illumina sequencers (including the Genome Analyzers I/II/IIe/IIx and the new HiScan and HiSeq) represent a widely used platform providing parallel readout of several hundred million immobilized sequences using fluorescent-dye reversible-terminator chemistry. Sequencing library quality, sample handling, instrument settings and sequencing chemistry have a strong impact on sequencing run quality. The presence of adapter chimeras and adapter sequences at the end of short-insert molecules, as well as increased error rates and short read lengths, complicates many computational analyses. We discuss here some of the factors that influence the frequency and severity of these problems and provide solutions for circumventing them. Further, we present a set of general principles for good analysis practice that enable problems with sequencing runs to be identified and dealt with.
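
To make the short-insert adapter problem concrete: when the sequenced molecule is shorter than the read length, the read runs into the adapter, so its 3' end carries full or truncated adapter sequence. The following is a minimal sketch of how such a tail might be detected and removed. It is not the method used in the paper; the function name `trim_adapter`, its parameters, and the adapter string in the usage example are illustrative assumptions, and production tools use more careful scoring.

```python
def trim_adapter(read, adapter, min_overlap=3, max_mismatches=0):
    """Return `read` with a 3' adapter (full, or truncated by the read end) removed.

    Hypothetical helper for illustration only. Scans left to right and cuts
    at the first position where the rest of the read matches the start of
    `adapter` with at most `max_mismatches` mismatching bases.
    """
    # Require at least `min_overlap` bases of adapter to avoid trimming
    # on very short chance matches at the read end.
    for start in range(len(read) - min_overlap + 1):
        # Compare the read from `start` against the adapter prefix;
        # zip stops at the shorter of the two, which handles truncation.
        segment = read[start:start + len(adapter)]
        mismatches = sum(1 for a, b in zip(segment, adapter) if a != b)
        if mismatches <= max_mismatches:
            return read[:start]
    return read


# Usage: the adapter below is a placeholder sequence chosen for the example.
adapter = "AGATCGGAAGAGC"
short_insert_read = "ACGTACGT" + adapter[:7]   # insert followed by a truncated adapter
trimmed = trim_adapter(short_insert_read, adapter)
untouched = trim_adapter("ACGTACGTTT", adapter)
```

The minimum overlap and mismatch allowance trade sensitivity against false trimming: a lower overlap catches shorter adapter remnants but removes more genuine bases by chance, and with higher error rates toward the read end a small mismatch allowance is usually needed.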

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/38c9/3163567/90487d7d956f/1471-2164-12-382-1.jpg
