Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing.

Affiliation

Aix-Marseille Université, CNRS, IRD, UMR 6116 - IMEP, Equipe Evolution Génome Environnement, Centre Saint-Charles, Case 36, 3 place Victor Hugo, 13331 Marseille Cedex 3, France.

Publication information

BMC Genomics. 2011 May 19;12:245. doi: 10.1186/1471-2164-12-245.

Abstract

BACKGROUND

The rapid evolution of 454 GS-FLX sequencing technology has not been accompanied by a reassessment of the quality and accuracy of the sequences obtained. Current strategies for decision-making and error correction are based on an initial analysis by Huse et al. in 2007 of experimental sequences from the older GS20 system. Here we analyze the quality of 454 sequencing data and identify factors that contribute to sequencing error, using an extensive dataset of Roche control DNA fragments.

RESULTS

We obtained a mean error rate of 1.07% for 454 sequences. More importantly, errors were not randomly distributed: the error rate occasionally rose above 50% at certain positions, and its distribution was linked to several experimental variables. The main factors related to error are the presence of homopolymers, position in the sequence, size of the sequence and, for insertion and deletion errors, spatial localization in PT plates. These factors can be described by seven variables. No single variable accounts for the error rate distribution, but most of the variation is explained by the combination of all seven.
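To make the quantities in this paragraph concrete, the following is a minimal Python sketch of how a per-position error rate and homopolymer runs could be computed when the reference sequence is known, as it is for control fragments. It is not the authors' pipeline. The function names, the pre-aligned input format (reads padded to the reference length, `-` for a deleted base, `.` for an uncovered position) and the omission of insertions are all assumptions made for illustration.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=2):
    """Return (start, base, length) for each run of identical bases
    that is at least min_len long."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


def per_position_error_rate(reference, aligned_reads):
    """Fraction of reads that disagree with the reference at each position.

    Assumed input format (hypothetical): every read is already aligned and
    has the same length as the reference; '-' marks a deleted base and '.'
    marks a position the read does not cover. Insertions are not modeled.
    Positions with no coverage yield None.
    """
    errors = [0] * len(reference)
    covered = [0] * len(reference)
    for read in aligned_reads:
        for i, (ref_base, read_base) in enumerate(zip(reference, read)):
            if read_base == ".":
                continue  # position not covered by this read
            covered[i] += 1
            if read_base != ref_base:
                errors[i] += 1
    return [e / c if c else None for e, c in zip(errors, covered)]


reference = "ACGGGTAA"
reads = ["ACGGGTAA", "ACGG-TAA", "ACGGGTA.", "ACGGGTAA"]
rates = per_position_error_rate(reference, reads)
runs = homopolymer_runs(reference)
```

In this toy input the only error is a deletion inside the `GGG` run, so the per-position rate is nonzero only at that position. Comparing `rates` against the spans returned by `homopolymer_runs` is one simple way to check whether errors cluster in homopolymers.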

CONCLUSIONS

The pattern identified here calls for the use of internal controls and error-correcting base callers to correct for errors when available (e.g. when sequencing amplicons). For shotgun libraries, the use of both sequencing primers and deep coverage, combined with random sequencing primer sites, should partly compensate for even high error rates, although distinguishing low-frequency alleles from errors may prove more difficult than previously thought.
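The difficulty of separating low-frequency alleles from errors can be illustrated with a simple binomial calculation. The sketch below is not part of the study: it assumes that errors at a position are independent across reads and occur at a fixed rate, which is a deliberate simplification given that the abstract reports errors are not randomly distributed. The function name and the coverage and read counts in the usage lines are hypothetical.

```python
from math import comb


def prob_at_least_k_errors(n, k, error_rate):
    """P(X >= k) for X ~ Binomial(n, error_rate): the chance that k or more
    of n reads show a variant base at one position through error alone.

    Assumes independent errors at a constant rate (a simplification).
    """
    return sum(
        comb(n, i) * error_rate**i * (1 - error_rate) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical scenario: 100x coverage, 5 reads carrying a variant base.
p_mean = prob_at_least_k_errors(100, 5, 0.0107)  # mean error rate from the abstract
p_hotspot = prob_at_least_k_errors(100, 5, 0.5)  # a position with 50% error rate
```

Under these assumptions, five variant reads are unlikely to arise by chance at the mean error rate but are almost certain at a position where the error rate approaches 50%, so a single global threshold cannot separate rare alleles from errors at every position.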

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0026/3116506/bcf4382a61f0/1471-2164-12-245-1.jpg
