Automated correction of genome sequence errors.

Author information

Gajer Pawel, Schatz Michael, Salzberg Steven L

Affiliation

The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA.

Publication information

Nucleic Acids Res. 2004 Jan 26;32(2):562-9. doi: 10.1093/nar/gkh216. Print 2004.

Abstract

By using information from an assembly of a genome, a new program called AutoEditor significantly improves base calling accuracy over that achieved by previous algorithms. This in turn improves the overall accuracy of genome sequences and facilitates the use of these sequences for polymorphism discovery. We describe the algorithm and its application in a large set of recent genome sequencing projects. The number of erroneous base calls in these projects was reduced by 80%. In an analysis of over one million corrections, we found that AutoEditor made just one error per 8828 corrections. By substantially increasing the accuracy of base calling, AutoEditor can dramatically accelerate the process of finishing genomes, which involves closing all gaps and ensuring minimum quality standards for the final sequence. It also greatly improves our ability to discover single nucleotide polymorphisms (SNPs) between closely related strains and isolates of the same species.
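To put the reported figure of one error per 8828 corrections in more familiar terms, the following minimal sketch converts it to a per-correction error rate, an expected number of wrong corrections, and a Phred-style quality value. It is an illustration only, not part of AutoEditor: the function names are hypothetical, the total of exactly 1,000,000 corrections is an assumption (the abstract says only "over one million"), and expressing the rate on the Phred scale is our own framing rather than something the abstract states.

```python
import math

# From the abstract: one erroneous correction per 8828 corrections.
CORRECTIONS_PER_ERROR = 8828

# Assumption for illustration: the abstract says only "over one million".
ASSUMED_TOTAL_CORRECTIONS = 1_000_000


def per_correction_error_rate(corrections_per_error: int) -> float:
    """Return the fraction of corrections that are themselves wrong."""
    return 1.0 / corrections_per_error


def phred_equivalent(error_rate: float) -> float:
    """Express an error probability on the Phred scale: Q = -10 * log10(p)."""
    return -10.0 * math.log10(error_rate)


rate = per_correction_error_rate(CORRECTIONS_PER_ERROR)
expected_wrong = ASSUMED_TOTAL_CORRECTIONS * rate
quality = phred_equivalent(rate)

print(f"Error rate per correction: {rate:.6%}")
print(f"Expected wrong corrections in {ASSUMED_TOTAL_CORRECTIONS:,}: {expected_wrong:.0f}")
print(f"Phred-style equivalent: Q{quality:.1f}")
```

Under these assumptions, the stated rate corresponds to roughly a hundred wrong corrections per million and a Phred-style value just under Q40.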
