
Detecting and analyzing DNA sequencing errors: toward a higher quality of the Bacillus subtilis genome sequence.

Authors

Médigue C, Rose M, Viari A, Danchin A

Affiliation

Institut Pasteur REG, F-75724 Paris Cedex 15, France. claudine.medigue@snv.jussieu.fr

Publication information

Genome Res. 1999 Nov;9(11):1116-27. doi: 10.1101/gr.9.11.1116.

Abstract

During the determination of a DNA sequence, the introduction of artifactual frameshifts and/or in-frame stop codons in putative genes can lead to misprediction of gene products. Detection of such errors with a method based on protein similarity matching is only possible when related sequences are available in databases. Here, we present a method to detect frameshift errors in DNA sequences that is based on the intrinsic properties of the coding sequences. It combines the results of two analyses, the search for translational initiation/termination sites and the prediction of coding regions. This method was used to screen the complete Bacillus subtilis genome sequence and the regions flanking putative errors were resequenced for verification. This procedure allowed us to correct the sequence and to analyze in detail the nature of the errors. Interestingly, in several cases in-frame termination codons or frameshifts were not sequencing errors but confirmed to be present in the chromosome, indicating that the genes are either nonfunctional (pseudogenes) or subject to regulatory processes such as programmed translational frameshifts. The method can be used for checking the quality of the sequences produced by any prokaryotic genome sequencing project.
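As a rough illustration of the idea in the abstract (a frameshift error splits one gene into two adjacent open stretches in different reading frames), the following Python sketch flags such junctions on the forward strand. It is a toy heuristic under stated assumptions, not the authors' method: stop-free stretch length stands in for a real coding-region predictor, start-site search is omitted, and the names `open_stretches`, `candidate_frameshifts`, `min_codons`, and `window` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_stretches(seq, frame, min_codons=30):
    """Return (start, end) nucleotide spans with no stop codon in one forward frame.

    start is inclusive, end is exclusive. Stretch length is only a crude
    stand-in (an assumption) for a proper coding-region prediction.
    """
    stretches = []
    start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            if (pos - start) // 3 >= min_codons:
                stretches.append((start, pos))
            start = pos + 3
        pos += 3
    # Keep the stretch that runs to the end of the sequence, if long enough.
    if (pos - start) // 3 >= min_codons:
        stretches.append((start, pos))
    return stretches


def candidate_frameshifts(seq, min_codons=30, window=30):
    """Flag places where a long open stretch ends close to where another
    long open stretch in a different frame begins.

    Returns tuples (upstream_frame, upstream_end, downstream_frame,
    downstream_start). Forward strand only; thresholds are illustrative.
    """
    seq = seq.upper()
    by_frame = {f: open_stretches(seq, f, min_codons) for f in range(3)}
    hits = []
    for f1 in range(3):
        for s1, e1 in by_frame[f1]:
            for f2 in range(3):
                if f2 == f1:
                    continue
                for s2, e2 in by_frame[f2]:
                    # Downstream stretch must start later, extend further,
                    # and meet the upstream stretch within the window.
                    if s1 < s2 and e1 < e2 and abs(s2 - e1) <= window:
                        hits.append((f1, e1, f2, s2))
    return hits


if __name__ == "__main__":
    # Synthetic 9-nt unit that is stop-free in frame 0 only.
    unit = "CTAACTAGC"
    intact = unit * 20
    shifted = unit * 10 + "G" + unit * 10  # one inserted base at position 90
    print(candidate_frameshifts(intact))   # []
    print(candidate_frameshifts(shifted))  # [(0, 96, 1, 85)]
```

In the synthetic example the inserted base at position 90 falls between the reported downstream start (85) and upstream end (96), which is the kind of region one would then resequence. Real genomes would need both strands, a proper coding-potential model, and a check against true pseudogenes or programmed frameshifts, which the abstract notes also occur.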



