Detecting and analyzing DNA sequencing errors: toward a higher quality of the Bacillus subtilis genome sequence.

Author information

Médigue C, Rose M, Viari A, Danchin A

Affiliations

Institut Pasteur REG, F-75724 Paris Cedex 15, France. claudine.medigue@snv.jussieu.fr

Publication information

Genome Res. 1999 Nov;9(11):1116-27. doi: 10.1101/gr.9.11.1116.

Abstract

During the determination of a DNA sequence, the introduction of artifactual frameshifts and/or in-frame stop codons in putative genes can lead to misprediction of gene products. Detection of such errors with a method based on protein similarity matching is only possible when related sequences are available in databases. Here, we present a method to detect frameshift errors in DNA sequences that is based on the intrinsic properties of the coding sequences. It combines the results of two analyses, the search for translational initiation/termination sites and the prediction of coding regions. This method was used to screen the complete Bacillus subtilis genome sequence and the regions flanking putative errors were resequenced for verification. This procedure allowed us to correct the sequence and to analyze in detail the nature of the errors. Interestingly, in several cases in-frame termination codons or frameshifts were not sequencing errors but confirmed to be present in the chromosome, indicating that the genes are either nonfunctional (pseudogenes) or subject to regulatory processes such as programmed translational frameshifts. The method can be used for checking the quality of the sequences produced by any prokaryotic genome sequencing project.
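To illustrate the kind of signal such a screen looks for, here is a minimal Python sketch. It is not the authors' method: the abstract does not give their algorithm, and this sketch replaces coding-region prediction with a much cruder test, namely a long stop-free stretch that ends in one reading frame close to where another long stop-free stretch begins in a different frame. The function names `open_frames` and `candidate_frameshifts`, the parameters `min_codons` and `window`, and the restriction to the forward strand are all assumptions made for illustration.

```python
# Hypothetical illustration only: a crude frame-switch heuristic,
# not the procedure used in the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_frames(seq, min_codons=30):
    """Return (frame, start, end) for long stop-free stretches.

    Only the forward strand is scanned. Coordinates are 0-based and
    `end` is exclusive; a stretch ends just before a stop codon or at
    the last complete codon of the sequence.
    """
    seq = seq.upper()
    stretches = []
    for frame in range(3):
        start = frame
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    stretches.append((frame, start, pos))
                start = pos + 3
        # Trailing stretch that runs to the end of the sequence.
        end = frame + ((len(seq) - frame) // 3) * 3
        if (end - start) // 3 >= min_codons:
            stretches.append((frame, start, end))
    return stretches


def candidate_frameshifts(seq, min_codons=30, window=30):
    """Pair up long open stretches that suggest a switch of frame.

    A pair is reported when the second stretch lies in a different
    frame, starts and ends downstream of the first, and begins within
    `window` nucleotides of the point where the first one ends.
    """
    stretches = sorted(open_frames(seq, min_codons), key=lambda s: s[1])
    candidates = []
    for i, (f1, s1, e1) in enumerate(stretches):
        for f2, s2, e2 in stretches[i + 1:]:
            if f1 != f2 and s1 < s2 and e1 < e2 and abs(s2 - e1) <= window:
                candidates.append(((f1, s1, e1), (f2, s2, e2)))
    return candidates


if __name__ == "__main__":
    # Synthetic gene: the repeat unit has no stop in frame 0 but
    # contains stops in both shifted frames.
    unit = "CTAACTGAC"
    gene = "ATG" + unit * 20 + "TAA"
    shifted = gene[:93] + gene[94:]  # simulate a single-base deletion
    print(candidate_frameshifts(gene))
    print(candidate_frameshifts(shifted))
```

A hit from a heuristic like this would only mark a region for closer inspection. As the abstract notes, resequencing is what distinguishes a true sequencing error from a pseudogene or a programmed translational frameshift.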

Similar articles

Frame: detection of genomic sequencing errors.
Bioinformatics. 1998;14(4):367-71. doi: 10.1093/bioinformatics/14.4.367.

[Whole Genome Sequence Determination and Analysis of Strain CGMCC 12426].
Zhongguo Yi Xue Ke Xue Yuan Xue Bao. 2019 Jun 30;41(3):307-314. doi: 10.3881/j.issn.1000-503X.10370.
