Re-annotation of the genome sequence of Mycobacterium tuberculosis H37Rv.

Authors

Camus Jean-Christophe, Pryor Melinda J, Médigue Claudine, Cole Stewart T

Affiliations

1. Unité de Génétique Moléculaire Bactérienne, Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex, France.

2. Annotation-Bases de Données (PT4), Génopole, Institut Pasteur, Paris, France.

Publication information

Microbiology (Reading). 2002 Oct;148(Pt 10):2967-2973. doi: 10.1099/00221287-148-10-2967.

Abstract

Original genome annotations need to be regularly updated if the information they contain is to remain accurate and relevant. Here the complete re-annotation of the genome sequence of Mycobacterium tuberculosis strain H37Rv is presented almost 4 years after the first submission. Eighty-two new protein-coding sequences (CDS) have been included and 22 of these have a predicted function. The majority were identified by manual or automated re-analysis of the genome and most of them were shorter than the 100 codon cut-off used in the initial genome analysis. The functional classification of 643 CDS has been changed based principally on recent sequence comparisons and new experimental data from the literature. More than 300 gene names and over 1000 targeted citations have been added and the lengths of 60 genes have been modified. Presently, it is possible to assign a function to 2058 proteins (52% of the 3995 proteins predicted) and only 376 putative proteins share no homology with known proteins and thus could be unique to M. tuberculosis.
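The proportions quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `share` is hypothetical, and the counts are copied directly from the abstract rather than from the underlying annotation files.

```python
def share(part: int, total: int) -> float:
    """Return part/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * part / total, 1)


# Counts as stated in the abstract.
TOTAL_PROTEINS = 3995      # proteins predicted
WITH_FUNCTION = 2058       # proteins with an assigned function
NO_HOMOLOGY = 376          # putative proteins with no homology to known proteins
NEW_CDS = 82               # newly added protein-coding sequences
NEW_CDS_WITH_FUNCTION = 22 # new CDS with a predicted function

if __name__ == "__main__":
    print("assigned function:", share(WITH_FUNCTION, TOTAL_PROTEINS), "%")
    print("no known homology:", share(NO_HOMOLOGY, TOTAL_PROTEINS), "%")
    print("new CDS with predicted function:", share(NEW_CDS_WITH_FUNCTION, NEW_CDS), "%")
```

The first figure comes out at 51.5%, which matches the 52% reported in the abstract once rounded to a whole number.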

