Revisiting the y-ome of Escherichia coli.

Author information

Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA, 94025 USA.

Department of Microbiology, Blavatnik Institute, Harvard Medical School, 25 Shattuck Street, Boston, MA, 02115 USA.

Publication information

Nucleic Acids Res. 2024 Nov 11;52(20):12201-12207. doi: 10.1093/nar/gkae857.

Abstract

The model organism Escherichia coli K-12 has one of the most extensively annotated genomes in terms of functional characterization, yet a significant number of genes, ∼35%, are still considered poorly characterized. Initially genes without known functional understanding were given 'y' gene names. However, due to inconsistency in changing 'y' names to non-'y' names over the years, gene name alone does not provide sufficient information as to the characterization level of genes. Attempts to characterize y-ome genes, i.e. those that lack experimental evidence for function, are ongoing, and recent categorization based on the level of experimental evidence has helped clarify those genes that are well characterized versus uncharacterized. EcoCyc, the most comprehensive, curated genome database for E. coli K-12 substr. MG1655, has updated this approach by expanding the categories to include Partially characterized genes using a set of computational rules that includes keywords, experimental evidence codes and Gene Ontology terms. Approximately half of the previously categorized y-ome genes are now categorized as Partially characterized, leaving 15.5% (738) as Uncharacterized genes in EcoCyc. This new categorization scheme is searchable in the EcoCyc database, will be updated as new experimental evidence is curated and provides important information for research decisions.
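The abstract says genes are categorized by computational rules that combine keywords, experimental evidence codes and Gene Ontology terms, but it does not list the rules themselves. The sketch below is only an illustration of how such a rule set could be structured. The keyword list, the rule ordering, the "Well characterized" label, the `Gene` record and the `categorize` function are all assumptions made for this example and are not EcoCyc's actual implementation. The evidence codes are the standard GO experimental codes, used here as a stand-in for whatever codes EcoCyc applies.

```python
from dataclasses import dataclass, field

# ASSUMPTION: illustrative keywords only; the abstract does not name the real ones.
UNCHARACTERIZED_KEYWORDS = ("uncharacterized", "putative", "predicted")

# Standard GO experimental evidence codes, used as a stand-in for EcoCyc's codes.
EXPERIMENTAL_EVIDENCE = {"EXP", "IDA", "IMP", "IGI", "IPI", "IEP"}


@dataclass
class Gene:
    """Hypothetical minimal gene record for this sketch."""
    name: str
    product: str
    evidence_codes: set = field(default_factory=set)
    go_terms: set = field(default_factory=set)


def categorize(gene: Gene) -> str:
    """Assign one of three characterization levels using hypothetical rules."""
    has_experiment = bool(gene.evidence_codes & EXPERIMENTAL_EVIDENCE)
    flagged = any(k in gene.product.lower() for k in UNCHARACTERIZED_KEYWORDS)

    # Experimental support and no hedging keyword in the product description.
    if has_experiment and not flagged:
        return "Well characterized"
    # Some support (an experiment or a GO annotation), but not conclusive.
    if has_experiment or gene.go_terms:
        return "Partially characterized"
    # No experimental evidence and no GO annotation at all.
    return "Uncharacterized"


# Hypothetical genes, not real EcoCyc entries.
examples = [
    Gene("geneA", "example hydrolase", {"IDA"}, {"GO:0000001"}),
    Gene("yexB", "putative transporter", {"IMP"}),
    Gene("yexC", "uncharacterized protein"),
]
for g in examples:
    print(g.name, "->", categorize(g))
```

Under rules like these, a gene's category depends on its curated evidence and not on whether its name still begins with "y", which is the distinction the abstract draws.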


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca89/11551758/230dbc4e9b0f/gkae857figgra1.jpg
