

Annotation-free prediction of microbial dioxygen utilization.

Affiliations

Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA.

Division of Geological & Planetary Sciences, California Institute of Technology, Pasadena, California, USA.

Publication information

mSystems. 2024 Oct 22;9(10):e0076324. doi: 10.1128/msystems.00763-24. Epub 2024 Sep 4.

DOI:10.1128/msystems.00763-24
PMID:39230322
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11494890/
Abstract

Aerobes require dioxygen (O2) to grow; anaerobes do not. However, nearly all microbes (aerobes, anaerobes, and facultative organisms alike) express enzymes whose substrates include O2, if only for detoxification. This presents a challenge when trying to assess which organisms are aerobic from genomic data alone. This challenge can be overcome by noting that O2 utilization has wide-ranging effects on microbes: aerobes typically have larger genomes encoding distinctive O2-utilizing enzymes, for example. These effects permit high-quality prediction of O2 utilization from annotated genome sequences, with several models displaying ≈80% accuracy on a ternary classification task for which blind guessing is only 33% accurate. Since genome annotation is compute-intensive and relies on many assumptions, we asked if annotation-free methods also perform well. We discovered that simple and efficient models based entirely on genomic sequence content (e.g., triplets of amino acids) perform as well as intensive annotation-based classifiers, enabling rapid processing of genomes. We further show that amino acid trimers are useful because they encode information about protein composition and phylogeny. To showcase the utility of rapid prediction, we estimated the prevalence of aerobes and anaerobes in diverse natural environments cataloged in the Earth Microbiome Project. Focusing on a well-studied O2 gradient in the Black Sea, we found quantitative correspondence between local chemistry (O2:sulfide concentration ratio) and the composition of microbial communities. We, therefore, suggest that statistical methods like ours might be used to estimate, or "sense," pivotal features of the chemical environment using DNA sequencing data.

Importance

We now have access to sequence data from a wide variety of natural environments. These data document a bewildering diversity of microbes, many known only from their genomes. Physiology, an organism's capacity to engage metabolically with its environment, may provide a more useful lens than taxonomy for understanding microbial communities. As an example of this broader principle, we developed algorithms that accurately predict microbial dioxygen utilization directly from genome sequences without annotating genes, e.g., by considering only the amino acids in protein sequences. Annotation-free algorithms enable rapid characterization of natural samples, highlighting quantitative correspondence between sequences and local O2 levels in a data set from the Black Sea. This example suggests that DNA sequencing might be repurposed as a multi-pronged chemical sensor, estimating concentrations of O2 and other key facets of complex natural settings.
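
The abstract names amino acid trimers as one annotation-free feature. As a rough illustration only, and not the authors' pipeline, the sketch below turns a set of protein sequences into a fixed-length vector of trimer frequencies. The function name `trimer_frequencies`, the restriction to the 20 standard residues, and the normalization to frequencies are all assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids are considered.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20^3 = 8000 possible trimers, in a fixed order, so every genome
# maps to a vector with the same layout.
ALL_TRIMERS = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]


def trimer_frequencies(proteins):
    """Return a list of 8000 trimer frequencies for a set of proteins.

    `proteins` is an iterable of amino acid strings (hypothetical input
    format). Overlapping trimers are counted within each sequence;
    trimers containing non-standard residues (e.g. 'X') are ignored.
    """
    counts = Counter()
    for seq in proteins:
        seq = seq.upper()
        for i in range(len(seq) - 2):
            counts[seq[i:i + 3]] += 1

    total = sum(counts[t] for t in ALL_TRIMERS)
    if total == 0:
        # No usable trimers: return an all-zero vector.
        return [0.0] * len(ALL_TRIMERS)
    return [counts[t] / total for t in ALL_TRIMERS]


if __name__ == "__main__":
    features = trimer_frequencies(["MKVLA", "MKV"])
    print(len(features), features[ALL_TRIMERS.index("MKV")])
```

A vector like this could then be passed to any standard classifier for the three-way task the abstract describes; which model to use is left open here because the abstract does not say.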


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c57b/11494890/e44a472b12cb/msystems.00763-24.f002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c57b/11494890/0d8686ae5600/msystems.00763-24.f001.jpg


