Predicting the optimal growth temperatures of prokaryotes using only genome derived features.

Author information

Department of Cell Biology, and The Helen L. and Martin S. Kimmel Center for Biology and Medicine, Skirball Institute of Biomolecular Medicine, New York University School of Medicine, New York, New York, USA.

Publication information

Bioinformatics. 2019 Sep 15;35(18):3224-3231. doi: 10.1093/bioinformatics/btz059.

Abstract

MOTIVATION

Optimal growth temperature is a fundamental characteristic of all living organisms. Knowledge of this temperature is central to the study of a prokaryote, to the thermal stability and temperature-dependent activity of its genes, and to the bioprospecting of its genome for thermally adapted proteins. While high-throughput sequencing methods have dramatically increased the availability of genomic information, the growth temperatures of the source organisms are often unknown. This limits the study and technological application of these species and their genomes. Here, we present a novel method for predicting the growth temperatures of prokaryotes using only genomic sequences.

RESULTS

By applying the reverse ecology principle that an organism's genome includes identifiable adaptations to its native environment, we can predict a species' optimal growth temperature with a root-mean-square error of 5.17°C and a coefficient of determination of 0.835. The accuracy can be further improved by restricting the prediction to specific taxonomic clades or by excluding psychrophiles. This method provides a valuable tool for the rapid calculation of an organism's growth temperature when only the genome sequence is known.
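For readers unfamiliar with the two accuracy measures quoted above, the following is a minimal sketch of how a root-mean-square error and a coefficient of determination could be computed from observed and predicted temperatures. It assumes the standard textbook definitions (the abstract does not state which variant of the coefficient of determination was used), and the function names and temperature values are hypothetical illustrations, not the authors' code or data.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (here degrees C)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only; these are NOT data from the paper.
observed_ogt = [10.0, 30.0, 37.0, 60.0, 85.0]   # measured optimal growth temperatures
predicted_ogt = [14.0, 28.0, 40.0, 55.0, 80.0]  # model predictions for the same species

print(f"RMSE = {rmse(observed_ogt, predicted_ogt):.2f} degrees C")
print(f"R^2  = {r_squared(observed_ogt, predicted_ogt):.3f}")
```

Under these definitions, the error is reported in degrees, so it can be read directly as a typical prediction miss, while the coefficient of determination is unitless and describes the fraction of variance in observed temperatures that the predictions account for.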

AVAILABILITY AND IMPLEMENTATION

Source code, genomes analyzed and features calculated are available at: https://github.com/DavidBSauer/OGT_prediction.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
