De novo discovery of conserved gene clusters in microbial genomes with Spacedust.

Authors

Zhang Ruoshi, Mirdita Milot, Söding Johannes

Affiliations

Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany.

School of Biological Sciences, Seoul National University, Seoul, Republic of Korea.

Publication information

Nat Methods. 2025 Sep 15. doi: 10.1038/s41592-025-02816-x.

Abstract

Metagenomics has revolutionized environmental and human-associated microbiome studies. However, the limited fraction of proteins with known biological processes and molecular functions presents a major bottleneck. In prokaryotes and viruses, evolution favors keeping genes participating in the same biological processes colocalized as conserved gene clusters. Conversely, conservation of gene neighborhood indicates functional association. Here we present Spacedust, a tool for systematic, de novo discovery of conserved gene clusters. To find homologous protein matches, Spacedust uses fast and sensitive structure comparison with Foldseek. Partially conserved clusters are detected using novel clustering and order conservation P values. We demonstrate Spacedust's sensitivity with an all-versus-all analysis of 1,308 bacterial genomes, identifying 72,843 conserved gene clusters containing 58% of the 4.2 million genes. It recovered 95% of antiviral defense system clusters annotated by the specialized tool PADLOC. Spacedust's high sensitivity and speed will facilitate the annotation of large numbers of sequenced bacterial, archaeal and viral genomes.
