Tutorial: guidelines for the use of machine learning methods to mine genomes and proteomes for antibiotic discovery.

Authors

Wan Fangping, Torres Marcelo D T, Guan Changge, de la Fuente-Nunez Cesar

Affiliations

Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

Nat Protoc. 2025 May 14. doi: 10.1038/s41596-025-01144-w.

Abstract

Genomes and proteomes constitute a rich reservoir of molecular diversity. However, they have remained underexplored because of a lack of appropriate tools. In recent years, computational approaches have been developed to mine this unexplored biological information, or dark matter, accelerating the discovery of new antibiotic molecules. Such efforts have yielded a wide range of new molecules. These include peptides released via predicted proteolytic cleavage of larger proteins, termed 'encrypted peptides', which have been found to be widespread in nature. Molecules encoded by and translated from small open reading frames within genomic sequences have also been uncovered, further expanding the landscape of bioactive compounds. Here, we discuss computational approaches, including machine learning and artificial intelligence (AI) tools, which have been used to date to identify antimicrobial compounds, with a special emphasis on peptides. We also propose potential avenues for future exploration in this rapidly evolving field. Moreover, we provide an overview of the experimental methods commonly used to validate these computational predictions. We anticipate that efforts combining cutting-edge AI and experimental approaches for biological sequence mining will reveal new insights into host immunity and continue to accelerate discoveries in the fields of antibiotics and infectious diseases.
