Semantic search using protein large language models detects class II microcins in bacterial genomes.

Authors

Kulikova Anastasiya V, Parker Jennifer K, Davies Bryan W, Wilke Claus O

Affiliations

Department of Integrative Biology, University of Texas at Austin, Austin, Texas, USA.

Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX, USA.

Publication

bioRxiv. 2023 Nov 15:2023.11.15.567263. doi: 10.1101/2023.11.15.567263.

DOI:10.1101/2023.11.15.567263
PMID:38014091
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10680697/
Abstract

Class II microcins are antimicrobial peptides that have shown some potential as novel antibiotics. However, to date only ten class II microcins have been described, and discovery of novel microcins has been hampered by their short length and high sequence divergence. Here, we ask if we can use numerical embeddings generated by protein large language models to detect microcins in bacterial genome assemblies and whether this method can outperform sequence-based methods such as BLAST. We find that embeddings detect known class II microcins much more reliably than does BLAST and that any two microcins tend to have a small distance in embedding space even though they typically are highly diverged at the sequence level. In datasets of Escherichia coli, Klebsiella spp., and Enterobacter spp. genomes, we further find novel putative microcins that were previously missed by sequence-based search methods.
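
The idea of ranking candidate proteins by their distance to known microcins in embedding space can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the three-dimensional vectors stand in for real protein language model embeddings (which have hundreds or thousands of dimensions), the names `known_1`, `orf_1`, and so on are hypothetical, and mean pooling plus Euclidean distance are common choices rather than a statement of what the authors used.

```python
import numpy as np


def mean_pool(per_residue: np.ndarray) -> np.ndarray:
    """Collapse a (length, dim) per-residue embedding into one (dim,) vector.

    Averaging over residues is an assumed pooling choice, not taken from
    the abstract.
    """
    return per_residue.mean(axis=0)


def min_distance_to_known(candidate: np.ndarray, known: np.ndarray) -> float:
    """Euclidean distance from one candidate to its closest known microcin."""
    return float(np.linalg.norm(known - candidate, axis=1).min())


def rank_candidates(candidates: dict, known: dict) -> list:
    """Return (name, distance) pairs sorted from closest to farthest."""
    known_matrix = np.stack([np.asarray(v, dtype=float) for v in known.values()])
    scored = [
        (name, min_distance_to_known(np.asarray(vec, dtype=float), known_matrix))
        for name, vec in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Toy stand-ins for embeddings of known class II microcins (hypothetical).
known_microcins = {
    "known_1": [1.0, 0.9, 0.1],
    "known_2": [0.9, 1.0, 0.2],
}

# Toy stand-ins for embeddings of open reading frames from a genome (hypothetical).
genome_orfs = {
    "orf_1": [0.95, 0.95, 0.15],  # deliberately placed near the known microcins
    "orf_2": [-1.0, 0.2, 0.9],
    "orf_3": [0.0, -1.0, 0.5],
}

ranked = rank_candidates(genome_orfs, known_microcins)
for name, distance in ranked:
    print(f"{name}\t{distance:.3f}")
```

In this toy setup the candidate placed near the known vectors ranks first. In a real search, the vectors would come from a protein language model and a distance cutoff would have to be chosen empirically.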


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4e5/10680697/a57234441280/nihpp-2023.11.15.567263v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4e5/10680697/6b8f32ae37a0/nihpp-2023.11.15.567263v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4e5/10680697/322455b1eabc/nihpp-2023.11.15.567263v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4e5/10680697/c653c37c4167/nihpp-2023.11.15.567263v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4e5/10680697/80892a631601/nihpp-2023.11.15.567263v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4e5/10680697/96ab85c9cff2/nihpp-2023.11.15.567263v1-f0006.jpg

Similar articles

1
Semantic search using protein large language models detects class II microcins in bacterial genomes.
bioRxiv. 2023 Nov 15:2023.11.15.567263. doi: 10.1101/2023.11.15.567263.
2
Semantic search using protein large language models detects class II microcins in bacterial genomes.
mSystems. 2024 Oct 22;9(10):e0104424. doi: 10.1128/msystems.01044-24. Epub 2024 Sep 18.
3
Evidence for Widespread Class II Microcins in Enterobacterales Genomes.
Appl Environ Microbiol. 2022 Dec 13;88(23):e0148622. doi: 10.1128/aem.01486-22. Epub 2022 Nov 17.
4
Expanding the toolbox: Novel class IIb microcins show activity against Gram-negative ESKAPE and plant pathogens.
bioRxiv. 2024 Aug 29:2023.12.05.570296. doi: 10.1101/2023.12.05.570296.
5
Evaluating the Potential and Synergetic Effects of Microcins against Multidrug-Resistant Enterobacteriaceae.
Microbiol Spectr. 2022 Jun 29;10(3):e0275221. doi: 10.1128/spectrum.02752-21. Epub 2022 May 11.
6
Microcins in Enterobacteriaceae: Peptide Antimicrobials in the Eco-Active Intestinal Chemosphere.
Front Microbiol. 2019 Oct 9;10:2261. doi: 10.3389/fmicb.2019.02261. eCollection 2019.
7
Bacteriocins to Thwart Bacterial Resistance in Gram Negative Bacteria.
Front Microbiol. 2020 Nov 9;11:586433. doi: 10.3389/fmicb.2020.586433. eCollection 2020.
8
Low-molecular-weight post-translationally modified microcins.
Mol Microbiol. 2007 Sep;65(6):1380-94. doi: 10.1111/j.1365-2958.2007.05874.x. Epub 2007 Aug 17.
9
Comparative analysis of chromosome-encoded microcins.
Antimicrob Agents Chemother. 2006 Apr;50(4):1411-8. doi: 10.1128/AAC.50.4.1411-1418.2006.
10
Siderophore-Microcins in Escherichia coli: Determinants of Digestive Colonization, the First Step Toward Virulence.
Front Cell Infect Microbiol. 2020 Aug 21;10:381. doi: 10.3389/fcimb.2020.00381. eCollection 2020.
