Predicting phage-host interaction via hyperbolic Poincaré graph embedding and large-scale protein language technique.

Authors

Pan Jie, Wang Rui, Liu Wenjing, Wang Li, You Zhuhong, Li Yuechao, Duan Zhemeng, Huang Qinghua, Feng Jie, Sun Yanmei, Wang Shiwei

Affiliations

Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China.

Department of Ophthalmology, The First Affiliated Hospital of Northwest University, 30 Fenxiang, the South Avenue, Xi'an, Shaanxi 710002, China.

Publication information

iScience. 2024 Dec 19;28(1):111647. doi: 10.1016/j.isci.2024.111647. eCollection 2025 Jan 17.

Abstract

Bacteriophages (phages) are increasingly viewed as a promising alternative for the treatment of antibiotic-resistant bacterial infections. However, the diversity of host ranges complicates the identification of target phages, and existing computational tools often fail to accurately identify phages across different bacterial species. In this study, we present GE-PHI, a machine-learning-based model that predicts phage-host interactions (PHIs) by integrating a knowledge graph embedding algorithm with a large-scale protein language model. First, a phage-host heterogeneous association network (PHAN) was constructed that incorporated phage-phage and host-host similarity networks. Then, multi-relational Poincaré graph embedding (MuRP) was used to extract topological patterns from the network. Additionally, we employed the ESM-2 protein language model to capture evolutionary information from phage tail proteins and host-receptor-binding proteins. GE-PHI achieved a cross-validation area under the curve (AUC) of up to 0.9453 in silico and maintained this performance in case studies. This study provides insights into machine-learning-guided phage therapeutics and diagnostics in microbial engineering.
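As an illustration of the hyperbolic geometry behind Poincaré graph embeddings, the following minimal Python sketch computes the standard geodesic distance between two points in the unit Poincaré ball. It is not taken from the GE-PHI implementation: the function name `poincare_distance` and the example coordinates are hypothetical, and MuRP additionally learns relation-specific transformations that are omitted here.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    Standard formula (an illustration, not the GE-PHI code):
        d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Hypothetical 2-D embeddings: the same Euclidean gap (0.1) is a much
# larger hyperbolic distance near the boundary than near the origin.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.8, 0.0], [0.9, 0.0])
```

Because distances grow rapidly toward the boundary of the ball, hierarchical or tree-like structure in a network can be represented in few dimensions, which is the usual motivation for embedding graphs in hyperbolic rather than Euclidean space.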

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2807/11761876/1351075746f1/fx1.jpg
