Wei Ankang, Zhan Huanghan, Xiao Zhen, Zhao Weizhong, Jiang Xingpeng
Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan 430079, China.
School of Computer Science, Central China Normal University, Wuhan 430079, China.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae708.
Bacterial resistance has emerged as one of the greatest threats to human health, and phages have shown tremendous potential against drug-resistant bacteria because they lyse their hosts. Identifying phage-host interactions (PHI) is therefore crucial for treating bacterial infections. Some existing computational methods for predicting PHI achieve suboptimal prediction efficiency because they draw on only a limited range of information types. Although some supporting information has emerged, models that use it generalize poorly because the underlying databases are small. In addition, most existing models overlook the sparsity of association data, which also severely degrades their predictive performance. In this study, we propose a dual-view sparse network model (DSPHI) for predicting PHI that leverages logical probability theory and network sparsification. Specifically, we first construct separate similarity networks from the sequences of phages and of hosts, and then sparsify these networks so that the model focuses on key information during learning, thereby improving prediction efficiency. Next, we use logical probability theory to compute high-order logical information between phages (hosts), referred to as mutual information. We then attach this information, in the form of nodes, to the sparse phage (host) similarity network. The resulting phage (host) heterogeneous network better integrates the two information views, reducing the computational complexity of the model and enhancing its information aggregation capability. The hidden features of phages and hosts are then learned with graph learning algorithms. Experimental results demonstrate that mutual information is effective for predicting PHI and that sparsifying the similarity networks significantly improves the model's predictive performance.
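The abstract does not state which sparsification rule DSPHI uses. As a minimal sketch of the general idea only, the example below assumes a top-k nearest-neighbor rule: each node keeps its k most similar neighbors, and an edge survives if either endpoint selects it. The function name `sparsify_top_k`, the parameter `k`, and the toy similarity values are all hypothetical and are not taken from the paper.

```python
def sparsify_top_k(sim, k):
    """Keep, for every node, only its k most similar neighbors.

    `sim` is a square, symmetric similarity matrix given as a list of lists.
    This top-k rule is an illustrative assumption, not the rule used by DSPHI.
    """
    n = len(sim)
    keep = set()
    for i in range(n):
        # Rank the other nodes by similarity to node i (self-loop excluded).
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )
        for j in neighbors[:k]:
            # Union rule: keep the edge in both directions so the
            # sparsified matrix stays symmetric.
            keep.add((i, j))
            keep.add((j, i))
    return [
        [sim[i][j] if (i, j) in keep else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Toy similarity matrix for four hypothetical phages.
toy_sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.3, 0.8],
    [0.1, 0.3, 1.0, 0.4],
    [0.2, 0.8, 0.4, 1.0],
]
toy_sparse = sparsify_top_k(toy_sim, k=1)
for row in toy_sparse:
    print(row)
```

With `k=1`, weak links such as the 0.1 similarity between the first and third phage are dropped, while each node's strongest link is retained. A threshold-based rule would be an equally plausible alternative under the same assumptions.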