Song Zihao, Li Qingnuo, Zhao Jincheng, Bu Qinggang, Bian Zekang, Qu Jia
The School of Computer Science and Artificial Intelligence & Aliyun School of Big Data, Changzhou University, Changzhou, China.
The School of AI & Computer Science, Jiangnan University, Wuxi, China.
PeerJ. 2025 Aug 5;13:e19637. doi: 10.7717/peerj.19637. eCollection 2025.
Antibiotics play a critical role in treating microbial infections. However, their widespread use has contributed to the growing problem of microbial resistance. Addressing this challenge requires the identification of new microbe-drug associations to support the development of novel therapeutic strategies. Since traditional wet-lab experiments are time-consuming and costly, computational models offer an efficient alternative for discovering potential applications of existing drugs against previously untested microbes. These models can facilitate the identification of novel microbe-drug associations and help counteract resistance mechanisms.
This study proposes a novel computational model: convolutional neural network with Bernoulli random forest for Microbe-Drug Association prediction (CNNBRFMDA). The model constructs feature vectors for all microbe-drug pairs based on known associations, microbe similarity, and drug similarity. A subset of these vectors is randomly selected to form the training set. A convolutional neural network (CNN) is then used to reduce the dimensionality of all feature vectors, including those in the training set. The reduced training set is subsequently used to train a Bernoulli random forest (BRF) to predict potential microbe-drug associations. The innovation of CNNBRFMDA lies in its integration of CNN for nonlinear feature extraction and BRF for robust prediction. This approach enhances computational efficiency and improves the model's ability to capture complex patterns, thereby increasing the precision and interpretability of drug response predictions. The dual use of the Bernoulli distribution in BRF ensures algorithmic consistency and contributes to superior performance.
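The feature-construction and sampling steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation. The abstract does not say how the association and similarity information is combined, so the sketch assumes that the feature vector of a pair is the drug's similarity row concatenated with the microbe's similarity row. The function names `build_pair_features` and `sample_training_indices`, the toy matrices, and the training-set size are all hypothetical.

```python
import random


def build_pair_features(assoc, drug_sim, microbe_sim):
    """Return (pairs, features, labels) for every drug-microbe pair.

    Assumed layout (not stated in the abstract): the feature vector of
    pair (i, j) is drug i's similarity row followed by microbe j's
    similarity row. The label is the known association assoc[i][j].
    """
    pairs, features, labels = [], [], []
    for i, drug_row in enumerate(assoc):
        for j, known in enumerate(drug_row):
            pairs.append((i, j))
            features.append(list(drug_sim[i]) + list(microbe_sim[j]))
            labels.append(int(known))
    return pairs, features, labels


def sample_training_indices(n_pairs, n_train, seed=0):
    """Randomly pick n_train distinct pair indices to form the training set."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_pairs), n_train))


# Toy data: 2 drugs x 3 microbes (values are made up for illustration).
assoc = [[1, 0, 0],
         [0, 1, 0]]
drug_sim = [[1.0, 0.2],
            [0.2, 1.0]]
microbe_sim = [[1.0, 0.5, 0.1],
               [0.5, 1.0, 0.3],
               [0.1, 0.3, 1.0]]

pairs, features, labels = build_pair_features(assoc, drug_sim, microbe_sim)
train_idx = sample_training_indices(len(pairs), n_train=4)
```

In the pipeline the abstract describes, the CNN would then compress every vector in `features`, and only the compressed vectors indexed by `train_idx` would be used to fit the BRF.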
The model was evaluated using five-fold cross-validation on the Microbe-Drug Association Database (MDAD) and aBiofilm datasets. CNNBRFMDA achieved mean AUC scores of 0.9017 ± 0.0032 on MDAD and 0.9146 ± 0.0041 on aBiofilm. Two case studies further supported the model's reliability: a literature review confirmed 41 of the top 50 microbes predicted to be associated with ciprofloxacin and 38 of the top 50 predicted for moxifloxacin.
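To make the evaluation protocol concrete, the sketch below shows one way a mean AUC with a spread could be computed under five-fold cross-validation. It rests on several assumptions: the folds are stratified by label, the spread is the sample standard deviation across folds (the abstract does not say what the ± values denote), and `fit_predict` is a hypothetical callback standing in for the CNN and BRF stages. The toy data and the stand-in scorer are illustrative only.

```python
import random
import statistics


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_auc(features, labels, fit_predict, k=5, seed=0):
    """Return (mean, sample std) of the per-fold AUC.

    Folds are stratified (an assumption of this sketch) so that each fold
    contains both classes; this needs at least k samples per class.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    folds = [pos[f::k] + neg[f::k] for f in range(k)]

    fold_aucs = []
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        scores = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        fold_aucs.append(auc([labels[i] for i in test], scores))
    return statistics.mean(fold_aucs), statistics.stdev(fold_aucs)


def first_feature_scorer(train_x, train_y, test_x):
    """Hypothetical stand-in model: ignores training data, scores by feature 0."""
    return [x[0] for x in test_x]


# Toy, perfectly separable data: the label is 1 when the feature is >= 0.5.
toy_x = [[i / 20] for i in range(20)]
toy_y = [1 if x[0] >= 0.5 else 0 for x in toy_x]
mean_auc, std_auc = k_fold_auc(toy_x, toy_y, first_feature_scorer)
```

Replacing `first_feature_scorer` with a function that trains a model on the training fold and scores the held-out pairs would give a result in the same "mean ± spread" form as the figures reported above.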