Murmu Sneha, Chaurasia Himanshushekhar, Rao A R, Rai Anil, Jaiswal Sarika, Bharadwaj Anshu, Yadav Rajbir, Archak Sunil
ICAR-Indian Agricultural Statistics Research Institute, New Delhi 110012, India; ICAR-Indian Agricultural Research Institute, New Delhi 110012, India.
ICAR-Indian Agricultural Statistics Research Institute, New Delhi 110012, India; ICAR-Indian Agricultural Research Institute, New Delhi 110012, India; ICAR-Central Institute for Research on Cotton Technology, Mumbai 400019, India.
J Mol Biol. 2025 Aug 1;437(15):169093. doi: 10.1016/j.jmb.2025.169093. Epub 2025 Mar 17.
This study aimed to develop a machine learning-based tool for predicting protein-protein interactions (PPIs) in plant-pathogen systems. Identifying these PPIs is crucial for understanding the molecular mechanisms underlying plant defense and pathogen virulence, but experimental methods are time-consuming and labor-intensive, which motivates computational techniques that complement them. An ensemble model was developed using multiple sequence encodings (auto-covariance, conjoint triad, and local descriptor schemes, selected based on their performance) and diverse learning algorithms, including random forest, support vector machine, and artificial neural network. The three top-performing models were combined into an ensemble, which improved prediction accuracy to approximately 97%. The resulting tool, PlantPathoPPI, was compared with existing tools on an independent test dataset and showed promising potential for PPI prediction in plant-pathogen interactions. To facilitate broad accessibility, a web-based prediction server is available at https://plantpathoppi.onrender.com/, alongside a Python package at https://pypi.org/project/plantpathoppi-ml/. The tool offers an efficient means of predicting PPIs in plant-pathogen systems, which can provide insights into plant diseases and support hypothesis-driven research.
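As an illustration of one of the sequence encodings named in the abstract, the following is a minimal sketch of a conjoint triad encoder. It assumes the widely used seven-class residue grouping and min-max style normalization from the original conjoint triad scheme; the abstract does not state which variant PlantPathoPPI uses, so this is not the tool's own implementation. The function names `conjoint_triad` and `encode_pair`, and the choice to concatenate the plant and pathogen vectors, are hypothetical.

```python
from itertools import product

# Assumed seven-class residue grouping of the classic conjoint triad scheme.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}

# Every ordered triple of classes gets a fixed position: 7 * 7 * 7 = 343.
TRIAD_INDEX = {t: i for i, t in enumerate(product(range(7), repeat=3))}


def conjoint_triad(seq):
    """Return a 343-dimensional normalized triad frequency vector."""
    counts = [0] * len(TRIAD_INDEX)
    classes = [RESIDUE_TO_CLASS.get(aa) for aa in seq.upper()]
    # Slide a window of three consecutive residues along the sequence.
    for i in range(len(classes) - 2):
        window = tuple(classes[i:i + 3])
        if None in window:
            continue  # skip windows containing non-standard residues
        counts[TRIAD_INDEX[window]] += 1
    lo, hi = min(counts), max(counts)
    if hi == 0:
        return [0.0] * len(counts)  # too short, or no valid window
    # Normalize as (f - min) / max to reduce the effect of sequence length.
    return [(c - lo) / hi for c in counts]


def encode_pair(plant_seq, pathogen_seq):
    """Hypothetical pair encoding: concatenate the two protein vectors."""
    return conjoint_triad(plant_seq) + conjoint_triad(pathogen_seq)
```

A vector produced by `encode_pair` has 686 entries and could be passed to any standard classifier, such as the random forest, support vector machine, or neural network models mentioned in the abstract.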