Li Li-Ping, Zhang Bo, Cheng Li
College of Grassland and Environment Sciences, Xinjiang Agricultural University, Urumqi, China.
Xinjiang Key Laboratory of Grassland Resources and Ecology, Urumqi, China.
Front Genet. 2022 Mar 11;13:857839. doi: 10.3389/fgene.2022.857839. eCollection 2022.
Identifying and characterizing plant protein-protein interactions (PPIs) is critical for elucidating protein functions and molecular mechanisms in the plant cell. Although experimentally validated plant PPI data are increasingly available for diverse plant species, the high-throughput techniques that produce them are usually expensive and labor-intensive. As valuable plant PPI data accumulate in public databases, computational approaches that facilitate the identification of possible PPIs are becoming increasingly important. In this article, we propose an effective framework, CPIELA, for predicting plant PPIs by combining the position-specific scoring matrix (PSSM), the local optimal-oriented pattern (LOOP), and an ensemble rotation forest (ROF) model. Specifically, each plant protein sequence is first transformed into a PSSM, which preserves the protein's evolutionary information. Then, the local texture descriptor LOOP is used to extract texture-variation features from the PSSM. Finally, the ROF classifier is used to infer potential plant PPIs. The performance of CPIELA is evaluated via cross-validation on three plant PPI datasets: , , and . The experimental results show that CPIELA achieved high average prediction accuracies of 98.63%, 98.09%, and 94.02%, respectively. To further verify its performance, we also compared CPIELA with other state-of-the-art methods on three gold-standard datasets. The results indicate that CPIELA is efficient and reliable for predicting plant PPIs. We anticipate that CPIELA could become a useful tool for facilitating the identification of possible plant PPIs.
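The feature-extraction step (LOOP applied to a PSSM) can be illustrated with a minimal sketch. It assumes the commonly described LOOP formulation, in which the ranks of the eight Kirsch compass-mask responses around a cell serve as the binary weights of the thresholded neighbours; the abstract does not confirm that CPIELA uses exactly this variant. The function names, the neighbour ordering, the tie-breaking rule, and the random stand-in matrix are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np

# Offsets of the 8 neighbours of a centre cell, listed counter-clockwise
# starting from the east neighbour (an ordering assumed for this sketch).
NEIGHBOUR_OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
                     (0, -1), (1, -1), (1, 0), (1, 1)]


def kirsch_responses(patch):
    """Return the 8 ring values and the 8 Kirsch responses of a 3x3 patch."""
    ring = np.array([patch[1 + dr, 1 + dc] for dr, dc in NEIGHBOUR_OFFSETS],
                    dtype=float)
    responses = np.empty(8)
    for n in range(8):
        # Mask n weights neighbour n and its two adjacent ring cells by 5,
        # and the remaining five ring cells by -3.
        heavy = ring[[(n - 1) % 8, n, (n + 1) % 8]].sum()
        responses[n] = 5 * heavy - 3 * (ring.sum() - heavy)
    return ring, responses


def loop_code(patch):
    """Compute the LOOP code (0..255) of the centre cell of a 3x3 patch."""
    ring, responses = kirsch_responses(patch)
    # Rank of each response among the 8 (0 = weakest, 7 = strongest).
    # Ties are broken by neighbour index (an assumption of this sketch).
    order = np.argsort(responses, kind="stable")
    ranks = np.argsort(order, kind="stable")
    # Threshold each neighbour against the centre value.
    bits = (ring >= patch[1, 1]).astype(np.int64)
    return int(np.sum(bits * (2 ** ranks)))


def loop_histogram(matrix):
    """Return a normalised 256-bin histogram of LOOP codes over a matrix."""
    matrix = np.asarray(matrix, dtype=float)
    rows, cols = matrix.shape
    hist = np.zeros(256)
    # Border cells are skipped because they lack a full 3x3 neighbourhood.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[loop_code(matrix[r - 1:r + 2, c - 1:c + 2])] += 1
    total = hist.sum()
    return hist / total if total > 0 else hist


# Toy stand-in for a real L x 20 PSSM (random integers, not real data).
rng = np.random.default_rng(0)
toy_pssm = rng.integers(-5, 6, size=(30, 20))
features = loop_histogram(toy_pssm)
```

In a full pipeline, a real PSSM (for example, one produced by PSI-BLAST) would replace `toy_pssm`, and the resulting fixed-length vectors for the two proteins of a candidate pair would be passed to a classifier. How CPIELA combines the two vectors is not stated in the abstract.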