CPIELA: Computational Prediction of Plant Protein-Protein Interactions by Ensemble Learning Approach From Protein Sequences and Evolutionary Information.

Author information

Li Li-Ping, Zhang Bo, Cheng Li

Affiliations

College of Grassland and Environment Sciences, Xinjiang Agricultural University, Urumqi, China.

Xinjiang Key Laboratory of Grassland Resources and Ecology, Urumqi, China.

Publication information

Front Genet. 2022 Mar 11;13:857839. doi: 10.3389/fgene.2022.857839. eCollection 2022.

Abstract

Identifying and characterizing plant protein-protein interactions (PPIs) is critical for elucidating protein functions and molecular mechanisms in plant cells. Although experimentally validated plant PPI data have become increasingly available for diverse plant species, high-throughput experimental techniques are usually expensive and labor-intensive. As valuable plant PPI data accumulate in public databases, computational approaches that facilitate the identification of possible PPIs are becoming increasingly important. In this article, we propose a framework, CPIELA, for predicting plant PPIs that combines the position-specific scoring matrix (PSSM), the local optimal-oriented pattern (LOOP) descriptor, and an ensemble rotation forest (ROF) model. Specifically, each plant protein sequence is first transformed into a PSSM, which preserves the protein's evolutionary information. Then, the local texture descriptor LOOP is used to extract texture-variation features from the PSSM. Finally, the ROF classifier is used to infer potential plant PPIs. The performance of CPIELA is evaluated via cross-validation on three plant PPI datasets: , , and . The experimental results show that CPIELA achieved high average prediction accuracies of 98.63%, 98.09%, and 94.02%, respectively. To further verify its performance, we also compared CPIELA with other state-of-the-art methods on three gold standard datasets. The results indicate that CPIELA is efficient and reliable for predicting plant PPIs. We anticipate that CPIELA could become a useful tool for facilitating the identification of possible plant PPIs.
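
The feature-extraction step described above (a texture descriptor applied to a PSSM treated as an image) can be illustrated with a minimal sketch. The abstract does not give the details of LOOP, so the code below assumes the commonly described formulation: each of the 8 neighbours of a cell is thresholded against the centre, and the bit weight of each neighbour is set by the rank of its Kirsch mask response. The function names `loop_code` and `loop_histogram`, the neighbour ordering, and the tie-breaking rule are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch of a LOOP-style descriptor (assumed formulation, not the paper's code).

# Offsets of the 8 neighbours in circular order, starting east, going counter-clockwise.
NEIGHBOUR_OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def loop_code(matrix, r, c):
    """Return the 8-bit LOOP-style code for the interior cell (r, c)."""
    centre = matrix[r][c]
    neighbours = [matrix[r + dr][c + dc] for dr, dc in NEIGHBOUR_OFFSETS]
    total = sum(neighbours)
    # Kirsch response for direction k: weight 5 on neighbours k-1, k, k+1
    # and -3 on the other five, which simplifies to 8 * (three-sum) - 3 * total.
    responses = [
        8 * (neighbours[k - 1] + neighbours[k] + neighbours[(k + 1) % 8]) - 3 * total
        for k in range(8)
    ]
    # Rank the responses: weakest gets exponent 0, strongest gets exponent 7.
    # Ties are broken by neighbour index (an assumption; sorted() is stable).
    order = sorted(range(8), key=lambda k: responses[k])
    exponent = [0] * 8
    for rank, k in enumerate(order):
        exponent[k] = rank
    # Set the bit for every neighbour that is at least as large as the centre.
    return sum(1 << exponent[k] for k in range(8) if neighbours[k] >= centre)


def loop_histogram(matrix):
    """Return a normalised 256-bin histogram of codes over all interior cells."""
    rows, cols = len(matrix), len(matrix[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[loop_code(matrix, r, c)] += 1
    n = sum(hist)
    return [h / n for h in hist] if n else hist
```

In a pipeline like the one described, `matrix` would be the L x 20 PSSM of one protein (L being the sequence length), and the resulting fixed-length histogram would serve as that protein's feature vector for a downstream classifier. How the two proteins of a pair are combined, and how the rotation forest is configured, are not specified in the abstract and are left out of this sketch.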

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d562/8963800/820b625e2516/fgene-13-857839-g001.jpg
