Machine-learning techniques for the prediction of protein-protein interactions.

Affiliation

Division of Bioinformatics, Bose Institute, Kolkata, India.

Publication information

J Biosci. 2019 Sep;44(4).

DOI:

Abstract

Protein-protein interactions (PPIs) are important for the study of protein functions and pathways involved in different biological processes, as well as for understanding the cause and progression of diseases. Several high-throughput experimental techniques have been employed for the identification of PPIs in a few model organisms, but still, there is a huge gap in identifying all possible binary PPIs in an organism. Therefore, PPI prediction using machine-learning algorithms has been used in conjunction with experimental methods for discovery of novel protein interactions. The two most popular supervised machine-learning techniques used in the prediction of PPIs are support vector machines and random forest classifiers. Bayesian-probabilistic inference has also been used but mainly for the scoring of high-throughput PPI dataset confidence measures. Recently, deep-learning algorithms have been used for sequence-based prediction of PPIs. Several clustering methods such as hierarchical and k-means are useful as unsupervised machine-learning algorithms for the prediction of interacting protein pairs without explicit data labelling. In summary, machine-learning techniques have been widely used for the prediction of PPIs thus allowing experimental researchers to study cellular PPI networks.
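
The abstract mentions supervised classifiers applied to protein pairs and sequence-based prediction, but does not say how a pair is represented. As a rough illustration only, the sketch below encodes a pair as two concatenated amino-acid composition vectors, which is one simple way a sequence-derived input for such a classifier might be built. The featurization, the function names `composition` and `pair_features`, and the toy sequences are all assumptions made for this example, not something taken from the paper.

```python
from collections import Counter
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq: str) -> List[float]:
    """Fraction of each standard amino acid in seq; other characters are ignored."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully non-standard sequence: return an all-zero vector
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pair_features(seq_a: str, seq_b: str) -> List[float]:
    """Concatenate both composition vectors into one 40-dimensional pair vector."""
    return composition(seq_a) + composition(seq_b)


# Hypothetical toy sequences, not real proteins
features = pair_features("MKTAYIAKQR", "GAVLKKTW")
```

A vector like `features` could then be passed, together with an interacting or non-interacting label, to a classifier such as a support vector machine or random forest. Note that this encoding depends on the order of the two sequences, so a real pipeline would need to handle the pair symmetrically.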

