

Decision tree classifier based on topological characteristics of subgraph for the mining of protein complexes from large scale PPI networks.

Affiliation

Bioinformatics Lab, Department of Computer Science, IIIT, Bhubaneswar, India.

Publication information

Comput Biol Chem. 2023 Oct;106:107935. doi: 10.1016/j.compbiolchem.2023.107935. Epub 2023 Jul 25.

Abstract

The growing availability of large-scale protein interaction data calls for extensive research into cell organization and function at the network level. Bioinformatics and data mining researchers have studied network clustering extensively to examine the structural and functional features of protein-protein interaction (PPI) networks. Over the past two decades, clustering PPI networks has proven useful in numerous studies for identifying functional modules, understanding the roles of previously unknown proteins, and other purposes. Protein complexes are among the essential cellular components that carry out biological activities. Experimental approaches have made inferring protein complexes more accessible. We offer a novel method that integrates a classification model with local topological data, making complex detection more reliable and efficient. This article describes a decision tree classifier based on topological characteristics of subgraphs for mining protein complexes. The proposed graph-based algorithm is an effective and efficient way to identify protein complexes from large-scale PPI networks. Its performance is evaluated on yeast and human protein-protein interaction networks from the Database of Interacting Proteins (DIP) and the Biological General Repository for Interaction Datasets (BioGRID), using widely accepted benchmark protein complexes from the comprehensive resource of mammalian protein complexes (CORUM) and the comprehensive catalogue of yeast protein complexes (CYC2008). The results demonstrate that our method can outperform the best-performing supervised, semi-supervised, and unsupervised approaches to detecting protein complexes.
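The abstract does not list which topological characteristics the classifier uses, so the following is only a minimal sketch of the general idea: turning a candidate subgraph of a PPI network into a numeric feature vector that a decision tree could consume. The feature set (size, edge count, density, mean degree, mean clustering coefficient), the function name `subgraph_features`, and the toy network are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def subgraph_features(edges, nodes):
    """Return simple topological features of the subgraph induced by `nodes`.

    Hypothetical feature set chosen for illustration; the paper's actual
    features are not given in the abstract.
    """
    nodes = set(nodes)
    # Build adjacency restricted to the candidate subgraph (undirected, no self-loops).
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes and u != v:
            adj[u].add(v)
            adj[v].add(u)

    n = len(nodes)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge is counted twice
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    mean_degree = sum(len(adj[x]) for x in nodes) / n if n else 0.0

    # Local clustering coefficient: fraction of a node's neighbour pairs that are linked.
    coeffs = []
    for x in nodes:
        k = len(adj[x])
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(adj[x], 2) if b in adj[a])
        coeffs.append(2 * links / (k * (k - 1)))
    mean_clustering = sum(coeffs) / n if n else 0.0

    return {
        "size": n,
        "edges": m,
        "density": density,
        "mean_degree": mean_degree,
        "mean_clustering": mean_clustering,
    }


# Toy network (made-up data): a triangle A-B-C with a pendant protein D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
triangle = subgraph_features(toy_edges, ["A", "B", "C"])
whole = subgraph_features(toy_edges, ["A", "B", "C", "D"])
```

In a supervised setting, feature vectors like these would typically be computed for known complexes (for example, from CORUM or CYC2008) and for non-complex subgraphs, and a decision tree would then be trained to separate the two classes; how the authors construct training examples is not stated in the abstract.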

