Super.Complex: A supervised machine learning pipeline for molecular complex detection in protein-interaction networks.

Authors

Palukuri Meghana V, Marcotte Edward M

Affiliations

Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, Texas, USA.

Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas, USA.

Publication information

bioRxiv. 2021 Oct 11:2021.06.22.449395. doi: 10.1101/2021.06.22.449395.

DOI:10.1101/2021.06.22.449395
PMID:34189530
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8240683/
Abstract

Characterization of protein complexes, i.e., sets of proteins assembling into a single larger physical entity, is important, as such assemblies play many essential roles in cells such as gene regulation. From networks of protein-protein interactions, potential protein complexes can be identified computationally through the application of community detection methods, which flag groups of entities interacting with each other in certain patterns. Most community detection algorithms tend to be unsupervised and assume that communities are dense network subgraphs, which is not always true, as protein complexes can exhibit diverse network topologies. The few existing supervised machine learning methods are serial and can potentially be improved in terms of accuracy and scalability by using better-suited machine learning models and parallel algorithms. Here, we present Super.Complex, a distributed, supervised AutoML-based pipeline for overlapping community detection in weighted networks. We also propose three new evaluation measures for the outstanding issue of comparing sets of learned and known communities satisfactorily. Super.Complex learns a community fitness function from known communities using an AutoML method and applies this fitness function to detect new communities. A heuristic local search algorithm finds maximally scoring communities, and a parallel implementation can be run on a computer cluster for scaling to large networks. On a yeast protein-interaction network, Super.Complex outperforms 6 other supervised and 4 unsupervised methods. Application of Super.Complex to a human protein-interaction network with ~8k nodes and ~60k edges yields 1,028 protein complexes, with 234 complexes linked to SARS-CoV-2, the COVID-19 virus, with 111 uncharacterized proteins present in 103 learned complexes. Super.Complex is generalizable with the ability to improve results by incorporating domain-specific features. Learned community characteristics can also be transferred from existing applications to detect communities in a new application with no known communities. Code and interactive visualizations of learned human protein complexes are freely available at: https://sites.google.com/view/supercomplex/super-complex-v3-0.
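
The search step described in the abstract (score a candidate community with a fitness function, then grow it by local search) can be illustrated with a small sketch. This is not the Super.Complex implementation. The toy network, its node names and weights, and every function name below are hypothetical. Most importantly, the fitness function here is a hand-written stand-in (the share of a community's edge weight that stays inside it), whereas the paper learns its fitness function from known communities with an AutoML method. The search shown is a plain greedy variant that stops as soon as no neighbor improves the score.

```python
# Toy weighted interaction network; node names and weights are hypothetical.
EDGES = {
    ("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.85,
    ("C", "D"): 0.2,
    ("D", "E"): 0.9, ("D", "F"): 0.8, ("E", "F"): 0.95,
}


def build_adjacency(edges):
    """Turn {(u, v): weight} into a symmetric adjacency dict."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def fitness(nodes, adj):
    """Stand-in fitness (an assumption, not the learned model):
    internal edge weight divided by internal plus boundary edge weight."""
    internal = 0.0
    boundary = 0.0
    for u in nodes:
        for v, w in adj[u].items():
            if v in nodes:
                internal += w  # each internal edge is seen from both ends
            else:
                boundary += w
    internal /= 2
    total = internal + boundary
    return internal / total if total else 0.0


def grow_community(seed, adj, score=fitness, max_size=10):
    """Greedy local search: starting from a seed edge, repeatedly add the
    neighboring node that raises the score most; stop when none does."""
    community = set(seed)
    best = score(community, adj)
    while len(community) < max_size:
        frontier = {n for c in community for n in adj[c]} - community
        if not frontier:
            break
        # sorted() makes tie-breaking deterministic
        candidate = max(sorted(frontier),
                        key=lambda n: score(community | {n}, adj))
        new_score = score(community | {candidate}, adj)
        if new_score <= best:
            break
        community.add(candidate)
        best = new_score
    return community


def detect_communities(edges):
    """Run one independent search per seed edge and keep distinct results.
    Communities found from different seeds may overlap."""
    adj = build_adjacency(edges)
    return {frozenset(grow_community(seed, adj)) for seed in edges}
```

Calling `detect_communities(EDGES)` runs one search per edge of the toy network. Because each seed is processed independently of the others, the loop over seeds is the natural place to distribute work across processes or cluster nodes, which is one plausible reading of how a parallel implementation of this kind of search scales to large networks.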

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5322/8506578/f2a0a2873567/nihpp-2021.06.22.449395v2-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5322/8506578/e0cffbb07249/nihpp-2021.06.22.449395v2-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5322/8506578/2eac1bacda08/nihpp-2021.06.22.449395v2-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5322/8506578/cbf3a09e247c/nihpp-2021.06.22.449395v2-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5322/8506578/fb4e211fb73f/nihpp-2021.06.22.449395v2-f0005.jpg

Similar articles

1
Super.Complex: A supervised machine learning pipeline for molecular complex detection in protein-interaction networks.
bioRxiv. 2021 Oct 11:2021.06.22.449395. doi: 10.1101/2021.06.22.449395.
2
Super.Complex: A supervised machine learning pipeline for molecular complex detection in protein-interaction networks.
PLoS One. 2021 Dec 31;16(12):e0262056. doi: 10.1371/journal.pone.0262056. eCollection 2021.
3
Molecular complex detection in protein interaction networks through reinforcement learning.
BMC Bioinformatics. 2023 Aug 2;24(1):306. doi: 10.1186/s12859-023-05425-7.
4
Protein complex detection with semi-supervised learning in protein interaction networks.
Proteome Sci. 2011 Oct 14;9 Suppl 1(Suppl 1):S5. doi: 10.1186/1477-5956-9-S1-S5.
5
Predicting protein complexes using a supervised learning method combined with local structural information.
PLoS One. 2018 Mar 19;13(3):e0194124. doi: 10.1371/journal.pone.0194124. eCollection 2018.
6
Protein Complexes Detection Based on Semi-Supervised Network Embedding Model.
IEEE/ACM Trans Comput Biol Bioinform. 2021 Mar-Apr;18(2):797-803. doi: 10.1109/TCBB.2019.2944809. Epub 2021 Apr 8.
7
Protein complex identification by supervised graph local clustering.
Bioinformatics. 2008 Jul 1;24(13):i250-8. doi: 10.1093/bioinformatics/btn164.
8
Identifying protein complexes based on node embeddings obtained from protein-protein interaction networks.
BMC Bioinformatics. 2018 Sep 21;19(1):332. doi: 10.1186/s12859-018-2364-2.
9
PC2P: parameter-free network-based prediction of protein complexes.
Bioinformatics. 2021 Apr 9;37(1):73-81. doi: 10.1093/bioinformatics/btaa1089.
10
An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks.
Front Genet. 2022 Feb 24;13:839949. doi: 10.3389/fgene.2022.839949. eCollection 2022.