Fischer Samantha N, Claussen Erin R, Kourtis Savvas, Sdelci Sara, Orchard Sandra, Hermjakob Henning, Kustatscher Georg, Drew Kevin
Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL 60607.
Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.
bioRxiv. 2024 Oct 15:2024.10.11.617930. doi: 10.1101/2024.10.11.617930.
Macromolecular protein complexes carry out most functions in the cell, including essential functions required for cell survival. However, we do not yet know the subunit composition of all human protein complexes. To address this gap, we integrated >25,000 mass spectrometry experiments using a machine learning approach to identify >15,000 human protein complexes. We show that our map of protein complexes is highly accurate and more comprehensive than previous maps, placing ~75% of human proteins into their physical contexts. We globally characterize our complexes using protein co-variation data (ProteomeHD.2) and identify co-varying complexes, suggesting common functional associations. Our map also generates testable functional hypotheses for 472 uncharacterized proteins, which we support using AlphaFold modeling. Additionally, we use AlphaFold modeling to identify 511 mutually exclusive protein pairs in hu.MAP3.0 complexes, suggesting that complexes serve different functional roles depending on their subunit composition. We identify expression as the primary way cells and organisms relieve the conflict of mutually exclusive subunits. Finally, we import our complexes into EMBL-EBI's Complex Portal (https://www.ebi.ac.uk/complexportal/home) and also provide them through our hu.MAP3.0 web interface (https://humap3.proteincomplexes.org/). We expect our resource to be highly impactful to the broader research community.