PICNIC accurately predicts condensate-forming proteins regardless of their structural disorder across organisms.

Author information

Hadarovich Anna, Singh Hari Raj, Ghosh Soumyadeep, Scheremetjew Maxim, Rostam Nadia, Hyman Anthony A, Toth-Petroczy Agnes

Affiliations

Max Planck Institute of Molecular Cell Biology and Genetics, 01307, Dresden, Germany.

Center for Systems Biology Dresden, 01307, Dresden, Germany.

Publication information

Nat Commun. 2024 Dec 11;15(1):10668. doi: 10.1038/s41467-024-55089-x.

Abstract

Biomolecular condensates are membraneless organelles that can concentrate hundreds of different proteins in cells to operate essential biological functions. However, accurate identification of their components remains challenging and biased towards proteins with high structural disorder content with focus on self-phase separating (driver) proteins. Here, we present a machine learning algorithm, PICNIC (Proteins Involved in CoNdensates In Cells) to classify proteins that localize to biomolecular condensates regardless of their role in condensate formation. PICNIC successfully predicts condensate members by learning amino acid patterns in the protein sequence and structure in addition to the intrinsic disorder. Extensive experimental validation of 24 positive predictions in cellulo shows an overall ~82% accuracy regardless of the structural disorder content of the tested proteins. While increasing disorder content is associated with organismal complexity, our analysis of 26 species reveals no correlation between predicted condensate proteome content and disorder content across organisms. Overall, we present a machine learning classifier to interrogate condensate components at whole-proteome levels across the tree of life.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1704/11634905/499e8d492040/41467_2024_55089_Fig1_HTML.jpg
