Volzhenin Konstantin, Bittner Lucie, Carbone Alessandra
Sorbonne Université, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France.
Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum national d'Histoire naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France.
iScience. 2024 Jun 25;27(7):110371. doi: 10.1016/j.isci.2024.110371. eCollection 2024 Jul 19.
Computational reconstructions of protein-protein interaction (PPI) networks will provide invaluable insights into cellular systems, enabling the discovery of novel molecular interactions and elucidating biological mechanisms within and between organisms. Leveraging the latest-generation protein language models and recurrent neural networks, we present SENSE-PPI, a sequence-based deep learning model that efficiently reconstructs PPIs, distinguishing partners among tens of thousands of proteins and identifying specific interactions within functionally similar proteins. SENSE-PPI demonstrates high accuracy, limited training requirements, and versatility in cross-species predictions, even with non-model organisms and human-virus interactions. Its performance decreases for phylogenetically more distant model and non-model organisms, but the signal degrades very slowly. In this regard, it demonstrates the important role of parameters in protein language models. SENSE-PPI is very fast and can test 10,000 proteins against themselves in a matter of hours, enabling the reconstruction of genome-wide proteomes.
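To give a sense of scale for the all-against-all test of 10,000 proteins mentioned above, the short sketch below counts the unordered protein pairs involved and the average scoring rate a given time budget would imply. It is an illustration only: the function names and the 5-hour budget are assumptions made for this example, not figures or code from SENSE-PPI.

```python
from itertools import combinations


def count_unordered_pairs(n: int, include_self: bool = False) -> int:
    """Number of unordered pairs among n proteins (optionally with self-pairs)."""
    pairs = n * (n - 1) // 2
    return pairs + n if include_self else pairs


def required_throughput(n_pairs: int, hours: float) -> float:
    """Average pairs per second needed to score n_pairs within the given hours."""
    return n_pairs / (hours * 3600.0)


n_proteins = 10_000
n_pairs = count_unordered_pairs(n_proteins)

# Hypothetical wall-clock budget; the abstract only says "a matter of hours".
assumed_hours = 5.0
rate = required_throughput(n_pairs, assumed_hours)

print(f"{n_pairs:,} candidate pairs among {n_proteins:,} proteins")
print(f"about {rate:,.1f} pairs/s needed to finish in {assumed_hours} h (assumed budget)")

# Sanity check of the closed-form count against explicit enumeration on a tiny set.
assert count_unordered_pairs(4) == len(list(combinations(range(4), 2)))
```

The quadratic growth in the number of pairs is why per-pair inference speed matters for proteome-scale screens: doubling the number of proteins roughly quadruples the number of candidate pairs.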