U Kaicheng, Zhang Sophia Meixuan, Pokharel Suresh, Pratyush Pawel, Qaderi Farah, Liu Dongfang, Zhao Junhan, Kc Dukka B, Chen Siwei
Tri-Institutional Computational Biology & Medicine, Weill Cornell Medicine, New York, NY, USA.
Department of Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
Methods Mol Biol. 2025;2941:243-267. doi: 10.1007/978-1-0716-4623-6_15.
Protein-protein interactions (PPIs) are involved in nearly all biological processes. Understanding and analyzing PPIs is key to revealing biological networks and identifying new therapeutic targets. Various computational approaches have been proposed as alternatives to the experimental investigation of PPIs. More recently, with the advent of large language models (LLMs), many LLM-based approaches have been developed that enable efficient analysis of interaction networks and binding sites directly from protein sequences. These models capture intricate biological patterns and offer scalability and adaptability across diverse datasets. However, challenges remain, including computational cost, data imbalance, and the integration of multimodal information. Advances in addressing these limitations are set to further enhance the potential of LLMs in PPI analysis, driving deeper insights and broader applications in biological research.