Aligning large language models and geometric deep models for protein representation.

Authors

Shu Dong, Duan Bingbing, Guo Kai, Zhou Kaixiong, Tang Jiliang, Du Mengnan

Affiliations

Northwestern University, Computer Science Department, Evanston, IL 60201, USA.

University of Pittsburgh, Biological Sciences Department, Pittsburgh, PA 15260, USA.

Publication information

Patterns (N Y). 2025 Apr 11;6(5):101227. doi: 10.1016/j.patter.2025.101227. eCollection 2025 May 9.

Abstract

In this study, we explore the alignment of multimodal representations between large language models (LLMs) and geometric deep models (GDMs) in the protein domain. We comprehensively evaluate three LLMs with four protein-specialized GDMs. Our work examines alignment factors from both model and protein perspectives, identifying challenges in current alignment methodologies and proposing strategies to improve the alignment process. Experimental results reveal that GDMs incorporating both graph and 3D structural information align better with LLMs, larger LLMs demonstrate improved alignment capabilities, and protein rarity significantly impacts alignment performance. We also find that increasing GDM embedding dimensions, using two-layer projection heads, and fine-tuning LLMs on protein-specific data substantially enhance alignment quality. Last, we demonstrate that improved alignment correlates with better downstream performance and reduced hallucination in protein-focused multimodal LLMs.
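
The abstract names two-layer projection heads as one of the factors that improve alignment but does not give an architecture or a metric. The following is a minimal sketch of the general idea, not the authors' implementation: each modality's embedding passes through its own two-layer head into a shared space, and alignment is scored as the mean cosine similarity over paired proteins. The dimensions, the ReLU nonlinearity, the random untrained weights, and all function names are assumptions made for illustration only.

```python
import math
import random


def make_linear(in_dim, out_dim, rng):
    """Create a randomly initialised linear layer (weights, bias). Hypothetical helper."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(x, layer):
    """Apply y = W x + b to a single vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def two_layer_head(x, layer1, layer2):
    """Two-layer projection head: linear -> ReLU -> linear (assumed form)."""
    return linear(relu(linear(x, layer1)), layer2)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def alignment_score(gdm_embs, llm_embs, gdm_head, llm_head):
    """Mean cosine similarity between projected embeddings of paired proteins."""
    sims = [cosine(two_layer_head(g, *gdm_head), two_layer_head(t, *llm_head))
            for g, t in zip(gdm_embs, llm_embs)]
    return sum(sims) / len(sims)


# Toy data: 5 proteins, an 8-dim "GDM" embedding and a 16-dim "LLM" embedding each.
# All sizes here are illustrative assumptions.
rng = random.Random(0)
GDM_DIM, LLM_DIM, HIDDEN_DIM, SHARED_DIM = 8, 16, 12, 4
gdm_embs = [[rng.gauss(0, 1) for _ in range(GDM_DIM)] for _ in range(5)]
llm_embs = [[rng.gauss(0, 1) for _ in range(LLM_DIM)] for _ in range(5)]

gdm_head = (make_linear(GDM_DIM, HIDDEN_DIM, rng),
            make_linear(HIDDEN_DIM, SHARED_DIM, rng))
llm_head = (make_linear(LLM_DIM, HIDDEN_DIM, rng),
            make_linear(HIDDEN_DIM, SHARED_DIM, rng))

score = alignment_score(gdm_embs, llm_embs, gdm_head, llm_head)
print(f"alignment score (untrained heads): {score:.3f}")
```

With untrained heads and random inputs the score carries no meaning. In practice the heads would be trained on paired structure and text data, and a score of this kind would be compared across model pairs.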

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc32/12142629/fc77e39f2146/gr1.jpg
