Large Language Model-Based Natural Language Encoding Could Be All You Need for Drug Biomedical Association Prediction.

Authors

Zhang Hanyu, Zhou Yuan, Zhang Zhichao, Sun Huaicheng, Pan Ziqi, Mou Minjie, Zhang Wei, Ye Qing, Hou Tingjun, Li Honglin, Hsieh Chang-Yu, Zhu Feng

Affiliations

Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China.

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, State Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Publication information

Anal Chem. 2024 Jul 16. doi: 10.1021/acs.analchem.4c01793.

Abstract

Analyzing drug-related interactions in the field of biomedicine has been a critical aspect of drug discovery and development. While various artificial intelligence (AI)-based tools have been proposed to analyze drug biomedical associations (DBAs), their feature encoding did not adequately account for crucial biomedical functions and semantic concepts, thereby still hindering their progress. Since the advent of ChatGPT by OpenAI in 2022, large language models (LLMs) have demonstrated rapid growth and significant success across various applications. Herein, LEDAP was introduced, which uniquely leveraged LLM-based biotext feature encoding for predicting drug-disease associations, drug-drug interactions, and drug-side effect associations. Benefiting from the large-scale knowledgebase pre-training, LLMs had great potential in drug development analysis owing to their holistic understanding of natural language and human topics. LEDAP illustrated its notable competitiveness in comparison with other popular DBA analysis tools. Specifically, even in simple conjunction with classical machine learning methods, LLM-based feature representations consistently enabled satisfactory performance across diverse DBA tasks like binary classification, multiclass classification, and regression. Our findings underpinned the considerable potential of LLMs in drug development research, indicating a catalyst for further progress in related fields.
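The abstract's central idea, encoding natural-language descriptions as feature vectors and passing them to a classical machine learning model, can be illustrated with a minimal sketch. This is not the LEDAP implementation: the abstract does not specify the encoder, the way pair features are built, or the classifier. Here `embed_text` is a hypothetical placeholder (a hashed bag-of-words) standing in for an LLM embedding call, concatenating the two embeddings is an assumption, the hand-written logistic regression is just one example of a classical method for the binary-classification case, and the toy pairs are invented for illustration.

```python
import hashlib
import math

DIM = 256  # size of the placeholder embedding (arbitrary choice)


def embed_text(text, dim=DIM):
    """Placeholder encoder: hashed bag-of-words, L2-normalised.

    Stands in for an LLM text-embedding call, which is not shown here.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def pair_features(drug_text, entity_text):
    """Concatenate both embeddings into one feature vector (an assumption)."""
    return embed_text(drug_text) + embed_text(entity_text)


def predict_proba(w, b, x):
    """Logistic model: probability that the pair is associated."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lr=0.5, epochs=300):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            grad = predict_proba(w, b, xi) - yi
            w = [wj - lr * grad * xj for wj, xj in zip(w, xi)]
            b -= lr * grad
    return w, b


# Invented toy pairs (not real data): (drug text, disease text, label).
TOY_PAIRS = [
    ("compound alpha lowers blood glucose",
     "chronic hyperglycemia disorder", 1),
    ("compound beta relaxes airway smooth muscle",
     "recurrent bronchial constriction", 1),
    ("compound gamma blocks gastric acid secretion",
     "progressive retinal degeneration", 0),
    ("compound delta inhibits bacterial cell wall synthesis",
     "inherited clotting deficiency", 0),
]

if __name__ == "__main__":
    X = [pair_features(d, e) for d, e, _ in TOY_PAIRS]
    y = [label for _, _, label in TOY_PAIRS]
    w, b = train_logreg(X, y)
    for (drug, entity, label), x in zip(TOY_PAIRS, X):
        print(f"{label}  {predict_proba(w, b, x):.3f}  {drug} | {entity}")
```

A linear model on concatenated embeddings scores each side of the pair independently, so it cannot represent interactions between the two texts. A practical pipeline would likely swap in a real text encoder and a nonlinear classical learner, and would evaluate on held-out pairs rather than the training set.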
