Taiyi: a bilingual fine-tuned large language model for diverse biomedical tasks.

Affiliations

School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China.

Publication information

J Am Med Inform Assoc. 2024 Sep 1;31(9):1865-1874. doi: 10.1093/jamia/ocae037.

Abstract

OBJECTIVE

Most existing fine-tuned biomedical large language models (LLMs) focus on enhancing performance in monolingual biomedical question answering and conversation tasks. To investigate the effectiveness of fine-tuned LLMs on diverse biomedical natural language processing (NLP) tasks across different languages, we present Taiyi, a bilingual LLM fine-tuned for diverse biomedical NLP tasks.

MATERIALS AND METHODS

We first curated a comprehensive collection of 140 existing biomedical text-mining datasets (102 English and 38 Chinese) spanning more than 10 task types. These corpora were then converted into instruction data for fine-tuning a general-purpose LLM. For the supervised fine-tuning phase, we propose a two-stage strategy to optimize model performance across the various tasks.
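The conversion of a labeled corpus into instruction data can be pictured with a minimal sketch. This example assumes a simple instruction/input/output record schema for a named entity recognition sample; the prompt wording, field names, and the `to_instruction` helper are illustrative assumptions, not Taiyi's actual templates.

```python
def to_instruction(text, entities):
    """Convert one labeled NER example into an instruction-tuning record.

    text: the source sentence.
    entities: list of (mention, entity_type) tuples annotated in the text.
    """
    # Serialize gold annotations into the target string the model must generate.
    answer = "; ".join(f"{mention} ({etype})" for mention, etype in entities)
    return {
        "instruction": "Extract all biomedical entities from the text "
                       "and label each with its entity type.",
        "input": text,
        "output": answer if answer else "No entities found.",
    }

record = to_instruction(
    "Aspirin reduces the risk of myocardial infarction.",
    [("Aspirin", "Chemical"), ("myocardial infarction", "Disease")],
)
print(record["output"])  # Aspirin (Chemical); myocardial infarction (Disease)
```

Applying such a template to every example in each dataset yields a unified instruction corpus, so heterogeneous task types (NER, relation extraction, classification, QA) can all be fine-tuned through the same generative interface.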

RESULTS

Experimental results on 13 test sets covering named entity recognition, relation extraction, text classification, and question answering show that Taiyi outperforms general LLMs. A case study on additional biomedical NLP tasks further demonstrates Taiyi's considerable potential for bilingual biomedical multitasking.

CONCLUSION

Leveraging rich, high-quality biomedical corpora and developing effective fine-tuning strategies can significantly improve LLM performance in the biomedical domain. Taiyi demonstrates bilingual multitasking capability through supervised fine-tuning. However, tasks that are not inherently generative, such as information extraction, remain challenging for LLM-based generative approaches, which still underperform conventional discriminative approaches built on smaller language models.
