Retrieval augmentation of large language models for lay language generation.

Affiliations

Biomedical and Health Informatics, University of Washington, United States of America.

Paul G. Allen School of Computer Science, University of Washington, United States of America.

Publication information

J Biomed Inform. 2024 Jan;149:104580. doi: 10.1016/j.jbi.2023.104580. Epub 2023 Dec 30.

Abstract

The complex linguistic structures and specialized terminology of expert-authored content limit the accessibility of biomedical literature to the general public. Automated methods have the potential to render this literature more interpretable to readers with different educational backgrounds. Prior work has framed such lay language generation as a summarization or simplification task. However, adapting biomedical text for the lay public includes the additional and distinct task of background explanation: adding external content in the form of definitions, motivation, or examples to enhance comprehensibility. This task is especially challenging because the source document may not include the required background knowledge. Furthermore, background explanation capabilities have yet to be formally evaluated, and little is known about how best to enhance them. To address this problem, we introduce Retrieval-Augmented Lay Language (RALL) generation, which intuitively fits the need for external knowledge beyond that in expert-authored source documents. In addition, we introduce CELLS, the largest (63k pairs) and broadest-ranging (12 journals) parallel corpus for lay language generation. To evaluate RALL, we augmented state-of-the-art text generation models with information retrieval of either term definitions from the UMLS and Wikipedia, or embeddings of explanations from Wikipedia documents. Of these, embedding-based RALL models improved summary quality and simplicity while maintaining factual correctness, suggesting that Wikipedia is a helpful source for background explanation in this context. We also evaluated the ability of an open-source large language model (Llama 2) and a closed-source large language model (GPT-4) at background explanation, with and without retrieval augmentation. Results indicate that these LLMs can generate simplified content, but that summary quality is not ideal. Taken together, this work presents the first comprehensive study of background explanation for lay language generation, paving the way for disseminating scientific knowledge to a broader audience. Our code and data are publicly available at: https://github.com/LinguisticAnomalies/pls_retrieval.
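To make the retrieval-augmentation idea concrete, below is a minimal Python sketch of the embedding-based variant described in the abstract: background passages are embedded offline, the passages most similar to a given source abstract are retrieved, and the retrieved text is prepended to the input of a generation model. This is not the paper's implementation; the encoder (all-MiniLM-L6-v2), the summarization model (facebook/bart-large-cnn), the toy passage list, and the prompt format are all illustrative assumptions. See the linked repository for the authors' actual code.

```python
# Minimal sketch of embedding-based retrieval-augmented lay language
# generation (RALL). Not the paper's pipeline: the models, the toy
# "Wikipedia" passage list, and top_k are illustrative choices only.
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# Toy stand-in for an index of Wikipedia explanations; a real system
# would embed and index full Wikipedia documents offline.
passages = [
    "Hypertension, or high blood pressure, is a condition in which the "
    "force of blood against artery walls is persistently too high.",
    "A randomized controlled trial is a study that assigns participants "
    "to groups by chance in order to compare treatments fairly.",
    "The placebo effect is an improvement in symptoms caused by a "
    "treatment that contains no active ingredient.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
passage_emb = encoder.encode(passages, normalize_embeddings=True)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def retrieve(source_text: str, top_k: int = 2) -> list[str]:
    """Return the top_k background passages most similar to the input."""
    query_emb = encoder.encode([source_text], normalize_embeddings=True)
    scores = passage_emb @ query_emb[0]  # cosine similarity (unit vectors)
    best = np.argsort(scores)[::-1][:top_k]
    return [passages[i] for i in best]

def generate_lay_summary(source_text: str) -> str:
    """Prepend retrieved background to the input, then generate."""
    background = " ".join(retrieve(source_text))
    augmented = f"Background: {background} Abstract: {source_text}"
    result = summarizer(augmented, max_length=120, min_length=30)
    return result[0]["summary_text"]

if __name__ == "__main__":
    abstract = (
        "In a randomized controlled trial of 240 adults with hypertension, "
        "the intervention group showed a significant reduction in systolic "
        "blood pressure relative to placebo."
    )
    print(generate_lay_summary(abstract))
```

In the paper's other variant, retrieval instead targets term definitions from the UMLS and Wikipedia; per the abstract, it was the embedding-based approach sketched above that improved summary quality and simplicity while maintaining factual correctness.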

