Medical Text Simplification Using Reinforcement Learning (TESLEA): Deep Learning-Based Text Simplification Approach.

Authors

Phatak Atharva, Savage David W, Ohle Robert, Smith Jonathan, Mago Vijay

Affiliations

Department of Computer Science, Lakehead University, Thunder Bay, ON, Canada.

NOSM University, Thunder Bay, ON, Canada.

Publication information

JMIR Med Inform. 2022 Nov 18;10(11):e38095. doi: 10.2196/38095.

DOI:10.2196/38095

PMID:36399375

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9719064/

Abstract

BACKGROUND

In most cases, the abstracts of articles in the medical domain are publicly available. Although they are accessible to everyone, their complex medical vocabulary makes them hard for a wider audience to comprehend. Simplifying these abstracts is therefore essential to making medical research accessible to the general public.

OBJECTIVE

This study aims to develop a deep learning-based text simplification (TS) approach that converts complex medical text into a simpler version while maintaining the quality of the generated text.

METHODS

A TS approach using reinforcement learning and transformer-based language models was developed. Three rewards (a relevance reward, a Flesch-Kincaid reward, and a lexical simplicity reward) were optimized to simplify jargon-dense medical paragraphs while retaining the quality of the text. The model was trained on 3568 complex-simple medical paragraph pairs and evaluated on 480 paragraphs using automated metrics and human annotation.
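
To make the Flesch-Kincaid reward more concrete, the sketch below computes the standard Flesch-Kincaid grade level, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, and turns it into a reward. The abstract does not specify how the reward is shaped, so `fk_reward` (the relative drop in grade level, clipped to [0, 1]) is a hypothetical choice, and the vowel-group syllable counter is a rough heuristic rather than the authors' implementation.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): one syllable per group of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # A trailing silent "e" usually adds no syllable ("care"), except in "-le" ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Standard Flesch-Kincaid grade level; lower means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def fk_reward(source: str, output: str) -> float:
    """Hypothetical reward: relative reduction in grade level, clipped to [0, 1]."""
    src = flesch_kincaid_grade(source)
    out = flesch_kincaid_grade(output)
    if src <= 0:
        return 0.0
    return min(max((src - out) / src, 0.0), 1.0)


if __name__ == "__main__":
    complex_text = "Myocardial infarction necessitates immediate pharmacological intervention."
    simple_text = "A heart attack needs fast treatment."
    print(flesch_kincaid_grade(complex_text), flesch_kincaid_grade(simple_text))
    print(fk_reward(complex_text, simple_text))
```

In a reinforcement learning setup, a scalar like this would be combined with the relevance and lexical simplicity rewards; how the three are weighted is not stated in the abstract.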

RESULTS

The proposed method outperformed previous baselines on Flesch-Kincaid scores (11.84) and achieved performance comparable to other baselines on ROUGE-1 (0.39), ROUGE-2 (0.11), and SARI (0.40). In manual evaluation, percentage agreement between human annotators exceeded 70% on fluency, coherence, and adequacy.
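
As a rough illustration of two of the quantities reported here, the sketch below computes a simplified ROUGE-N score and a pairwise percentage agreement. It makes several assumptions: ROUGE is taken as the F1 over n-gram overlap with lowercase whitespace tokenization and no stemming (the abstract does not say which ROUGE variant or toolkit was used), and agreement is the share of items on which two annotators gave the same label. The function names are illustrative only.

```python
from collections import Counter
from typing import Sequence


def ngrams(tokens: Sequence[str], n: int) -> Counter:
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: clipped n-gram overlap between reference and candidate."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # Counter intersection clips repeated n-grams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def percent_agreement(labels_a: Sequence, labels_b: Sequence) -> float:
    """Percentage of items on which two annotators assigned the same label."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(rouge_n(reference, candidate, 1), rouge_n(reference, candidate, 2))
    print(percent_agreement([1, 2, 3, 4], [1, 2, 3, 5]))
```

Percentage agreement does not correct for chance agreement, so it tends to read higher than chance-corrected statistics such as Cohen's kappa on the same annotations.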

CONCLUSIONS

A medical TS approach was successfully developed that leverages reinforcement learning to simplify complex medical paragraphs accurately, thereby increasing their readability. The proposed approach can be applied to automatically generate simplified versions of complex medical text, which would make biomedical research more accessible to a wider audience.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f10f/9719064/aedb8734ba03/medinform_v10i11e38095_fig1.jpg

Similar articles

Medical Text Simplification Using Reinforcement Learning (TESLEA): Deep Learning-Based Text Simplification Approach.

JMIR Med Inform. 2022 Nov 18;10(11):e38095. doi: 10.2196/38095.

SATS: simplification aware text summarization of scientific documents.

Front Artif Intell. 2024 Jul 10;7:1375419. doi: 10.3389/frai.2024.1375419. eCollection 2024.

Health text simplification: An annotated corpus for digestive cancer education and novel strategies for reinforcement learning.

J Biomed Inform. 2024 Oct;158:104727. doi: 10.1016/j.jbi.2024.104727. Epub 2024 Sep 16.

User evaluation of the effects of a text simplification algorithm using term familiarity on perception, understanding, learning, and information retention.

J Med Internet Res. 2013 Jul 31;15(7):e144. doi: 10.2196/jmir.2569.

A user-study measuring the effects of lexical simplification and coherence enhancement on perceived and actual text difficulty.

Int J Med Inform. 2013 Aug;82(8):717-30. doi: 10.1016/j.ijmedinf.2013.03.001. Epub 2013 Apr 29.

Research on automatic pilot repetition generation method based on deep reinforcement learning.

Front Neurorobot. 2023 Oct 11;17:1285831. doi: 10.3389/fnbot.2023.1285831. eCollection 2023.

Paragraph-level Simplification of Medical Texts.

Proc Conf. 2021 Jun;2021:4972-4984. doi: 10.18653/v1/2021.naacl-main.395.

It's not just a phase: Investigating text simplification in a second language from a process and product perspective.

Front Artif Intell. 2022 Sep 12;5:983008. doi: 10.3389/frai.2022.983008. eCollection 2022.

Transformer-based active learning for multi-class text annotation and classification.

Digit Health. 2024 Oct 17;10:20552076241287357. doi: 10.1177/20552076241287357. eCollection 2024 Jan-Dec.

Towards more patient friendly clinical notes through language models and ontologies.

AMIA Annu Symp Proc. 2022 Feb 21;2021:881-890. eCollection 2021.

Cited by

The use of large language models to enhance cancer clinical trial educational materials.

JNCI Cancer Spectr. 2025 Mar 3;9(2). doi: 10.1093/jncics/pkaf021.

A review of reinforcement learning for natural language processing and applications in healthcare.

J Am Med Inform Assoc. 2024 Oct 1;31(10):2379-2393. doi: 10.1093/jamia/ocae215.

Text and Audio Simplification: Human vs. ChatGPT.

AMIA Jt Summits Transl Sci Proc. 2024 May 31;2024:295-304. eCollection 2024.

Biomedical text readability after hypernym substitution with fine-tuned large language models.

PLOS Digit Health. 2024 Apr 16;3(4):e0000489. doi: 10.1371/journal.pdig.0000489. eCollection 2024 Apr.

Year 2022 in Medical Natural Language Processing: Availability of Language Models as a Step in the Democratization of NLP in the Biomedical Area.

Yearb Med Inform. 2023 Aug;32(1):244-252. doi: 10.1055/s-0043-1768752. Epub 2023 Dec 26.

References

Paragraph-level Simplification of Medical Texts.

Proc Conf. 2021 Jun;2021:4972-4984. doi: 10.18653/v1/2021.naacl-main.395.

Clinical Context-Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation.

J Med Internet Res. 2020 Oct 23;22(10):e19810. doi: 10.2196/19810.

Clinical Text Data in Machine Learning: Systematic Review.

JMIR Med Inform. 2020 Mar 31;8(3):e17984. doi: 10.2196/17984.
