Clinical Context-Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation.

Affiliations

Department of Software, Sejong University, Seoul, Republic of Korea.

Department of Computer Science & Engineering, School of Engineering and Computer Science, Oakland University, Rochester, MI, United States.

Publication information

J Med Internet Res. 2020 Oct 23;22(10):e19810. doi: 10.2196/19810.

DOI:10.2196/19810
PMID:33095174
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7647812/
Abstract

BACKGROUND

Automatic text summarization (ATS) enables users to retrieve meaningful evidence from big data of biomedical repositories to make complex clinical decisions. Deep neural and recurrent networks outperform traditional machine-learning techniques in areas of natural language processing and computer vision; however, they are yet to be explored in the ATS domain, particularly for medical text summarization.

OBJECTIVE

Traditional approaches in ATS for biomedical text suffer from fundamental issues such as an inability to capture clinical context, quality of evidence, and purpose-driven selection of passages for the summary. We aimed to circumvent these limitations through achieving precise, succinct, and coherent information extraction from credible published biomedical resources, and to construct a simplified summary containing the most informative content that can offer a review particular to clinical needs.

METHODS

In our proposed approach, we introduce a novel framework, termed Biomed-Summarizer, that provides quality-aware Patient/Problem, Intervention, Comparison, and Outcome (PICO)-based intelligent and context-enabled summarization of biomedical text. Biomed-Summarizer integrates the prognosis quality recognition model with a clinical context-aware model to locate text sequences in the body of a biomedical article for use in the final summary. First, we developed a deep neural network binary classifier for quality recognition to acquire scientifically sound studies and filter out others. Second, we developed a bidirectional long short-term memory recurrent neural network as a clinical context-aware classifier, which was trained on semantically enriched features generated using a word-embedding tokenizer for identification of meaningful sentences representing PICO text sequences. Third, we calculated the similarity between query and PICO text sequences using Jaccard similarity with semantic enrichments, where the semantic enrichments are obtained using medical ontologies. Last, we generated a representative summary from the high-scoring PICO sequences aggregated by study type, publication credibility, and freshness score.
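The third step, Jaccard similarity with semantic enrichment, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which ontologies were used or how terms were expanded, so the `TOY_CONCEPTS` dictionary, the concept labels, and all function names below are hypothetical stand-ins for an ontology lookup.

```python
import re

# Hypothetical stand-in for a medical-ontology lookup: maps a surface term
# to a canonical concept label. These entries are illustrative only.
TOY_CONCEPTS = {
    "aneurysm": "C_ANEURYSM",
    "aneurysms": "C_ANEURYSM",
    "rupture": "C_RUPTURE",
    "ruptured": "C_RUPTURE",
    "coiling": "C_ENDOVASCULAR",
    "embolization": "C_ENDOVASCULAR",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def enrich(tokens, concepts=TOY_CONCEPTS):
    """Add the concept label of every token that the lookup recognizes."""
    return tokens | {concepts[t] for t in tokens if t in concepts}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def enriched_jaccard(query, sentence):
    """Jaccard similarity computed on concept-enriched token sets."""
    return jaccard(enrich(tokenize(query)), enrich(tokenize(sentence)))


query = "coiling of ruptured aneurysms"
sentence = "embolization after aneurysm rupture"

baseline = jaccard(tokenize(query), tokenize(sentence))
enriched = enriched_jaccard(query, sentence)
print(f"baseline={baseline:.3f} enriched={enriched:.3f}")
```

In this toy pair the two texts share no surface tokens, so plain Jaccard similarity is zero, while the enriched sets overlap on three shared concept labels. That is the kind of gain the enrichment is meant to provide; how large it is in practice depends on the ontology coverage.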

RESULTS

Evaluation of the prognosis quality recognition model using a large dataset of biomedical literature related to intracranial aneurysm showed an accuracy of 95.41% (2562/2686) in terms of recognizing quality articles. The clinical context-aware multiclass classifier outperformed the traditional machine-learning algorithms, including support vector machine, gradient boosted tree, linear regression, K-nearest neighbor, and naïve Bayes, by achieving 93% (16127/17341) accuracy for classifying five categories: aim, population, intervention, results, and outcome. The semantic similarity algorithm achieved a significant Pearson correlation coefficient of 0.61 (0-1 scale) on a well-known BIOSSES dataset (with 100 pair sentences) after semantic enrichment, representing an improvement of 8.9% over baseline Jaccard similarity. Finally, we found a highly positive correlation among the evaluations performed by three domain experts concerning different metrics, suggesting that the automated summarization is satisfactory.
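The semantic similarity result above is a Pearson correlation between algorithm scores and human judgments over sentence pairs. The sketch below shows how such a coefficient is computed. The function name and the toy score lists are assumptions for illustration; they are not BIOSSES data and do not reproduce the reported value.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy values only: one human rating and one system score per sentence pair.
human_scores = [0.0, 1.0, 2.0, 3.0, 4.0]
system_scores = [0.10, 0.20, 0.50, 0.60, 0.90]

r = pearson(human_scores, system_scores)
print(f"Pearson r = {r:.2f}")
```

Because the coefficient is invariant to linear rescaling, the human ratings and the system scores do not need to share a scale.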

CONCLUSIONS

By employing the proposed method Biomed-Summarizer, high accuracy in ATS was achieved, enabling seamless curation of research evidence from the biomedical literature to use for clinical decision-making.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7510/7647812/c6d2dc4a21c0/jmir_v22i10e19810_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7510/7647812/aca2ca5a0a63/jmir_v22i10e19810_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7510/7647812/d5eae7b94292/jmir_v22i10e19810_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7510/7647812/25a66c5630ea/jmir_v22i10e19810_fig4.jpg