Extracting Family History of Patients From Clinical Narratives: Exploring an End-to-End Solution With Deep Learning Models.

Authors

Yang Xi, Zhang Hansi, He Xing, Bian Jiang, Wu Yonghui

Affiliations

Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States.

Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, United States.

Publication information

JMIR Med Inform. 2020 Dec 15;8(12):e22982. doi: 10.2196/22982.

DOI:10.2196/22982
PMID:33320104
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7772072/

Abstract

BACKGROUND

Patients' family history (FH) is a critical risk factor associated with numerous diseases. However, FH information is not well captured in the structured database but often documented in clinical narratives. Natural language processing (NLP) is the key technology to extract patients' FH from clinical narratives. In 2019, the National NLP Clinical Challenge (n2c2) organized shared tasks to solicit NLP methods for FH information extraction.

OBJECTIVE

This study presents our end-to-end FH extraction system developed during the 2019 n2c2 open shared task as well as the new transformer-based models that we developed after the challenge. We seek to develop a machine learning-based solution for FH information extraction without task-specific rules created by hand.

METHODS

We developed deep learning-based systems for FH concept extraction and relation identification. We explored deep learning models, including long short-term memory-conditional random fields (LSTM-CRF) and bidirectional encoder representations from transformers (BERT), and developed ensemble models using a majority voting strategy. To further optimize performance, we systematically compared 3 different strategies for using BERT output representations in relation identification.
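
As an illustration of the majority voting strategy mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each model emits one label per token over the same tokenization; the function name `majority_vote`, the tag names in the usage note, and the tie-breaking rule (prefer the earliest model in the list) are all hypothetical choices made for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-token label sequences from several models by majority vote.

    predictions: list of label sequences, one per model, all of equal length.
    Ties go to the label proposed by the earliest model in the list
    (an assumption for this sketch, not a rule stated in the abstract).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same tokens")

    voted = []
    for labels in zip(*predictions):  # labels for one token, in model order
        counts = Counter(labels)
        top = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        voted.append(next(label for label in labels if counts[label] == top))
    return voted
```

For example, with three hypothetical models labelling three tokens, `majority_vote([["B-FM", "O", "O"], ["B-FM", "B-OBS", "O"], ["O", "B-OBS", "O"]])` keeps the label that at least two models agree on for each token.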

RESULTS

Our system was among the top-ranked systems (3 out of 21) in the challenge. Our best system achieved micro-averaged F1 scores of 0.7944 and 0.6544 for concept extraction and relation identification, respectively. After the challenge, we further explored new transformer-based models and improved the performance on the two subtasks to 0.8249 and 0.6775, respectively. For relation identification, our system achieved a performance comparable to that of the best system (0.6810) reported in the challenge.
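
For readers unfamiliar with the metric, a micro-averaged F1 score pools true positives, false positives, and false negatives across all classes before computing precision and recall. The sketch below shows that calculation under assumed inputs; the function name `micro_f1` and the per-class count format are hypothetical, and this is not the official challenge scorer.

```python
def micro_f1(counts):
    """Micro-averaged F1 from per-class (tp, fp, fn) counts.

    counts: dict mapping a class name to a (tp, fp, fn) tuple.
    Counts are summed over all classes first, then precision,
    recall, and F1 are computed once on the pooled totals.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the counts are pooled, frequent classes weigh more heavily in the score than rare ones, which is the usual reason to report micro rather than macro averages when class sizes differ.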

CONCLUSIONS

This study demonstrated the feasibility of utilizing deep learning methods to extract FH information from clinical narratives.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd57/7772072/3d116b7de642/medinform_v8i12e22982_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd57/7772072/d9fccfd7ae2d/medinform_v8i12e22982_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd57/7772072/c9c21d2774e5/medinform_v8i12e22982_fig3.jpg
