Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach.

Affiliations

Public Health Ontario (PHO), Toronto, ON, Canada.

Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.

Publication information

BMC Med Inform Decis Mak. 2023 Jan 26;23(1):20. doi: 10.1186/s12911-023-02117-3.

DOI:10.1186/s12911-023-02117-3
PMID:36703154
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9879259/
Abstract

BACKGROUND

Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data.

OBJECTIVE

This study aims to use natural language processing (NLP) to extract the key information (clinical factors, social determinants of health) from published cases in the literature.

METHODS

The proposed framework integrates a data layer for preparing a data cohort from clinical case reports; an NLP layer to find the clinical and demographic-named entities and relations in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports.
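As a rough illustration of how a data layer can feed an NLP layer, the sketch below cleans raw report text, tags entities, and pairs them into relations. It is not the authors' implementation: the lexicon, the `Entity` class, the function names, and the `presents_with` relation label are all hypothetical, and a dictionary lookup with co-occurrence pairing stands in for whatever learned models the study uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start: int
    end: int


# Hypothetical toy lexicon; a real system would use a trained NER model.
LEXICON = {
    "fever": "SYMPTOM",
    "cough": "SYMPTOM",
    "covid-19": "DISEASE",
    "woman": "SEX",
    "man": "SEX",
}


def data_layer(reports):
    """Data layer (assumed): normalize whitespace and drop empty reports."""
    cleaned = (" ".join(r.split()) for r in reports)
    return [r for r in cleaned if r]


def find_entities(text):
    """NLP layer, step 1 (assumed): case-insensitive whole-word lexicon match."""
    entities = []
    for term, label in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            entities.append(Entity(m.group(0), label, m.start(), m.end()))
    return sorted(entities, key=lambda e: e.start)


def find_relations(entities):
    """NLP layer, step 2 (assumed): link every disease to every symptom
    that co-occurs in the same report."""
    diseases = [e for e in entities if e.label == "DISEASE"]
    symptoms = [e for e in entities if e.label == "SYMPTOM"]
    return [(d.text, "presents_with", s.text) for d in diseases for s in symptoms]


if __name__ == "__main__":
    raw = ["  A 45-year-old   woman with COVID-19 presented with fever and cough. "]
    for report in data_layer(raw):
        ents = find_entities(report)
        print([(e.text, e.label) for e in ents])
        print(find_relations(ents))
```

Keeping the layers as separate functions means the toy matcher could be swapped for a trained model without touching the data preparation step.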

RESULTS

The named entity recognition implementation in the NLP layer achieves a performance gain of about 1-3% over benchmark methods. Furthermore, even without extensive data labeling, the relation extraction method outperforms benchmark methods in accuracy by 1-8%. A thorough examination reveals the disease's presence and symptom prevalence in patients.
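The abstract does not say which metric the named entity recognition gain refers to. A common choice for comparing an NER system against benchmarks is entity-level precision, recall, and F1 over exact (start, end, label) matches, and the sketch below assumes that choice. The function name and the example spans are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Entity-level scores over exact (start, end, label) matches."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical gold and predicted annotations for one report.
    gold = {(25, 33, "DISEASE"), (49, 54, "SYMPTOM"), (59, 64, "SYMPTOM")}
    predicted = {(25, 33, "DISEASE"), (49, 54, "SYMPTOM"), (14, 19, "SYMPTOM")}
    p, r, f1 = precision_recall_f1(gold, predicted)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Running the same scorer on each benchmark's output and on the proposed system gives directly comparable numbers, which is the kind of comparison a percentage-point gain implies.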

CONCLUSIONS

A similar approach can be generalized to other infectious diseases. It is worthwhile to use prior knowledge acquired through transfer learning when researching other infectious diseases.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/800edd83d45b/12911_2023_2117_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/88e56e4fb6ab/12911_2023_2117_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/59d1f04767c6/12911_2023_2117_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/950532564bf1/12911_2023_2117_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/91842336eb25/12911_2023_2117_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/5102d4c25b1d/12911_2023_2117_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/a20b9aca24e6/12911_2023_2117_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/6bf7a15c6c66/12911_2023_2117_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7752/9881321/e33bdb653c20/12911_2023_2117_Fig9_HTML.jpg

Similar articles

1
Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach.
BMC Med Inform Decis Mak. 2023 Jan 26;23(1):20. doi: 10.1186/s12911-023-02117-3.
2
Constructing a disease database and using natural language processing to capture and standardize free text clinical information.
Sci Rep. 2023 May 26;13(1):8591. doi: 10.1038/s41598-023-35482-0.
3
Extracting entities with attributes in clinical text via joint deep learning.
J Am Med Inform Assoc. 2019 Dec 1;26(12):1584-1591. doi: 10.1093/jamia/ocz158.
4
Entity recognition from clinical texts via recurrent neural network.
BMC Med Inform Decis Mak. 2017 Jul 5;17(Suppl 2):67. doi: 10.1186/s12911-017-0468-7.
5
Natural language processing (NLP) tools in extracting biomedical concepts from research articles: a case study on autism spectrum disorder.
BMC Med Inform Decis Mak. 2020 Dec 30;20(Suppl 11):322. doi: 10.1186/s12911-020-01352-2.
6
A Joint Extraction System Based on Conditional Layer Normalization for Health Monitoring.
Sensors (Basel). 2023 May 16;23(10):4812. doi: 10.3390/s23104812.
7
Clinical Application of Detecting COVID-19 Risks: A Natural Language Processing Approach.
Viruses. 2022 Dec 11;14(12):2761. doi: 10.3390/v14122761.
8
Active learning for ontological event extraction incorporating named entity recognition and unknown word handling.
J Biomed Semantics. 2016 Apr 27;7:22. doi: 10.1186/s13326-016-0059-z. eCollection 2016.
9
Extracting clinical named entity for pituitary adenomas from Chinese electronic medical records.
BMC Med Inform Decis Mak. 2022 Mar 23;22(1):72. doi: 10.1186/s12911-022-01810-z.
10
Named Entity Recognition and Relation Detection for Biomedical Information Extraction.
Front Cell Dev Biol. 2020 Aug 28;8:673. doi: 10.3389/fcell.2020.00673. eCollection 2020.

Cited by

1
Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study.
J Med Internet Res. 2025 Jun 3;27:e75052. doi: 10.2196/75052.
2
Applied artificial intelligence in dentistry: emerging data modalities and modeling approaches.
Front Artif Intell. 2024 Jul 23;7:1427517. doi: 10.3389/frai.2024.1427517. eCollection 2024.
3
Exploring COVID-related relationship extraction: Contrasting data sources and analyzing misinformation.
Heliyon. 2024 Feb 28;10(5):e26973. doi: 10.1016/j.heliyon.2024.e26973. eCollection 2024 Mar 15.
4
A framework for multi-faceted content analysis of social media chatter regarding non-medical use of prescription medications.
BMC Digit Health. 2023;1. doi: 10.1186/s44247-023-00029-w. Epub 2023 Aug 7.

References

1
Bilateral collagenous fibroma of the hard palate: a case report and review of the literature.
J Med Case Rep. 2023 Jan 7;17(1):5. doi: 10.1186/s13256-022-03691-2.
2
CoQUAD: a COVID-19 question answering dataset system, facilitating research, benchmarking, and practice.
BMC Bioinformatics. 2022 Jun 2;23(1):210. doi: 10.1186/s12859-022-04751-6.
3
Quantifying the effects of the COVID-19 pandemic on gender equality on health, social, and economic indicators: a comprehensive review of data from March, 2020, to September, 2021.
Lancet. 2022 Jun 25;399(10344):2381-2397. doi: 10.1016/S0140-6736(22)00008-3. Epub 2022 Mar 2.
4
Case Report: Overlap Between Long COVID and Functional Neurological Disorders.
Front Neurol. 2022 Jan 28;12:811276. doi: 10.3389/fneur.2021.811276. eCollection 2021.
5
A Deep Language Model for Symptom Extraction From Clinical Text and its Application to Extract COVID-19 Symptoms From Social Media.
IEEE J Biomed Health Inform. 2022 Apr;26(4):1737-1748. doi: 10.1109/JBHI.2021.3123192. Epub 2022 Apr 14.
6
Extracting social determinants of health from electronic health records using natural language processing: a systematic review.
J Am Med Inform Assoc. 2021 Nov 25;28(12):2716-2727. doi: 10.1093/jamia/ocab170.
7
Economic impact of COVID-19 pandemic on healthcare facilities and systems: International perspectives.
Best Pract Res Clin Anaesthesiol. 2021 Oct;35(3):293-306. doi: 10.1016/j.bpa.2020.11.009. Epub 2020 Nov 17.
8
COVID-19 patient with an incubation period of 27 d: A case report.
World J Clin Cases. 2021 Jul 26;9(21):5955-5962. doi: 10.12998/wjcc.v9.i21.5955.
9
Long COVID, a comprehensive systematic scoping review.
Infection. 2021 Dec;49(6):1163-1186. doi: 10.1007/s15010-021-01666-x. Epub 2021 Jul 28.
10
Extracting COVID-19 diagnoses and symptoms from clinical text: A new annotated corpus and neural event extraction framework.
J Biomed Inform. 2021 May;117:103761. doi: 10.1016/j.jbi.2021.103761. Epub 2021 Mar 26.