Dense Annotation of Free-Text Critical Care Discharge Summaries from an Indian Hospital and Associated Performance of a Clinical NLP Annotator.

Author information

Ramanan S V, Radhakrishna Kedar, Waghmare Abijeet, Raj Tony, Nathan Senthil P, Sreerama Sai Madhukar, Sampath Sriram

Affiliations

RelAgent Technologies (P) Limited, IIT Madras Research Park, #14, 1st Floor, Taramani, Chennai, 600113, India.

Division of Medical Informatics, St. John's Research Institute, 100 Feet Road, Koramangala, Bangalore, 560034, India.

Publication information

J Med Syst. 2016 Aug;40(8):187. doi: 10.1007/s10916-016-0541-2. Epub 2016 Jun 24.

DOI:10.1007/s10916-016-0541-2

PMID:27342107

Abstract

Electronic Health Record (EHR) use in India is generally poor, and structured clinical information is mostly lacking. This work is the first attempt aimed at evaluating unstructured text mining for extracting relevant clinical information from Indian clinical records. We annotated a corpus of 250 discharge summaries from an Intensive Care Unit (ICU) in India, with markups for diseases, procedures, and lab parameters, their attributes, as well as key demographic information and administrative variables such as patient outcomes. In this process, we have constructed guidelines for an annotation scheme useful to clinicians in the Indian context. We evaluated the performance of an NLP engine, Cocoa, on a cohort of these Indian clinical records. We have produced an annotated corpus of roughly 90 thousand words, which to our knowledge is the first tagged clinical corpus from India. Cocoa was evaluated on a test corpus of 50 documents. The overlap F-scores across the major categories, namely disease/symptoms, procedures, laboratory parameters and outcomes, are 0.856, 0.834, 0.961 and 0.872 respectively. These results are competitive with results from recent shared tasks based on US records. The annotated corpus and associated results from the Cocoa engine indicate that unstructured text mining is a viable method for cohort analysis in the Indian clinical context, where structured EHR records are largely absent.
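The abstract reports overlap F-scores per category but does not spell out the matching criterion. The following is a minimal sketch of one common way such a score could be computed, under stated assumptions: annotations are represented as hypothetical `(start, end, label)` character spans, and a span counts as a hit when it shares at least one character with a same-label span on the other side. The names `Span`, `spans_overlap`, and `overlap_f_score` and the toy data are illustrative only and are not taken from the paper or from the Cocoa engine.

```python
from typing import List, Tuple

# Assumed representation: (start offset, end offset exclusive, category label)
Span = Tuple[int, int, str]


def spans_overlap(a: Span, b: Span) -> bool:
    """Return True if two spans share a label and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def overlap_f_score(gold: List[Span], predicted: List[Span]) -> float:
    """Overlap-based F1 score (an assumed, relaxed matching criterion)."""
    if not gold or not predicted:
        return 0.0
    # Precision: fraction of predicted spans that overlap some gold span.
    matched_pred = sum(any(spans_overlap(p, g) for g in gold) for p in predicted)
    # Recall: fraction of gold spans overlapped by some predicted span.
    matched_gold = sum(any(spans_overlap(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted)
    recall = matched_gold / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up offsets, not data from the corpus).
gold = [(0, 9, "disease"), (15, 24, "procedure"), (30, 40, "lab")]
pred = [(2, 9, "disease"), (15, 20, "lab"), (30, 40, "lab")]
score = overlap_f_score(gold, pred)
```

In this toy case, two of the three predicted spans and two of the three gold spans are matched, so precision, recall, and the F-score all equal 2/3. An exact-match criterion would be stricter and would also reject the partially overlapping disease span.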
