Data integration of structured and unstructured sources for assigning clinical codes to patient stays.

Authors

Scheurwegs Elyne, Luyckx Kim, Luyten Léon, Daelemans Walter, Van den Bulcke Tim

Affiliations

ADReM (Advanced Database Research and Modelling), Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp, Antwerp, Belgium

Department of Medical Information, Antwerp University Hospital, Antwerp, Belgium.

Publication information

J Am Med Inform Assoc. 2016 Apr;23(e1):e11-9. doi: 10.1093/jamia/ocv115. Epub 2015 Aug 27.

DOI: 10.1093/jamia/ocv115
PMID: 26316458
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4954635/

Abstract

OBJECTIVE

Enormous amounts of healthcare data are becoming increasingly accessible through the large-scale adoption of electronic health records. In this work, structured and unstructured (textual) data are combined to assign clinical diagnostic and procedural codes (specifically ICD-9-CM) to patient stays. We investigate whether integrating these heterogeneous data types improves prediction strength compared to using the data types in isolation.

METHODS

Two separate data integration approaches were evaluated. Early data integration combines features of several sources within a single model, and late data integration learns a separate model per data source and combines these predictions with a meta-learner. This is evaluated on data sources and clinical codes from a broad set of medical specialties.
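
The difference between the two approaches can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the learners, the features, or the training protocol, so the sketch uses a small logistic regression written in plain Python, two synthetic sources ("structured" and "text"), a single binary code, and a simple held-out split for fitting the meta-learner.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(X, y, epochs=300, lr=0.5):
    """Fit a logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def concat(sources):
    # Early integration input: one feature vector per stay, all sources joined.
    return [[x for row in rows for x in row] for rows in zip(*sources)]

def source_predictions(base_models, sources):
    # Late integration input: one predicted probability per source, per stay.
    cols = [predict_proba(m, S) for m, S in zip(base_models, sources)]
    return [list(row) for row in zip(*cols)]

def make_stays(n, rng):
    """Synthetic stays (hypothetical data): two sources that each carry partial signal."""
    structured, text, y = [], [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        shift = 1.0 if label else -1.0
        structured.append([rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)])
        text.append([rng.gauss(0.8 * shift, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return [structured, text], y

rng = random.Random(0)
train_a, y_a = make_stays(100, rng)   # fits the per-source base models
train_b, y_b = make_stays(100, rng)   # fits the meta-learner
test, y_test = make_stays(100, rng)

# Early integration: a single model over the concatenated features.
early_X = concat([a + b for a, b in zip(train_a, train_b)])
early_model = train_logreg(early_X, y_a + y_b)
early_probs = predict_proba(early_model, concat(test))

# Late integration: one model per source, combined by a meta-learner.
base_models = [train_logreg(S, y_a) for S in train_a]
meta_model = train_logreg(source_predictions(base_models, train_b), y_b)
late_probs = predict_proba(meta_model, source_predictions(base_models, test))
```

In this sketch the meta-learner is fit on stays the base models never saw, so that it does not simply learn to trust overfitted base predictions. Cross-validated out-of-fold predictions are a common alternative when data are scarce. Assigning many codes per stay would repeat the same pattern once per code.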

RESULTS

When compared with the best individual prediction source, late data integration leads to improvements in predictive power (eg, overall F-measure increased from 30.6% to 38.3% for International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnostic codes), while early data integration is less consistent. The predictive strength strongly differs between medical specialties, both for ICD-9-CM diagnostic and procedural codes.
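
For readers unfamiliar with the metric, the following sketch shows one common way to compute an overall F-measure when each stay carries a set of codes. It assumes micro-averaging (pooling true positives over all stays), which the abstract does not specify; the function name and the example code sets are hypothetical and are not taken from the study.

```python
def micro_f_measure(gold, predicted):
    """Micro-averaged F-measure over per-stay sets of codes (assumed averaging scheme)."""
    true_pos = sum(len(g & p) for g, p in zip(gold, predicted))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_predicted
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)

# Illustrative code sets for three stays (not data from the paper).
gold = [{"428.0", "401.9"}, {"250.00"}, {"486", "428.0"}]
predicted = [{"428.0"}, {"250.00", "401.9"}, {"486"}]

score = micro_f_measure(gold, predicted)  # precision 3/4, recall 3/5
```

On the reported figures, moving from 30.6% to 38.3% is a gain of 7.7 percentage points, or roughly a 25% relative improvement over the best individual source.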

DISCUSSION

Structured data provides complementary information to unstructured data (and vice versa) for predicting ICD-9-CM codes. This can be captured most effectively by the proposed late data integration approach.

CONCLUSIONS

We demonstrated that models using multiple electronic health record data sources systematically outperform models using data sources in isolation in the task of predicting ICD-9-CM codes over a broad range of medical specialties.

