Generalization of Machine Learning Approaches to Identify Notifiable Conditions from a Statewide Health Information Exchange.

Authors

Dexter Gregory P, Grannis Shaun J, Dixon Brian E, Kasthurirathne Suranga N

Affiliations

Center for Biomedical Informatics, Regenstrief Institute, Indianapolis, IN, USA.

Indiana University School of Medicine, Indianapolis, IN, USA.

Publication information

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:152-161. eCollection 2020.

PMID: 32477634

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7233074/

Abstract

Healthcare analytics is impeded by a lack of machine learning (ML) model generalizability: the ability of a model to predict accurately on varied data sources not included in the model's training dataset. We leveraged free-text laboratory data from a Health Information Exchange network to evaluate ML generalization, using Notifiable Condition Detection (NCD) for public health surveillance as a use case. We 1) built ML models for detecting syphilis, salmonella, and histoplasmosis; 2) evaluated the generalizability of these models across data from holdout lab systems; and 3) explored factors that contribute to weak model generalizability. Models for predicting each disease achieved considerable accuracy. However, they generalized poorly to data from the holdout lab systems being tested. Our evaluation determined that weak generalization was influenced by the variant syntactic nature of free-text datasets across lab systems. Results highlight the need for actionable methodology to generalize ML solutions for healthcare analytics.
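The holdout evaluation described in the abstract can be illustrated with a minimal leave-one-lab-system-out sketch. Everything here is an assumption made for illustration, not the authors' implementation: the record layout (`source`, `text`, `label`), the toy reports in `RECORDS`, and the token-count classifier, which is a deliberately simple stand-in for the paper's ML models.

```python
from collections import Counter, defaultdict

# Hypothetical toy reports: three lab systems that phrase the same result differently.
RECORDS = [
    {"source": "lab_a", "text": "salmonella culture positive", "label": 1},
    {"source": "lab_a", "text": "salmonella culture negative", "label": 0},
    {"source": "lab_b", "text": "culture result positive for salmonella", "label": 1},
    {"source": "lab_b", "text": "culture result negative for salmonella", "label": 0},
    {"source": "lab_c", "text": "SALM: POS", "label": 1},
    {"source": "lab_c", "text": "SALM: NEG", "label": 0},
]


def leave_one_source_out(records):
    """Yield (held_out_source, train, test), holding out each source once."""
    sources = sorted({r["source"] for r in records})
    for held_out in sources:
        train = [r for r in records if r["source"] != held_out]
        test = [r for r in records if r["source"] == held_out]
        yield held_out, train, test


def train_token_model(train):
    """Toy stand-in classifier: count how often each token appears per label."""
    counts = defaultdict(Counter)
    for r in train:
        counts[r["label"]].update(r["text"].lower().split())
    return counts


def predict(model, text):
    """Pick the label whose training reports shared the most tokens with text."""
    tokens = text.lower().split()
    scores = {label: sum(c[t] for t in tokens) for label, c in model.items()}
    # Ties resolve to the smallest label because candidates are visited in sorted order.
    return max(sorted(scores), key=scores.get)


def evaluate(records):
    """Return accuracy on each held-out source, training on all the others."""
    results = {}
    for held_out, train, test in leave_one_source_out(records):
        model = train_token_model(train)
        correct = sum(predict(model, r["text"]) == r["label"] for r in test)
        results[held_out] = correct / len(test)
    return results


if __name__ == "__main__":
    for source, accuracy in evaluate(RECORDS).items():
        print(source, accuracy)
```

In this toy data the abbreviated `lab_c` reports share no tokens with the other two labs, so the score on held-out `lab_c` falls to chance while the other two labs are classified correctly. This is a small-scale analogue of the syntactic variation the abstract identifies, not a reproduction of the paper's results.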
