Generalization of Machine Learning Approaches to Identify Notifiable Conditions from a Statewide Health Information Exchange.

Authors

Dexter Gregory P, Grannis Shaun J, Dixon Brian E, Kasthurirathne Suranga N

Affiliations

Center for Biomedical Informatics, Regenstrief Institute, Indianapolis, IN, USA.

Indiana University School of Medicine, Indianapolis, IN, USA.

Publication information

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:152-161. eCollection 2020.

PMID: 32477634
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7233074/
Abstract

Healthcare analytics is impeded by a lack of machine learning (ML) model generalizability, the ability of a model to predict accurately on varied data sources not included in the model's training dataset. We leveraged free-text laboratory data from a Health Information Exchange network to evaluate ML generalization using Notifiable Condition Detection (NCD) for public health surveillance as a use case. We 1) built ML models for detecting syphilis, salmonella, and histoplasmosis; 2) evaluated generalizability of these models across data from holdout lab systems; and 3) explored factors that influence weak model generalizability. Models for predicting each disease reported considerable accuracy. However, they demonstrated poor generalizability across data from the holdout lab systems being tested. Our evaluation determined that weak generalization was influenced by the variant syntactic nature of free-text datasets across each lab system. Results highlight the need for actionable methodology to generalize ML solutions for healthcare analytics.
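
The holdout evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the classifier or the features, so a small bag-of-words multinomial Naive Bayes is swapped in here, and the lab identifiers (`lab_a`, `lab_b`, `lab_c`), the report texts, and the labels are all hypothetical toy data invented for illustration. The sketch trains on every lab system except one, scores the model on the held-out lab, and repeats for each lab.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (lab_id, free-text result, label), 1 = salmonella reported.
# lab_c deliberately uses a different syntax from lab_a and lab_b.
RECORDS = [
    ("lab_a", "salmonella species isolated", 1),
    ("lab_a", "salmonella detected in stool culture", 1),
    ("lab_a", "no enteric pathogens isolated", 0),
    ("lab_a", "normal flora no growth", 0),
    ("lab_b", "culture positive salmonella detected", 1),
    ("lab_b", "salmonella isolated", 1),
    ("lab_b", "no growth after 48 hours", 0),
    ("lab_b", "no enteric pathogens detected", 0),
    ("lab_c", "SALM SP POS", 1),
    ("lab_c", "SALM POS", 1),
    ("lab_c", "NEG", 0),
    ("lab_c", "NEG NO PATH", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_nb(samples):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return word_counts, label_counts, vocab


def predict_nb(model, text):
    """Return the most probable label; tokens unseen in training are ignored."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        # Laplace (add-one) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def leave_one_lab_out(records):
    """Train on all labs but one; return {held_out_lab: accuracy on that lab}."""
    results = {}
    for held_out in sorted({lab for lab, _, _ in records}):
        train = [(text, y) for lab, text, y in records if lab != held_out]
        test = [(text, y) for lab, text, y in records if lab == held_out]
        model = train_nb(train)
        correct = sum(predict_nb(model, text) == y for text, y in test)
        results[held_out] = correct / len(test)
    return results


if __name__ == "__main__":
    for lab, accuracy in leave_one_lab_out(RECORDS).items():
        print(f"held-out {lab}: accuracy {accuracy:.2f}")
```

On this toy data the held-out `lab_c`, whose abbreviations share almost no tokens with the other two labs, scores lower than `lab_a` and `lab_b`. That mirrors, in miniature only, the syntactic-variation effect the abstract reports; the numbers say nothing about the study's actual results.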
