Enhance health evidence quality in classification tasks: A triangulation approach utilizing case-based reasoning and process features.

Authors

Guo Ruihua, Smith Ross, Chen Qifan, Ritchie Angus, Poon Simon

Affiliations

School of Computer Science, The University of Sydney, Sydney, NSW, Australia.

Population Health Group, Australian Institute of Health and Welfare, Canberra, ACT, Australia.

Publication information

Digit Health. 2025 Jan 17;11:20552076251314097. doi: 10.1177/20552076251314097. eCollection 2025 Jan-Dec.

DOI:10.1177/20552076251314097
PMID:39839956
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11748077/
Abstract

OBJECTIVE

Machine learning (ML) has enabled healthcare discoveries by facilitating efficient modeling, such as for cancer screening. Unlike clinical trial data, the real-world data used in ML are often gathered for multiple purposes, leading to bias and missing information for a specific classification task. This challenge is especially pronounced in healthcare because of stringent ethical considerations and resource constraints. This study proposed an integrated approach to enhance the quality of health evidence from a classification task for predicting Medicare's Diagnosis-Related Groups of ischemic heart disease (IHD) patients.

METHODS

Eligible participants were identified from the Medical Information Mart for Intensive Care IV (MIMIC IV), a publicly available hospital database. Six ML models were selected for model triangulation. Sequential triangulation was employed via Local Process Mining (LPM) and Qualitative Comparative Analysis (QCA).
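
As a rough illustration of what comparing several models case by case can look like, the sketch below flags every case that at least one model misclassifies. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `cases_flagged_by_any_model`, the model names, and the toy labels are all hypothetical, and the abstract does not say how the six models' outputs were combined.

```python
from typing import Dict, List, Sequence, Tuple


def cases_flagged_by_any_model(
    y_true: Sequence[int],
    predictions: Dict[str, Sequence[int]],
) -> Tuple[List[int], float]:
    """Return the indices of cases misclassified by at least one model,
    plus the share of all cases that those indices represent.

    Hypothetical helper for illustration only.
    """
    n = len(y_true)
    if n == 0:
        raise ValueError("y_true must not be empty")
    for name, preds in predictions.items():
        if len(preds) != n:
            raise ValueError(f"{name} has {len(preds)} predictions, expected {n}")

    # A case is flagged as soon as any single model disagrees with the label.
    flagged = [
        i
        for i in range(n)
        if any(preds[i] != y_true[i] for preds in predictions.values())
    ]
    return flagged, len(flagged) / n


# Toy data (invented): five cases, three models instead of six.
toy_labels = [0, 1, 1, 0, 2]
toy_predictions = {
    "model_a": [0, 1, 1, 0, 2],  # no errors
    "model_b": [0, 1, 0, 0, 2],  # wrong on case 2
    "model_c": [0, 1, 1, 0, 1],  # wrong on case 4
}

flagged_cases, flagged_rate = cases_flagged_by_any_model(toy_labels, toy_predictions)
print(flagged_cases, flagged_rate)
```

Counting a case as an error when any model gets it wrong is a strict, case-level criterion: it can only be as low as the best single model's error rate and is usually higher, which is why it is a useful complement to aggregate scores such as weighted F1.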

RESULTS

A total of 1545 IHD hospitalizations from 916 patients were identified from MIMIC IV. Eight health process features were identified through LPM and aligned with clinical knowledge. The correlation coefficients for process features, ranging from 0.24 to 0.42, were higher than those for non-process features, which ranged from 0.02 to 0.36. A total of 56 unique combinations were identified from the QCA, with 28 configurations having raw coverage lower than 1.0%. The overall model performance (i.e. weighted F1 and area under the curve scores) increased after adopting this integrated approach. The proportion of cases misclassified by any of the six models decreased by 47% after incorporating process features (from 5.29% to 2.91%) and further decreased to 0.0% after applying the QCA solutions.
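
For readers unfamiliar with the raw coverage figure quoted for the QCA configurations, the sketch below computes it with the standard set-theoretic definition: the overlap between membership in a configuration and membership in the outcome, divided by total membership in the outcome. This is a minimal sketch under assumptions. The abstract does not say whether crisp or fuzzy sets were used, and the function name `raw_coverage` and the toy memberships are hypothetical.

```python
from typing import Sequence


def raw_coverage(
    config_membership: Sequence[float],
    outcome_membership: Sequence[float],
) -> float:
    """Raw coverage of one configuration: sum(min(x_i, y_i)) / sum(y_i).

    With crisp sets (memberships of 0 or 1) this reduces to the share of
    outcome cases that the configuration covers.
    """
    if len(config_membership) != len(outcome_membership):
        raise ValueError("membership vectors must have the same length")
    outcome_total = sum(outcome_membership)
    if outcome_total == 0:
        raise ValueError("no case shows the outcome, coverage is undefined")
    overlap = sum(min(x, y) for x, y in zip(config_membership, outcome_membership))
    return overlap / outcome_total


# Toy crisp-set data (invented): eight cases.
in_configuration = [1, 1, 0, 0, 1, 0, 0, 0]
has_outcome = [1, 0, 1, 1, 1, 0, 1, 0]

coverage = raw_coverage(in_configuration, has_outcome)
is_rare_configuration = coverage < 0.01  # same 1.0% threshold the abstract mentions
print(coverage, is_rare_configuration)
```

A configuration with raw coverage below 1.0% accounts for very few outcome cases, which is consistent with the abstract's remark that many of the 56 combinations describe small groups of hospitalizations.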

CONCLUSION

The integrated approach demonstrates its ability to enhance the quality of a classification task through its clinical relevance, improved model performance, and reduced case-level error rates. However, more scalable QCA methods are needed for larger datasets. Developing health process feature engineering for broader applications is a possible future direction.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d8b/11748077/ab6bbb6ed1cc/10.1177_20552076251314097-fig1.jpg

Similar articles

1
Enhance health evidence quality in classification tasks: A triangulation approach utilizing case-based reasoning and process features.
Digit Health. 2025 Jan 17;11:20552076251314097. doi: 10.1177/20552076251314097. eCollection 2025 Jan-Dec.
2
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
3
Implementation and evaluation of a multivariate abstraction-based, interval-based dynamic time-warping method as a similarity measure for longitudinal medical records.
J Biomed Inform. 2021 Nov;123:103919. doi: 10.1016/j.jbi.2021.103919. Epub 2021 Oct 8.
4
Prediction of mortality events of patients with acute heart failure in intensive care unit based on deep neural network.
Comput Methods Programs Biomed. 2024 Nov;256:108403. doi: 10.1016/j.cmpb.2024.108403. Epub 2024 Aug 30.
5
Development and Validation of a Dynamic Real-Time Risk Prediction Model for Intensive Care Units Patients Based on Longitudinal Irregular Data: Multicenter Retrospective Study.
J Med Internet Res. 2025 Apr 23;27:e69293. doi: 10.2196/69293.
6
The future of Cochrane Neonatal.
Early Hum Dev. 2020 Nov;150:105191. doi: 10.1016/j.earlhumdev.2020.105191. Epub 2020 Sep 12.
7
Developing a FHIR-based EHR phenotyping framework: A case study for identification of patients with obesity and multiple comorbidities from discharge summaries.
J Biomed Inform. 2019 Nov;99:103310. doi: 10.1016/j.jbi.2019.103310. Epub 2019 Oct 14.
8
Interpretable machine learning for 28-day all-cause in-hospital mortality prediction in critically ill patients with heart failure combined with hypertension: A retrospective cohort study based on medical information mart for intensive care database-IV and eICU databases.
Front Cardiovasc Med. 2022 Oct 12;9:994359. doi: 10.3389/fcvm.2022.994359. eCollection 2022.
9
Stroke Outcome Measurements From Electronic Medical Records: Cross-sectional Study on the Effectiveness of Neural and Nonneural Classifiers.
JMIR Med Inform. 2021 Nov 1;9(11):e29120. doi: 10.2196/29120.
10
Comparison of Machine Learning Algorithms for Predicting Hospital Readmissions and Worsening Heart Failure Events in Patients With Heart Failure With Reduced Ejection Fraction: Modeling Study.
JMIR Form Res. 2023 Apr 17;7:e41775. doi: 10.2196/41775.
