Med-MGF: multi-level graph-based framework for handling medical data imbalance and representation.

Affiliations

Department of Industrial Engineering and Management, National University of Singapore, Singapore, 117576, Singapore.

Children's Emergency, KK Women's and Children's Hospital, Singapore, 229899, Singapore.

Publication information

BMC Med Inform Decis Mak. 2024 Sep 2;24(1):242. doi: 10.1186/s12911-024-02649-2.

DOI:10.1186/s12911-024-02649-2

PMID:39223567

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11367907/

Abstract

BACKGROUND

Modeling patient data, particularly electronic health records (EHR), is one of the major focuses of machine learning studies in healthcare, as these records provide clinicians with valuable information that can potentially assist them in disease diagnosis and decision-making.

METHODS

In this study, we present a multi-level graph-based framework called MedMGF, which models both patient medical profiles extracted from EHR data and the relationship network among those profiles in a single architecture. The medical profiles consist of several layers of data embedding derived from interval records obtained during hospitalization, and the patient-patient network is created by measuring the similarities between these profiles. We also propose a modification to the Focal Loss (FL) function to improve classification performance on imbalanced datasets without the need to impute the data. MedMGF's performance was evaluated against several Graph Convolutional Network (GCN) baseline models implemented with Binary Cross Entropy (BCE), FL, a class-balancing parameter, and Synthetic Minority Oversampling Technique (SMOTE).
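As an illustration of the two ingredients named above, the sketch below builds a patient-patient graph from profile vectors and computes the standard binary focal loss. It is not the authors' implementation: cosine similarity, the k-nearest-neighbour linking rule, the function names, and the defaults `gamma=2.0` and `alpha=0.25` are all assumptions chosen for illustration. The loss shown is the unmodified focal loss, since the abstract does not specify the proposed modification.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_graph(profiles, k=2):
    """Link each patient to its k most similar patients.

    Assumption: cosine similarity and a k-NN rule; the abstract only says the
    network is built by measuring similarities between profiles.
    Returns a sorted list of undirected edges (i, j) with i < j.
    """
    edges = set()
    for i, u in enumerate(profiles):
        sims = sorted(
            ((cosine(u, v), j) for j, v in enumerate(profiles) if j != i),
            reverse=True,
        )
        for _, j in sims[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    This is the standard form, not the modified variant proposed in the paper.
    probs are predicted probabilities of the positive class; labels are 0 or 1.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p_t = p if y == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
        p_t = min(max(p_t, eps), 1.0)             # avoid log(0)
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    # Three toy patient profiles (hypothetical two-dimensional embeddings).
    toy_profiles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    print(knn_graph(toy_profiles, k=1))
    print(focal_loss([0.9, 0.2, 0.6], [1, 0, 1]))
```

With `gamma=0` and `alpha=0.5` the function reduces to half the binary cross-entropy, which gives a quick sanity check. Larger `gamma` shrinks the contribution of well-classified examples, which is why focal loss is commonly used on imbalanced data.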

RESULTS

Our proposed framework achieved high classification performance (AUC: 0.8098, ACC: 0.7503, SEN: 0.8750, SPE: 0.7445, NPV: 0.9923, PPV: 0.1367) on an extremely imbalanced pediatric sepsis dataset (n=3,014, imbalance ratio of 0.047). It yielded an improvement of 3.81 percentage points in AUC and 15 percentage points in SEN compared to the baseline GCN+FL (AUC: 0.7717, ACC: 0.8144, SEN: 0.7250, SPE: 0.8185, PPV: 0.1559, NPV: 0.9847), and an improvement of 5.88 percentage points in AUC and 22.5 percentage points in SEN compared to GCN+FL+SMOTE (AUC: 0.7510, ACC: 0.8431, SEN: 0.6500, SPE: 0.8520, PPV: 0.1688, NPV: 0.9814). It also showed an improvement of 3.86 percentage points in AUC and 15 percentage points in SEN compared to the baseline GCN+BCE (AUC: 0.7712, ACC: 0.8133, SEN: 0.7250, SPE: 0.8173, PPV: 0.1551, NPV: 0.9847), and an improvement of 14.33 percentage points in AUC and 27.5 percentage points in SEN compared to GCN+BCE+SMOTE (AUC: 0.6665, ACC: 0.7271, SEN: 0.6000, SPE: 0.7329, PPV: 0.0941, NPV: 0.9754).
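The reported improvements match the absolute differences between MedMGF's metric values and each baseline's, expressed on a 0 to 100 scale, rather than relative changes. The short check below recomputes them from the figures quoted above; the dictionary layout and the function name `gain_in_points` are illustrative choices, not part of the paper.

```python
# AUC and SEN values copied from the results reported above.
reported = {
    "MedMGF": {"AUC": 0.8098, "SEN": 0.8750},
    "GCN+FL": {"AUC": 0.7717, "SEN": 0.7250},
    "GCN+FL+SMOTE": {"AUC": 0.7510, "SEN": 0.6500},
    "GCN+BCE": {"AUC": 0.7712, "SEN": 0.7250},
    "GCN+BCE+SMOTE": {"AUC": 0.6665, "SEN": 0.6000},
}


def gain_in_points(baseline, metric):
    """Absolute gain of MedMGF over a baseline, in percentage points."""
    diff = reported["MedMGF"][metric] - reported[baseline][metric]
    return round(diff * 100, 2)


if __name__ == "__main__":
    for name in reported:
        if name != "MedMGF":
            print(name, gain_in_points(name, "AUC"), gain_in_points(name, "SEN"))
```

Read as relative changes, the gains would be different numbers; for example, the AUC gain over GCN+FL would be 0.0381 / 0.7717, roughly 4.9%, not 3.81%.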

CONCLUSION

Compared with all baseline models, MedMGF achieved the highest SEN and AUC, demonstrating its potential for healthcare applications.
