Med-MGF: multi-level graph-based framework for handling medical data imbalance and representation.

Affiliations

Department of Industrial Engineering and Management, National University of Singapore, Singapore, 117576, Singapore.

Children's Emergency, KK Women's and Children's Hospital, Singapore, 229899, Singapore.

Publication information

BMC Med Inform Decis Mak. 2024 Sep 2;24(1):242. doi: 10.1186/s12911-024-02649-2.

Abstract

BACKGROUND

Modeling patient data, particularly electronic health records (EHR), is one of the major focuses of machine learning studies in healthcare, as these records provide clinicians with valuable information that can potentially assist them in disease diagnosis and decision-making.

METHODS

In this study, we present a multi-level graph-based framework called MedMGF, which models both patient medical profiles extracted from EHR data and the relationship network among those profiles in a single architecture. The medical profiles consist of several layers of data embeddings derived from interval records obtained during hospitalization, and the patient-patient network is created by measuring the similarities between these profiles. We also propose a modification to the Focal Loss (FL) function to improve classification performance on imbalanced datasets without the need to impute the data. MedMGF's performance was evaluated against several Graph Convolutional Network (GCN) baseline models implemented with Binary Cross Entropy (BCE), FL, a class balancing parameter, and the Synthetic Minority Oversampling Technique (SMOTE).
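The abstract does not say how the Focal Loss is modified, so the following is only a minimal Python sketch of the standard alpha-balanced binary focal loss that such a modification would start from, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are illustrative assumptions, not the authors' implementation or settings.

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean alpha-balanced binary focal loss (standard form, not the paper's variant).

    y_true: iterable of 0/1 labels.
    p_pred: iterable of predicted probabilities for the positive class.
    gamma:  focusing parameter; larger values down-weight easy examples more.
    alpha:  class balancing weight applied to the positive class
            (negatives receive 1 - alpha).
    """
    total = 0.0
    count = 0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)             # clip to avoid log(0)
        p_t = p if y == 1 else 1.0 - p              # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha  # per-class weight
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
        count += 1
    return total / count


# Hypothetical labels and scores, for illustration only.
labels = [1, 0, 0, 0]
scores = [0.30, 0.10, 0.20, 0.05]
loss = binary_focal_loss(labels, scores)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary binary cross entropy, which is one way to see how FL relates to the BCE baselines mentioned above.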

RESULTS

Our proposed framework achieved high classification performance (AUC: 0.8098, ACC: 0.7503, SEN: 0.8750, SPE: 0.7445, NPV: 0.9923, PPV: 0.1367) on an extremely imbalanced pediatric sepsis dataset (n=3,014, imbalance ratio of 0.047). It yielded an improvement of 3.81 percentage points in AUC and 15 percentage points in SEN compared to the baseline GCN+FL (AUC: 0.7717, ACC: 0.8144, SEN: 0.7250, SPE: 0.8185, PPV: 0.1559, NPV: 0.9847), and an improvement of 5.88 percentage points in AUC and 22.5 percentage points in SEN compared to GCN+FL+SMOTE (AUC: 0.7510, ACC: 0.8431, SEN: 0.6500, SPE: 0.8520, PPV: 0.1688, NPV: 0.9814). It also showed an improvement of 3.86 percentage points in AUC and 15 percentage points in SEN compared to the baseline GCN+BCE (AUC: 0.7712, ACC: 0.8133, SEN: 0.7250, SPE: 0.8173, PPV: 0.1551, NPV: 0.9847), and an improvement of 14.33 percentage points in AUC and 27.5 percentage points in SEN compared to GCN+BCE+SMOTE (AUC: 0.6665, ACC: 0.7271, SEN: 0.6000, SPE: 0.7329, PPV: 0.0941, NPV: 0.9754).
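The improvement figures quoted above are absolute differences between MedMGF's scores and each baseline's scores, expressed in percentage points rather than as relative changes. A short Python check, using only the AUC and SEN values reported in this abstract (the helper name `point_gains` is hypothetical), reproduces them:

```python
# AUC and SEN values copied from the RESULTS paragraph.
medmgf = {"AUC": 0.8098, "SEN": 0.8750}
baselines = {
    "GCN+FL": {"AUC": 0.7717, "SEN": 0.7250},
    "GCN+FL+SMOTE": {"AUC": 0.7510, "SEN": 0.6500},
    "GCN+BCE": {"AUC": 0.7712, "SEN": 0.7250},
    "GCN+BCE+SMOTE": {"AUC": 0.6665, "SEN": 0.6000},
}


def point_gains(model, baseline):
    """Absolute gain of model over baseline, in percentage points."""
    return {m: round((model[m] - baseline[m]) * 100, 2) for m in model}


gains = {name: point_gains(medmgf, scores) for name, scores in baselines.items()}
for name, g in gains.items():
    print(f"{name}: AUC +{g['AUC']:.2f} points, SEN +{g['SEN']:.2f} points")
```

Read this way, the gain over GCN+FL is 0.8098 - 0.7717 = 0.0381 in AUC, that is 3.81 points, and 0.8750 - 0.7250 = 0.15 in SEN, that is 15 points.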

CONCLUSION

Compared with all baseline models, MedMGF achieved the highest SEN and AUC, demonstrating its potential for healthcare applications.
