Identifying diagnostic indicators for type 2 diabetes mellitus from physical examination using interpretable machine learning approach.

Affiliations

College of Chemistry, Sichuan University, Chengdu, China.

Basic Medical College, Southwest Medical University, Luzhou, China.

Publication information

Front Endocrinol (Lausanne). 2024 Mar 18;15:1376220. doi: 10.3389/fendo.2024.1376220. eCollection 2024.

Abstract

BACKGROUND

Identification of patients at risk for type 2 diabetes mellitus (T2DM) can not only prevent complications and reduce suffering but also ease the health care burden. While routine physical examination can provide useful information for diagnosis, manual exploration of routine physical examination records is not feasible due to the high prevalence of T2DM.

OBJECTIVES

We aim to build interpretable machine learning models for T2DM diagnosis and uncover important diagnostic indicators from physical examination, including age- and sex-related indicators.

METHODS

In this study, we present three weighted diversity density (WDD)-based algorithms for T2DM screening that use physical examination indicators. The algorithms are highly transparent and interpretable, and two of them tolerate missing values.

PATIENTS

The dataset comprised 43 physical examination indicators collected from 11,071 patients with T2DM and 126,622 healthy controls at the Affiliated Hospital of Southwest Medical University. After data processing, a data matrix of 16,004 electronic health records (EHRs) and 43 clinical indicators was used for modelling.

RESULTS

The indicators were ranked according to their model weights, and the top 25% of indicators were found to be directly or indirectly related to T2DM. We further investigated the clinical characteristics of different age and sex groups, and found that the algorithms can detect relevant indicators specific to these groups. The algorithms performed well in T2DM screening, with the highest area under the receiver operating characteristic curve (AUC) reaching 0.9185.
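As a rough illustration of the two quantities reported above, the sketch below computes an AUC from labels and scores and picks the top 25% of indicators by weight. It is not the WDD implementation. The function names `auc_score` and `top_fraction` are hypothetical, the AUC uses the generic pairwise-ranking definition, and ranking by raw weight in descending order is an assumption about how the ranking might be done.

```python
from typing import Dict, List, Sequence


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_fraction(weights: Dict[str, float], fraction: float = 0.25) -> List[str]:
    """Return indicator names in the top `fraction` by weight (assumed:
    larger weight means more important), keeping at least one."""
    ranked = sorted(weights, key=lambda name: weights[name], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k]
```

The pairwise loop is quadratic in the number of cases, which is fine for a small example; for a dataset the size of the one described here, a rank-based implementation would be a more practical choice.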

CONCLUSION

This work utilized the interpretable WDD-based algorithms to construct T2DM diagnostic models based on physical examination indicators. By modeling data grouped by age and sex, we identified several predictive markers related to age and sex, uncovering characteristic differences among various groups of T2DM patients.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7e7/10982324/adf6ba9eda62/fendo-15-1376220-g001.jpg
