Neural Injury Group, Nuffield Department of Clinical Neuroscience, John Radcliffe Hospital, University of Oxford, Level 6, West Wing, Oxford, OX3 9DU, UK.
Chronic Pain Research Group, Division of Population Health and Genomics, Mackenzie Building, Ninewells Hospital and Medical School, University of Dundee, Dundee, UK.
BMC Med Inform Decis Mak. 2022 May 29;22(1):144. doi: 10.1186/s12911-022-01890-x.
To improve the treatment of painful Diabetic Peripheral Neuropathy (DPN) and associated co-morbidities, a better understanding of the pathophysiology and risk factors for painful DPN is required. Using harmonised cohorts (N = 1230), we built models that classify painful versus painless DPN using the following predictors: quality of life (EQ5D), lifestyle (smoking, alcohol consumption), demographics (age, gender), personality and psychological traits (anxiety, depression, personality traits), biochemical variables (HbA1c) and clinical variables (BMI, hospital stay and trauma at a young age).
Random Forest, Adaptive Regression Splines and Naive Bayes machine learning models were trained to classify painful versus painless DPN. Their performance was estimated using cross-validation in large cross-sectional cohorts (N = 935) and externally validated in a large population-based cohort (N = 295). Variables were ranked for importance using model-specific metrics, and the marginal effects of predictors were aggregated and assessed at the global level. Model selection was carried out using the Matthews Correlation Coefficient (MCC), and model performance was quantified in the validation set using MCC, the area under the precision/recall curve (AUPRC) and accuracy.
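For readers unfamiliar with the selection metric, the following is a minimal sketch of how MCC and accuracy are computed from binary predictions. It is not the authors' code: the function names and the toy labels (1 = painful DPN, 0 = painless DPN) are illustrative assumptions, and the zero-denominator convention of returning 0 is a common choice rather than something stated in the abstract.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient, ranging from -1 to 1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined MCC is reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical toy labels, not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

counts = confusion_counts(y_true, y_pred)
print("tp, tn, fp, fn =", counts)
print("MCC =", round(mcc(*counts), 4))
print("accuracy =", accuracy(*counts))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced.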
Random Forest (MCC = 0.28, AUPRC = 0.76) and Adaptive Regression Splines (MCC = 0.29, AUPRC = 0.77) were the best performing models and showed the smallest reduction in performance between the training and validation datasets. The EQ5D index, the 10-item personality dimensions, HbA1c, Depression and Anxiety t-scores, age and Body Mass Index were consistently amongst the most powerful predictors in classifying painful vs painless DPN.
Machine learning models trained on large cross-sectional cohorts were able to accurately classify painful or painless DPN on an independent population-based dataset. Painful DPN is associated with more depression, anxiety and certain personality traits. It is also associated with poorer self-reported quality of life, younger age, poor glucose control and high Body Mass Index (BMI). The models showed good performance in realistic conditions, in the presence of missing values and noisy datasets. These models can be used either in the clinical context, to assist patient stratification based on the risk of painful DPN, or to return broad risk categories based on user input. The models' performance and calibration suggest that in both cases they could potentially improve diagnosis and outcomes by prompting changes to modifiable factors, such as BMI and HbA1c control, and by instituting earlier preventive or supportive measures, such as psychological interventions.