A Prediction Model for Uncontrolled Type 2 Diabetes Mellitus Incorporating Area-level Social Determinants of Health.

Affiliations

Research and Analytics, Collective Health, San Francisco, CA.

Center for Primary Care, Harvard Medical School, Boston, MA.

Publication information

Med Care. 2019 Aug;57(8):592-600. doi: 10.1097/MLR.0000000000001147.

DOI:10.1097/MLR.0000000000001147

PMID:31268954

Abstract

BACKGROUND

Social determinants of health (SDH) at the area level are understood to influence the likelihood of having poor glycemic control for patients with type 2 diabetes mellitus (T2DM).

OBJECTIVES

To develop a model for predicting whether a person with T2DM has uncontrolled diabetes (hemoglobin A1c ≥9%), incorporating individual and area-level (census tract) covariates.

RESEARCH DESIGN

Development and validation of machine learning models.

SUBJECTS

A total of 1,015,808 privately insured persons with T2DM, identified in claims data.

MEASURES

C-statistic, sensitivity, specificity, positive predictive value, negative predictive value, and accuracy.
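
For readers less familiar with these measures, the following is a minimal sketch of how they are conventionally computed for a binary classifier. It is not the authors' code: the function names, the 0.5 classification threshold, and the toy labels and scores are all assumptions made for illustration. The C-statistic is computed here as the proportion of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
from typing import Dict, Sequence


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Threshold-based metrics from confusion-matrix counts.

    Assumes every denominator is nonzero (at least one case in each
    actual class and each predicted class).
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positives among actual positives
        "specificity": tn / (tn + fp),  # true negatives among actual negatives
        "ppv": tp / (tp + fp),          # true positives among predicted positives
        "npv": tn / (tn + fn),          # true negatives among predicted negatives
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def c_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only; not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 0, 0]   # 1 = uncontrolled (HbA1c >= 9%), 0 = controlled
scores = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.1, 0.1]
predicted = [int(s >= 0.5) for s in scores]  # hypothetical 0.5 threshold

tp = sum(p == 1 and y == 1 for p, y in zip(predicted, labels))
fp = sum(p == 1 and y == 0 for p, y in zip(predicted, labels))
tn = sum(p == 0 and y == 0 for p, y in zip(predicted, labels))
fn = sum(p == 0 and y == 1 for p, y in zip(predicted, labels))

metrics = confusion_metrics(tp, fp, tn, fn)
auc = c_statistic(scores, labels)
```

The pairwise C-statistic loop is quadratic in the number of cases, which is fine for a small illustration but would need a rank-based formulation for a cohort of the size described here.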

RESULTS

A standard logistic regression model selecting among the available individual-level covariates and area-level SDH covariates (at the census tract level) performed poorly, with a C-statistic of 0.685, sensitivity of 25.6%, specificity of 90.1%, positive predictive value of 56.9%, negative predictive value of 70.4%, and accuracy of 68.4% on a 25% held-out validation subset of the data. By contrast, machine learning models improved upon risk prediction, with the highest performance from a random forest algorithm with a C-statistic of 0.928, sensitivity of 68.5%, specificity of 94.6%, positive predictive value of 69.8%, negative predictive value of 94.3%, and accuracy of 90.6%. SDH variables alone explained 16.9% of variation in uncontrolled diabetes.
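
The 25% held-out validation subset mentioned above can be illustrated with a simple random split. This is a sketch under stated assumptions: the abstract does not say how the subset was drawn (for example, whether it was stratified by outcome), so the simple random shuffle, the `holdout_split` name, and the seed are hypothetical choices.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def holdout_split(
    records: Sequence[T],
    validation_fraction: float = 0.25,
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Randomly partition records into (training, validation) lists.

    Assumption: a simple random split; the source does not describe
    the actual procedure.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical record identifiers standing in for patients.
train, validation = holdout_split(range(1000))
```

Models are fit on `train` only, and the reported metrics are then computed on `validation`, so that performance reflects data the model has not seen.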

CONCLUSIONS

A predictive model developed through a machine learning approach may assist health care organizations to identify which area-level SDH data to monitor for prediction of diabetes control, for potential use in risk-adjustment and targeting.
