Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, Hong Kong.
Department of Medicine, University of Toronto, Toronto, Canada.
BMC Med Res Methodol. 2020 Feb 24;20(1):35. doi: 10.1186/s12874-020-00921-3.
Validated algorithms to classify type 1 and 2 diabetes (T1D, T2D) are mostly limited to white pediatric populations. We conducted a large study in Hong Kong among children and adults with diabetes to develop and validate algorithms using electronic health records (EHRs) to classify diabetes type against clinical assessment as the reference standard, and to evaluate performance by age at diagnosis.
We included all people with diabetes (age at diagnosis 1.5-100 years during 2002-15) in the Hong Kong Diabetes Register and randomized them to derivation and validation cohorts. We developed candidate algorithms to identify diabetes types using encounter codes, prescriptions, and combinations of these criteria ("combination algorithms"). We identified 3 algorithms with the highest sensitivity, positive predictive value (PPV), and kappa coefficient, and evaluated performance by age at diagnosis in the validation cohort.
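The three selection metrics named above can all be derived from a 2x2 table that compares an algorithm's output with the clinical reference standard. The following is a minimal sketch, not the authors' code: the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, PPV and Cohen's kappa for a binary classifier.

    tp/fp/fn/tn are counts from a 2x2 table of algorithm result
    versus reference standard (hypothetical inputs for illustration).
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # true cases the algorithm detects
    ppv = tp / (tp + fp)           # flagged cases that are true cases

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    return {"sensitivity": sensitivity, "ppv": ppv, "kappa": kappa}


# Made-up counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=20, fp=5, fn=10, tn=65)
```

With a rare outcome such as T1D in this register, PPV is sensitive to even a small number of false positives, which is one reason to report it alongside sensitivity and kappa.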
The derivation cohort comprised 10,196 people (T1D n = 60, T2D n = 10,136) and the validation cohort 5,101 people (T1D n = 43, T2D n = 5,058). Mean age at diagnosis was 22.7 years for T1D and 55.9 years for T2D; 53.3% and 43.9% were female, respectively. Algorithms using codes or prescriptions classified T1D well for age at diagnosis < 20 years, but sensitivity and PPV dropped for older ages at diagnosis. Combination algorithms maximized sensitivity or PPV, but not both. The "high sensitivity for type 1" algorithm (ratio of type 1 to type 2 codes ≥ 4, or at least 1 insulin prescription within 90 days) had a sensitivity of 95.3% (95% confidence interval 84.2-99.4%) and a PPV of 12.8% (9.3-16.9%). The "high PPV for type 1" algorithm (ratio of type 1 to type 2 codes ≥ 4, and multiple daily injections with no other glucose-lowering medication prescription) had a PPV of 100.0% (79.4-100.0%) and a sensitivity of 37.2% (23.0-53.3%). The "optimized" algorithm (ratio of type 1 to type 2 codes ≥ 4, and at least 1 insulin prescription within 90 days) had a sensitivity of 65.1% (49.1-79.0%) and a PPV of 75.7% (58.8-88.2%) across all ages. Accuracy of T2D classification was high for all algorithms.
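The three combination rules described above can be written as simple boolean predicates. This is an illustrative sketch only: the `PatientRecord` fields and function names are hypothetical, the abstract does not say how the per-patient inputs were derived from the EHR, and treating a patient with no type 2 codes as meeting the ratio threshold (provided at least one type 1 code exists) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Hypothetical per-patient summary; field names are not from the study.
    type1_codes: int                  # number of type 1 diabetes encounter codes
    type2_codes: int                  # number of type 2 diabetes encounter codes
    insulin_within_90_days: bool      # at least 1 insulin prescription within 90 days
    multiple_daily_injections: bool   # on a multiple daily injection regimen
    other_glucose_lowering_rx: bool   # any other glucose-lowering prescription


def code_ratio_met(r: PatientRecord, threshold: float = 4.0) -> bool:
    """Ratio of type 1 to type 2 codes >= threshold.

    Written as a multiplication to avoid dividing by zero.
    Assumption: zero type 2 codes with >= 1 type 1 code counts as met.
    """
    if r.type1_codes == 0:
        return False
    return r.type1_codes >= threshold * r.type2_codes


def high_sensitivity_t1d(r: PatientRecord) -> bool:
    # Code ratio OR early insulin: casts a wide net.
    return code_ratio_met(r) or r.insulin_within_90_days


def high_ppv_t1d(r: PatientRecord) -> bool:
    # Code ratio AND insulin-only multiple daily injections: strict.
    return (code_ratio_met(r)
            and r.multiple_daily_injections
            and not r.other_glucose_lowering_rx)


def optimized_t1d(r: PatientRecord) -> bool:
    # Code ratio AND early insulin: a middle ground.
    return code_ratio_met(r) and r.insulin_within_90_days
```

The "or" in the first rule and the "and" in the other two are what drive the trade-off reported above: loosening the criteria raises sensitivity at the cost of PPV, and tightening them does the reverse.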
Our validated set of algorithms accurately classifies T1D and T2D using EHRs for Hong Kong residents enrolled in a diabetes register. The choice of algorithm should be tailored to the unique requirements of each study question.