Department of Infectious Diseases, Hunan Key Laboratory of Viral Hepatitis, Xiangya Hospital, Central South University, Changsha, 410013, China.
Department of Gastroenterology, The Second Xiangya Hospital of Central South University, Changsha, 410008, China.
Sci Rep. 2022 Jul 5;12(1):11340. doi: 10.1038/s41598-022-15609-5.
In countries with a high incidence of tuberculosis, the typical clinical features of Crohn's disease (CD) may be masked by tuberculosis infection, and distinguishing atypical CD from intestinal tuberculosis (ITB) remains a dilemma for clinicians. Least absolute shrinkage and selection operator (LASSO) regression has been applied to variable selection in disease diagnosis; however, its value in discriminating ITB from atypical CD remains unknown. A total of 400 patients were enrolled from January 2014 to January 2019 at the Second Xiangya Hospital of Central South University. For these patients, 57 indicators covering clinical manifestations, laboratory results, endoscopic findings, and computed tomography enterography features were collected for further analysis. R software version 3.6.1 (glmnet package) was used to perform the LASSO logistic regression analysis, and SPSS 20.0 was used to perform the Pearson chi-square test and binary logistic regression analysis. First, in the variable selection step, LASSO regression and the Pearson chi-square test were applied to select the most valuable variables as candidates for further logistic regression analysis. Second, the variables identified in the first step were used to construct binary logistic regression models. Receiver operating characteristic (ROC) curve analysis was performed on these models to assess their diagnostic ability and the optimal cutoff value. The area under the ROC curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy rate, together with their 95% confidence intervals (CIs), were calculated. MedCalc software (version 16.8) was used to analyze the ROC curves of the models. Ultimately, 332 patients were included to build a binary logistic regression model to discriminate CD (including comprehensive CD and tuberculosis-infected CD) from ITB.
However, applying the binary logistic regression model built on comprehensive CD and ITB to predict tuberculosis-infected CD and ITB did not yield a satisfactory diagnostic value (accuracy rate: 79.2% vs. 65.1%). Therefore, we further established binary logistic regression models to discriminate atypical CD from ITB, with variables selected by the Pearson chi-square test (model 1) and by LASSO regression (model 2). Model 1 showed 89.9% specificity, 65.9% sensitivity, 88.5% PPV, 68.9% NPV, 76.9% diagnostic accuracy, and an AUC of 0.811; model 2 showed 80.6% specificity, 84.4% sensitivity, 82.3% PPV, 82.9% NPV, 82.6% diagnostic accuracy, and an AUC of 0.887. The AUCs of model 1 and model 2 differed significantly (P < 0.05). Tuberculosis infection increases the difficulty of discriminating CD from ITB. In the differential diagnosis of atypical CD and ITB, logistic regression based on LASSO variable selection performed better than logistic regression based on the Pearson chi-square test.