Department of Automation, Tsinghua University, Beijing, 100084, China.
Department of Gastroenterology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China.
BMC Med Inform Decis Mak. 2023 Aug 15;23(1):160. doi: 10.1186/s12911-023-02257-6.
Differentiating between Crohn's disease (CD) and intestinal tuberculosis (ITB) with endoscopy is challenging. We aim to differentiate CD from ITB more accurately on endoscopy by building a trustworthy AI differential diagnosis application.
This study used the electronic health records (EHRs) of 1271 patients who had undergone colonoscopy at Peking Union Medical College Hospital (PUMCH) and were clinically diagnosed with CD (n = 875) or ITB (n = 396). We built a workflow that makes diagnoses from EHRs and mines differential diagnosis features. It involves fine-tuning pretrained language models, distilling them into a light and efficient TextCNN model, interpreting the neural network and selecting differential attribution features, and then manually checking the features and carrying out debiased training.
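The distillation step can be illustrated with a minimal sketch. The abstract does not state which distillation objective was used, so the soft-target loss below (a temperature-softened KL term mixed with cross-entropy on the hard label) is an assumption, and the names `distillation_loss`, `temperature` and `alpha` are hypothetical. Class 0 stands for CD and class 1 for ITB.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed objective: alpha * CE(hard label) + (1 - alpha) * T^2 * KL.

    student_logits: outputs of the small model (e.g. a TextCNN)
    teacher_logits: outputs of the fine-tuned language model
    label:          index of the true class (0 = CD, 1 = ITB here)
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Ordinary cross-entropy of the student against the hard label.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl


if __name__ == "__main__":
    # Made-up logits for one report; not data from the study.
    print(distillation_loss([1.0, 0.2], [2.5, -0.5], label=0))
```

Scaling the KL term by the squared temperature keeps its gradient magnitude comparable to the hard-label term as the temperature changes, which is the usual reason for that factor.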
The debiased TextCNN achieved an accuracy of 0.83 on differential diagnosis between CD and ITB (CD F1: 0.87, ITB F1: 0.77), the best among the models compared. On the noisy validation set, its accuracy was 0.70 (CD F1: 0.87, ITB F1: 0.69), significantly higher than that of models without debiasing. We also found that the debiased model mines diagnostically significant features more easily. The debiased TextCNN unearthed 39 diagnostic features in the form of phrases, 17 of which are key diagnostic features recognized by the guidelines.
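For readers unfamiliar with the reported metrics, the sketch below shows how accuracy and a per-class F1 score are computed from predicted and true labels. The function names and the toy labels are hypothetical and are not data from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_f1(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels for five reports, made up for illustration.
    y_true = ["CD", "CD", "CD", "ITB", "ITB"]
    y_pred = ["CD", "CD", "ITB", "ITB", "CD"]
    print(accuracy(y_true, y_pred))
    print(per_class_f1(y_true, y_pred, "CD"))
    print(per_class_f1(y_true, y_pred, "ITB"))
```

Reporting F1 per class matters here because the cohort is imbalanced (875 CD versus 396 ITB), so accuracy alone can hide weak performance on the smaller class.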
We built a trustworthy AI differential diagnosis application for differentiating between CD and ITB, focusing on accuracy, interpretability and robustness. The classifiers perform well, and the statistically significant features agree with clinical guidelines.