A machine learning tool for early identification of celiac disease autoimmunity.

Author information

Dreyfuss Michael, Getz Benjamin, Lebwohl Benjamin, Ramni Or, Underberger Daniel, Ber Tahel Ilan, Steinberg-Koch Shlomit, Jenudi Yonatan, Gazit Sivan, Patalon Tal, Chodick Gabriel, Shoenfeld Yehuda, Ben-Tov Amir

Affiliations

Predicta Med Analytics Ltd., Ramat Gan, Israel.

Celiac Disease Center, Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA.

Publication information

Sci Rep. 2024 Dec 28;14(1):30760. doi: 10.1038/s41598-024-80817-0.

Abstract

Identifying which patients should undergo serologic screening for celiac disease (CD) may help diagnose patients who otherwise often experience diagnostic delays or remain undiagnosed. Using anonymized outpatient data from the electronic medical records of Maccabi Healthcare Services, we developed and evaluated five machine learning models to classify patients as at-risk for CD autoimmunity prior to first documented diagnosis or positive serum tissue transglutaminase (tTG-IgA). A training set of highly seropositive (tTG-IgA > 10X ULN) cases (n = 677) with likely CD and controls (n = 176,293) with no evidence of CD autoimmunity was used for model development. Input features included demographic information and commonly available laboratory results. The models were then evaluated for discriminative ability, as measured by AUC, on a distinct set of highly seropositive cases (n = 153) and controls (n = 41,087). The highest performing model was XGBoost (AUC = 0.86), followed by logistic regression (AUC = 0.85), random forest (AUC = 0.83), multilayer perceptron (AUC = 0.80) and decision tree (AUC = 0.77). Contributing features for the XGBoost model for classifying a patient as at-risk for undiagnosed CD autoimmunity included signs of anemia, transaminitis and decreased high-density lipoprotein. This model's ability to distinguish cases of incident CD autoimmunity from controls shows promise as a potential clinical tool to identify patients in the community who are at increased risk of undiagnosed celiac disease and should be offered serologic screening.
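The abstract compares models by AUC on a test set in which cases make up well under 1% of patients. As a reading aid, the sketch below shows one common way to compute AUC: the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. This is not the authors' code; the function names `roc_auc` and `case_fraction` and the toy scores are hypothetical, and only the cohort sizes are taken from the abstract.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the rank-sum (Mann-Whitney U) formulation.

    labels: 1 for a case, 0 for a control.
    scores: model risk scores, higher meaning more likely a case.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n

    # Assign 1-based ranks, giving tied scores their average rank.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    n_cases = sum(1 for y in labels if y == 1)
    n_controls = n - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")

    # U counts the (case, control) pairs in which the case scores higher.
    rank_sum_cases = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_statistic = rank_sum_cases - n_cases * (n_cases + 1) / 2
    return u_statistic / (n_cases * n_controls)


def case_fraction(n_cases: int, n_controls: int) -> float:
    """Share of cases in a cohort, to show how imbalanced it is."""
    return n_cases / (n_cases + n_controls)


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print("toy AUC:", roc_auc(toy_labels, toy_scores))

    # Cohort sizes as reported in the abstract.
    print("train case fraction:", case_fraction(677, 176_293))
    print("test case fraction:", case_fraction(153, 41_087))
```

Because AUC depends only on how cases rank relative to controls, it is unaffected by the small share of cases in the cohort. For the same reason it says nothing, on its own, about how many flagged patients would turn out to be seropositive at a given score threshold.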

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/959e/11681168/db48c0a56c7b/41598_2024_80817_Fig1_HTML.jpg
