Precision medicine and machine learning towards the prediction of the outcome of potential celiac disease.

Authors

Piccialli Francesco, Calabrò Francesco, Crisci Danilo, Cuomo Salvatore, Prezioso Edoardo, Mandile Roberta, Troncone Riccardo, Greco Luigi, Auricchio Renata

Affiliations

Department of Mathematics and Applications "Renato Caccioppoli", University of Naples "Federico II", Via Cintia, Monte S. Angelo, 80126, Naples, Italy.

Department of Translational Medical Sciences, University of Naples "Federico II", Naples, Italy.

Publication information

Sci Rep. 2021 Mar 11;11(1):5683. doi: 10.1038/s41598-021-84951-x.

Abstract

Patients with potential celiac disease (PCD) carry the genetic predisposition to celiac disease (CD) and produce significant amounts of anti-human transglutaminase antibodies, but show no morphological changes in the small bowel mucosa. A minority of patients (17%) show clinical symptoms and need a gluten-free diet at the time of diagnosis, while the majority go on for several years (up to a decade) without clinical problems or progression of small intestinal mucosal damage, even when they continue to consume gluten. We recently developed a traditional multivariate approach, a discriminant analysis model, to predict the natural history on the basis of information collected at enrolment (time 0). However, the traditional multivariate model requires stringent assumptions that may not be met in the clinical setting. Starting from a follow-up dataset available for PCD, we propose applying machine learning (ML) methodologies to extend the analysis of the available clinical data and to detect the most influential features predicting the outcome. These features, collected at the time of diagnosis, should be able to distinguish patients who will develop duodenal atrophy from those who will remain potential. Four ML methods were adopted to select features predictive of the outcome; the feature selection procedure reduced the overall number of features from 85 to 19. ML methodologies (Random Forests, Extremely Randomized Trees, Boosted Trees, and Logistic Regression) were adopted, all reporting an accuracy above 75%. Specificity was also always above 75%, with two of the methods exceeding 98%, while the best sensitivity was 60%. The best model, optimized Boosted Trees, classified PCD patients from the selected 19 features with an accuracy of 0.80, a sensitivity of 0.58 and a specificity of 0.84.
Finally, with this work, we are able to use ML to identify the PCD patients who are more likely to develop overt CD. ML techniques appear to be an innovative approach to predicting the outcome of PCD, since they represent a step towards precision medicine, which aims to customize healthcare, medical therapies, decisions, and practices, tailoring the clinical management of children with PCD.
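
The abstract reports accuracy, sensitivity and specificity side by side, and the gap between them is easier to read with the definitions written out. The following is a minimal sketch, not the authors' code: the label encoding (1 = develops duodenal atrophy, 0 = remains potential), the helper names and the toy labels are all assumptions made for illustration, and the toy data is not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # progressors correctly flagged
    fn: int  # progressors missed
    tn: int  # non-progressors correctly cleared
    fp: int  # non-progressors wrongly flagged


def confusion_counts(y_true, y_pred):
    """Count the four outcomes of a binary classifier (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return ConfusionCounts(tp, fn, tn, fp)


def accuracy(c):
    """Fraction of all patients classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def sensitivity(c):
    """Fraction of true progressors that the model detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Fraction of true non-progressors that the model clears."""
    return c.tn / (c.tn + c.fp)


# Hypothetical toy labels: 3 progressors and 7 non-progressors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(f"accuracy={accuracy(counts):.2f}")
print(f"sensitivity={sensitivity(counts):.2f}")
print(f"specificity={specificity(counts):.2f}")
```

Because progressors are the minority class in this toy example, accuracy tracks specificity closely while sensitivity can lag behind, which is the same pattern the reported figures show.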

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ad1/7952550/3fa5d55a0348/41598_2021_84951_Fig1_HTML.jpg
