Integrating proteomics and explainable artificial intelligence: a comprehensive analysis of protein biomarkers for endometrial cancer diagnosis and prognosis.

Author information

Yasar Seyma, Yagin Fatma Hilal, Melekoglu Rauf, Ardigò Luca Paolo

Affiliations

Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya, Türkiye.

Department of Obstetrics and Gynecology, Faculty of Medicine, Inonu University, Malatya, Türkiye.

Publication information

Front Mol Biosci. 2024 Jun 3;11:1389325. doi: 10.3389/fmolb.2024.1389325. eCollection 2024.

Abstract

Endometrial cancer (EC), the most common gynaecological cancer and, in women, the most common cancer after breast, colorectal and lung cancer, can be diagnosed at an early stage. The first aim of this study is to classify patients by age, tumor grade, myometrial invasion and tumor size, factors that play an important role in the diagnosis and prognosis of endometrial cancer, using machine learning methods combined with explainable artificial intelligence. Proteomic data from 20 endometrial cancer patients, obtained from tumor biopsies taken from different regions of EC tissue, were used. The data were grouped according to age, tumor size, tumor grade and myometrial invasion. Three different machine learning methods were then compared, explainable artificial intelligence was applied to the model that best classified each grouping, and possible protein biomarkers for endometrial cancer prognosis were evaluated. The optimal model was XGBoost for age classification (AUC 98.8%), XGBoost for tumor grade classification (AUC 98.6%), LightGBM for myometrial invasion classification (AUC 95.1%) and XGBoost for tumor size classification (AUC 94.8%). Combining the optimal models with the SHAP approach yielded possible protein biomarkers and their expression patterns for each classification. EWRS1 protein was common to three groups (age, myometrial invasion, tumor size). These findings indicate that the models can accurately classify factors including age, tumor grade and myometrial invasion, all of which are critical for determining the prognosis of endometrial cancer, and can identify potential protein biomarkers associated with these factors. Furthermore, combining SHAP values with the optimal models allowed an analysis of how the levels of the proposed biomarker proteins varied across the classes.
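The SHAP approach named in the abstract builds on Shapley values, which split a model's prediction for one sample into per-feature contributions. The sketch below is only an illustration of that idea, not the authors' pipeline: it computes exact Shapley values by brute force for a hypothetical three-feature scoring function. The function `toy_model`, the feature values and the all-zero baseline are assumptions invented for this example; the study itself used tree-based models (XGBoost, LightGBM) on real proteomic data.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline sample.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Model output when only the coalition's features take the sample's values.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical score over three made-up protein abundances (not the study's model).
def toy_model(p):
    return 2.0 * p[0] + 1.0 * p[1] + 0.5 * p[0] * p[2]


sample = [1.0, 2.0, 4.0]    # assumed abundances for one sample
baseline = [0.0, 0.0, 0.0]  # assumed reference sample

phi = shapley_values(toy_model, sample, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets per-protein contributions be read as shares of a single classification score. The interaction term between the first and third features is split equally between them.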

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dbf/11184912/afc61af5d255/fmolb-11-1389325-g001.jpg
