

Differentiating Gliosarcoma from Glioblastoma: A Novel Approach Using PEACE and XGBoost to Deal with Datasets with Ultra-High Dimensional Confounders.

Author information

Saki Amir, Faghihi Usef, Baldé Ismaila

Affiliations

Département de Mathématiques et d'Informatique, Université du Québec à Trois-Rivières, Trois-Rivières, QC G8Z 4M3, Canada.

Département de Mathématiques et de Statistique, Faculté des Sciences, Université de Moncton, Moncton, NB E1A3E9, Canada.

Publication information

Life (Basel). 2024 Jul 16;14(7):882. doi: 10.3390/life14070882.

Abstract

In this study, we used a recently developed causal methodology, Probabilistic Easy Variational Causal Effect (PEACE), to distinguish gliosarcoma (GSM) from glioblastoma (GBM). Our approach uses a causal metric that combines PEACE with the XGBoost (eXtreme Gradient Boosting) algorithm. Unlike prior research, which often relied on statistical models to reduce dataset dimensions before causal analysis, our approach applies PEACE and XGBoost to the complete dataset. PEACE provides a comprehensive measurement of direct causal effects and applies to both continuous and discrete variables. Our method provides positive and negative versions of PEACE, together with their averages, to calculate the positive and negative causal effects of the radiomic features on the variable representing the tumor type (GSM or GBM). In our model, PEACE and its variations are equipped with a degree d, ranging from 0 to 1, that reflects the importance of the rarity and frequency of the events. By using PEACE with XGBoost, we obtained a detailed understanding of the causal relationships among the dataset features, facilitating accurate differentiation between GSM and GBM. To assess the XGBoost model, we used cross-validation and obtained a mean accuracy of 83% and an average model MSE of 0.130. This performance is notable given the high number of columns and low number of rows (code on GitHub).
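The evaluation step described above (cross-validation reporting a mean accuracy and a mean MSE) can be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' code: the data are synthetic, the fold count of 5 is arbitrary, a simple nearest-centroid classifier stands in for XGBoost so the example runs with the Python standard library alone, and the MSE is assumed to be computed on predicted probabilities. All function names are hypothetical.

```python
import math
import random

def make_toy_data(n_rows=40, n_cols=200, seed=0):
    """Synthetic wide dataset (few rows, many columns); NOT the radiomic data."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_rows):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
        for j in range(5):            # only the first 5 columns carry signal
            row[j] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

def k_fold_indices(n, k, seed=0):
    """Shuffle row indices and split them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Stand-in model (assumption): one mean vector per class."""
    cents = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == c]
        cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def predict_proba(cents, x):
    """Pseudo-probability of class 1 from the difference in centroid distances."""
    d0 = math.dist(x, cents[0])
    d1 = math.dist(x, cents[1])
    return 1.0 / (1.0 + math.exp(d1 - d0))

def cross_validate(X, y, k=5, seed=0):
    """Return (mean accuracy, mean MSE of predicted probabilities) over k folds."""
    accs, mses = [], []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        cents = fit_centroids([X[i] for i in train_idx],
                              [y[i] for i in train_idx])
        probs = [predict_proba(cents, X[i]) for i in test_idx]
        truth = [y[i] for i in test_idx]
        accs.append(sum((p >= 0.5) == bool(t)
                        for p, t in zip(probs, truth)) / len(truth))
        mses.append(sum((p - t) ** 2
                        for p, t in zip(probs, truth)) / len(truth))
    return sum(accs) / k, sum(mses) / k

X, y = make_toy_data()
mean_acc, mean_mse = cross_validate(X, y)
print(f"mean accuracy = {mean_acc:.3f}, mean MSE = {mean_mse:.3f}")
```

With a real gradient-boosting model, only `fit_centroids` and `predict_proba` would need to be swapped out; the fold loop and the averaging of per-fold metrics would stay the same.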


