The transcriptional risk scores for kidney renal clear cell carcinoma using XGBoost and multiple omics data.

Affiliations

School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China.

Physical Department of Science and Technology, Dalian University, Dalian 116622, China.

Publication information

Math Biosci Eng. 2023 May 8;20(7):11676-11687. doi: 10.3934/mbe.2023519.

Abstract

Most kidney cancers are kidney renal clear cell carcinoma (KIRC), a major cause of cancer-related death. The polygenic risk score (PRS) is a weighted linear combination of phenotype-related alleles across the genome and can be used to assess KIRC risk. However, using standalone SNP data as the input to a PRS model may not give satisfactory results. Therefore, transcriptional risk scores (TRS) based on multi-omics data and machine learning models were proposed to assess the risk of KIRC. First, we collected four types of omics data (DNA methylation, miRNA, mRNA and lncRNA) for KIRC patients from the TCGA database. Subsequently, we developed a novel TRS method that uses these multi-omics data and an XGBoost model. Finally, we performed prevalence analysis and prognosis prediction to evaluate the utility of the TRS generated by our method. Our TRS method exhibited better predictive performance than linear models and other machine learning models. Furthermore, the prediction accuracy of the combined TRS model was higher than that of the single-omics TRS models. Kaplan-Meier (KM) curves showed that TRS was a valid prognostic indicator for cancer staging. Our proposed method extends the current definition of TRS from standalone SNP data to multi-omics data and was superior to linear models and other machine learning models, which may make it a useful tool for diagnostic and prognostic prediction of KIRC.
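
To make the "weighted linear combination" in the PRS definition concrete, here is a minimal Python sketch. It illustrates only the linear PRS baseline described in the abstract, not the authors' XGBoost-based TRS method. The function name `polygenic_risk_score`, the effect-size weights, and the allele dosages are all hypothetical values chosen for illustration.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted linear combination sum_i(weight_i * dosage_i).

    dosages: per-variant risk-allele counts for one individual (0, 1, or 2).
    weights: per-variant effect sizes (hypothetical here; in practice these
             would come from an association study).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical example: three variants with made-up effect sizes.
weights = [0.5, -0.25, 1.0]  # illustrative per-allele effect sizes
dosages = [2, 1, 0]          # risk-allele counts for one individual
score = polygenic_risk_score(dosages, weights)
print(score)
```

Because the score is a fixed weighted sum, it cannot capture interactions between features. That limitation is the motivation the abstract gives for replacing the linear combination with a tree-based model such as XGBoost trained on multi-omics features.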
