Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta.

Affiliations

Department of Biotechnology, Motilal Nehru National Institute of Technology Allahabad, Prayagraj, 211004, India.

National Institute of Animal Biotechnology, Hyderabad, 500032, India.

Publication information

Sci Rep. 2023 Apr 19;13(1):6413. doi: 10.1038/s41598-023-33327-4.

DOI:10.1038/s41598-023-33327-4
PMID:37076536
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10115869/
Abstract

Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to identify significant differentially expressed genes (DEGs). These genes were then passed through Boruta feature selection, and the confirmed important features (genes) were used to develop a machine learning (ML)-based prognostic classification model. The selected genes were further examined by survival analysis and by correlation analysis between the final genes and infiltrating immunocytes. In total, 770 CRC samples were included: 78 normal and 692 tumor tissue samples. DESeq2 analysis together with the topconfects R package identified 170 significant DEGs. A random forest (RF) prognostic classification model built on the 33 confirmed features achieved accuracy, precision, recall, and F1-score of 100% with 0% standard deviation. Overall survival analysis singled out GLP2R and VSTM2A, both significantly downregulated in tumor samples and strongly correlated with immunocyte infiltration. The involvement of these genes in CRC prognosis was further supported by their biological function and by literature analysis. The current findings indicate that GLP2R and VSTM2A may play a significant role in CRC progression and immune response suppression.
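
The abstract names Boruta as the feature selection step but does not describe how it works. The sketch below illustrates the general idea on toy data and is not the authors' pipeline. Boruta compares each real feature against "shadow" copies whose values have been shuffled, and confirms only features that repeatedly outscore the best shadow. Several things here are assumptions made for illustration: the real algorithm uses random forest importance scores and a statistical test on the hit counts, whereas this sketch swaps in a simple class-mean gap and a fixed hit-rate threshold. The names `mean_gap`, `boruta_like_selection`, `make_toy_data`, `gene_signal`, and `gene_noise_*` are all hypothetical.

```python
import random
import statistics


def mean_gap(values, labels):
    """Simplified importance score: absolute difference between class means.

    Assumption: real Boruta uses random forest importances instead.
    """
    tumor = [v for v, y in zip(values, labels) if y == 1]
    normal = [v for v, y in zip(values, labels) if y == 0]
    return abs(statistics.fmean(tumor) - statistics.fmean(normal))


def boruta_like_selection(features, labels, n_iter=50, min_hit_rate=0.9, seed=0):
    """Return names of features that beat the best shadow feature often enough.

    features: dict mapping a feature name to its per-sample values.
    labels:   list of 0 (normal) / 1 (tumor), aligned with the values.
    """
    rng = random.Random(seed)
    # With this deterministic score, real-feature importances never change,
    # so they are computed once; only the shadows vary between iterations.
    real_scores = {name: mean_gap(vals, labels) for name, vals in features.items()}
    hits = {name: 0 for name in features}

    for _ in range(n_iter):
        # Shadow features: shuffled copies that keep each feature's value
        # distribution but break any link to the labels.
        shadow_scores = []
        for vals in features.values():
            shadow = vals[:]
            rng.shuffle(shadow)
            shadow_scores.append(mean_gap(shadow, labels))
        best_shadow = max(shadow_scores)

        # A real feature scores a "hit" when it beats the best shadow.
        for name, score in real_scores.items():
            if score > best_shadow:
                hits[name] += 1

    # Assumption: a fixed hit-rate cutoff stands in for Boruta's statistical test.
    return [name for name in features if hits[name] / n_iter >= min_hit_rate]


def make_toy_data(seed=1, n_per_class=30):
    """Synthetic data: one gene lower in tumor samples, three pure-noise genes."""
    rng = random.Random(seed)
    labels = [0] * n_per_class + [1] * n_per_class
    features = {
        "gene_signal": [rng.gauss(5.0 if y == 0 else 0.0, 1.0) for y in labels],
    }
    for i in range(1, 4):
        features[f"gene_noise_{i}"] = [rng.gauss(0.0, 1.0) for _ in labels]
    return features, labels


features, labels = make_toy_data()
confirmed = boruta_like_selection(features, labels)
print("Confirmed features:", confirmed)
```

In the study, the analogous step reduced the 170 significant DEGs to 33 confirmed features, which were then used to train the RF classifier. In this toy setup only the synthetic signal gene is expected to be confirmed, because the noise genes rarely outscore the best shadow.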

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/a76fc564a37f/41598_2023_33327_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/89aa0cfdeae2/41598_2023_33327_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/7e2000fc2947/41598_2023_33327_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/48eacc141d82/41598_2023_33327_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/6d942b7e2a44/41598_2023_33327_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/ba0fb9b39c04/41598_2023_33327_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/508dbe997161/41598_2023_33327_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/ba3d60a24284/41598_2023_33327_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/efcf/10115869/176ccf3cef4b/41598_2023_33327_Fig9_HTML.jpg
