Transcriptome profiling by combined machine learning and statistical R analysis identifies TMEM236 as a potential novel diagnostic biomarker for colorectal cancer.

Affiliations

Department of Biotechnology, Motilal Nehru National Institute of Technology Allahabad, Prayagraj, 211004, India.

National Institute of Animal Biotechnology, Hyderabad, 500032, India.

Publication information

Sci Rep. 2021 Jul 12;11(1):14304. doi: 10.1038/s41598-021-92692-0.


DOI:10.1038/s41598-021-92692-0
PMID:34253750
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8275802/
Abstract

Colorectal cancer (CRC) is a common cause of cancer-related deaths worldwide. A CRC mRNA gene expression dataset from The Cancer Genome Atlas (TCGA), containing 644 CRC tumor samples and 51 normal samples with 23,186 features, was pre-processed to identify significant differentially expressed genes (DEGs). Two feature selection techniques, the least absolute shrinkage and selection operator (LASSO) and Relief, were used together with class balancing to obtain the most important features (genes). The dataset was then classified with three machine learning (ML) algorithms: random forest (RF), K-nearest neighbour (KNN), and artificial neural networks (ANN). In total, 2933 significant DEGs were identified, of which 1832 were upregulated and 1101 were downregulated. LASSO outperformed Relief in classifying tumor and normal samples: with LASSO-selected features, RF, KNN, and ANN each reached an accuracy of 100%, whereas with Relief-selected features they reached 79.5%, 85.05%, and 100%, respectively. LASSO and the DEG analysis shared 38 features, of which only 5 genes (VSTM2A, NR5A2, TMEM236, GDLN, and ETFDH) were statistically significant in survival analysis. Functional review and analysis of these genes narrowed the 5 candidates to 2, VSTM2A and TMEM236. TMEM236 was significantly differentially expressed and markedly reduced in the dataset, which supports assessing it as a novel biomarker for CRC diagnosis.
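The abstract mentions class balancing but does not say which method was used. As a rough illustration of why balancing matters with 644 tumor and 51 normal samples, the sketch below assumes simple random oversampling of the minority class; the function names and the integer sample IDs are hypothetical placeholders, not the authors' pipeline.

```python
import random
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class (assumed balancing method)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Class sizes taken from the abstract; sample IDs are placeholders.
labels = ["tumor"] * 644 + ["normal"] * 51
samples = list(range(len(labels)))

baseline = majority_baseline_accuracy(labels)
balanced_samples, balanced_labels = random_oversample(samples, labels)
balanced_counts = Counter(balanced_labels)

print(f"majority-class baseline before balancing: {baseline:.4f}")
print(f"class counts after balancing: {dict(balanced_counts)}")
```

Before balancing, a classifier that labels every sample as tumor already scores 644/695, about 92.7% accuracy, so raw accuracy says little about how well normal samples are recognized. After balancing, the same trivial rule scores 50%. In practice, oversampling is usually applied to the training split only, so that duplicated samples do not leak into the test set.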

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3eb1/8275802/e67e85c4f657/41598_2021_92692_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3eb1/8275802/78e8aa789095/41598_2021_92692_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3eb1/8275802/dfcbf39f5d43/41598_2021_92692_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3eb1/8275802/c7790f6094bc/41598_2021_92692_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3eb1/8275802/11d3f4948f19/41598_2021_92692_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3eb1/8275802/dd29de4356d8/41598_2021_92692_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3eb1/8275802/a5d1bdcbbdd2/41598_2021_92692_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3eb1/8275802/c326c19b5469/41598_2021_92692_Fig8a_HTML.jpg

