A Gene-Based Machine Learning Classifier Associated to the Colorectal Adenoma-Carcinoma Sequence.

Authors

Lacalamita Antonio, Piccinno Emanuele, Scalavino Viviana, Bellotti Roberto, Giannelli Gianluigi, Serino Grazia

Affiliations

National Institute of Gastroenterology "S. de Bellis", Research Hospital, Castellana Grotte, 70013 Bari, Italy.

Dipartimento Interateneo di Fisica, Università degli Studi di Bari Aldo Moro, 70126 Bari, Italy.

Publication information

Biomedicines. 2021 Dec 17;9(12):1937. doi: 10.3390/biomedicines9121937.

DOI: 10.3390/biomedicines9121937
PMID: 34944753
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8698794/
Abstract

Colorectal cancer (CRC) carcinogenesis is generally the result of the sequential mutation and deletion of various genes; this is known as the normal mucosa-adenoma-carcinoma sequence. The aim of this study was to develop a predictor-classifier during the "adenoma-carcinoma" sequence using microarray gene expression profiles of primary CRC, adenoma, and normal colon epithelial tissues. Four gene expression profiles from the Gene Expression Omnibus database, containing 465 samples (105 normal, 155 adenoma, and 205 CRC), were preprocessed to identify differentially expressed genes (DEGs) between adenoma tissue and primary CRC. The feature selection procedure, using the sequential Boruta algorithm and Stepwise Regression, determined 56 highly important genes. K-Means methods showed that, using the selected 56 DEGs, the three groups were clearly separate. The classification was performed with machine learning algorithms such as Linear Model (LM), Random Forest (RF), k-Nearest Neighbors (k-NN), and Artificial Neural Network (ANN). The best classification method in terms of accuracy (88.06 ± 0.70) and AUC (92.04 ± 0.47) was k-NN. To confirm the relevance of the predictive models, we applied the four models on a validation cohort: the k-NN model remained the best model in terms of performance, with 91.11% accuracy. Among the 56 DEGs, we identified 17 genes with an ascending or descending trend through the normal mucosa-adenoma-carcinoma sequence. Moreover, using the survival information of the TCGA database, we selected six DEGs related to patient prognosis (SCARA5, PKIB, CWH43, TEX11, METTL7A, and VEGFA). The six-gene-based classifier described in the current study could be used as a potential biomarker for the early diagnosis of CRC.
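
The evaluation style reported above (a k-NN classifier scored as mean ± standard deviation over repeated runs) can be illustrated with a small sketch. This is not the authors' pipeline: the data are synthetic, the feature count, class shift, value of k, split fraction, and all function names are assumptions chosen for illustration, and only accuracy is computed (not AUC). It uses the Python standard library only.

```python
import math
import random
import statistics
from collections import Counter

# The three tissue groups named in the abstract.
CLASSES = ["normal", "adenoma", "CRC"]


def make_synthetic_samples(n_per_class=40, n_genes=6, shift=1.5, seed=0):
    """Build toy (features, label) pairs; NOT real expression data.

    Each class is drawn from a unit-variance Gaussian whose mean is
    class_index * shift in every one of the n_genes dimensions.
    """
    rng = random.Random(seed)
    samples = []
    for class_index, label in enumerate(CLASSES):
        for _ in range(n_per_class):
            features = [rng.gauss(class_index * shift, 1.0) for _ in range(n_genes)]
            samples.append((features, label))
    return samples


def knn_predict(train, query, k=5):
    """Majority vote among the k training samples nearest to query (Euclidean)."""
    neighbours = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def holdout_accuracy(samples, k=5, test_fraction=0.3, seed=0):
    """Shuffle, split into train/test, and return the test-set accuracy."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


def repeated_holdout(samples, repeats=20, k=5):
    """Repeat the random split and return (mean, standard deviation) of accuracy."""
    scores = [holdout_accuracy(samples, k=k, seed=r) for r in range(repeats)]
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    data = make_synthetic_samples()
    mean_acc, sd_acc = repeated_holdout(data)
    print(f"accuracy: {100 * mean_acc:.2f} ± {100 * sd_acc:.2f} %")
```

Repeating the random split is what yields a spread around the mean. A single split would give one number with no indication of how much it depends on which samples happened to land in the test set.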

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4859/8698794/e9b04fec289d/biomedicines-09-01937-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4859/8698794/e6b06faf79a9/biomedicines-09-01937-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4859/8698794/b0b48ce1cb65/biomedicines-09-01937-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4859/8698794/b81b5c672d6d/biomedicines-09-01937-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4859/8698794/d1ffb866465e/biomedicines-09-01937-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4859/8698794/0d26c47009ea/biomedicines-09-01937-g006.jpg

Similar articles

1
A Gene-Based Machine Learning Classifier Associated to the Colorectal Adenoma-Carcinoma Sequence.
Biomedicines. 2021 Dec 17;9(12):1937. doi: 10.3390/biomedicines9121937.
2
Transcriptome profiling by combined machine learning and statistical R analysis identifies TMEM236 as a potential novel diagnostic biomarker for colorectal cancer.
Sci Rep. 2021 Jul 12;11(1):14304. doi: 10.1038/s41598-021-92692-0.
3
Identification of potential biomarkers with colorectal cancer based on bioinformatics analysis and machine learning.
Math Biosci Eng. 2021 Oct 19;18(6):8997-9015. doi: 10.3934/mbe.2021443.
4
Patterns of Gene Expression Profiles Associated with Colorectal Cancer in Colorectal Mucosa by Using Machine Learning Methods.
Comb Chem High Throughput Screen. 2024;27(19):2921-2934. doi: 10.2174/0113862073266300231026103844.
5
Identification of a 13-gene-based classifier as a potential biomarker to predict the effects of fluorouracil-based chemotherapy in colorectal cancer.
Oncol Lett. 2019 Jun;17(6):5057-5063. doi: 10.3892/ol.2019.10159. Epub 2019 Mar 19.
6
Machine Learning-Based Identification of Colon Cancer Candidate Diagnostics Genes.
Biology (Basel). 2022 Feb 25;11(3):365. doi: 10.3390/biology11030365.
7
Identification of key genes associated with colorectal cancer based on the transcriptional network.
Pathol Oncol Res. 2015 Jul;21(3):719-25. doi: 10.1007/s12253-014-9880-9. Epub 2015 Jan 23.
8
Transcriptomic Analyses of the Adenoma-Carcinoma Sequence Identify Hallmarks Associated With the Onset of Colorectal Cancer.
Front Oncol. 2021 Aug 11;11:704531. doi: 10.3389/fonc.2021.704531. eCollection 2021.
9
Predicting Colorectal Cancer Recurrence and Patient Survival Using Supervised Machine Learning Approach: A South African Population-Based Study.
Front Public Health. 2021 Jul 7;9:694306. doi: 10.3389/fpubh.2021.694306. eCollection 2021.
10
Combining the Fecal Immunochemical Test with a Logistic Regression Model for Screening Colorectal Neoplasia.
Front Pharmacol. 2021 Mar 17;12:635481. doi: 10.3389/fphar.2021.635481. eCollection 2021.

Cited by

1
Machine learning-driven multi-targeted drug discovery in colon cancer using biomarker signatures.
NPJ Precis Oncol. 2025 Aug 22;9(1):297. doi: 10.1038/s41698-025-01058-6.
2
Copper Metabolism-Related Genes as Biomarkers in Colon Adenoma and Cancer.
Int J Gen Med. 2025 Jun 10;18:3021-3043. doi: 10.2147/IJGM.S521512. eCollection 2025.
3
PKIB, a Novel Target for Cancer Therapy.
Int J Mol Sci. 2024 Apr 25;25(9):4664. doi: 10.3390/ijms25094664.
4
Elucidating immunological characteristics of the adenoma-carcinoma sequence in colorectal cancer patients in South Korea using a bioinformatics approach.
Sci Rep. 2024 May 2;14(1):10105. doi: 10.1038/s41598-024-56078-2.
5
Artificial Intelligence and Complex Network Approaches Reveal Potential Gene Biomarkers for Hepatocellular Carcinoma.
Int J Mol Sci. 2023 Oct 18;24(20):15286. doi: 10.3390/ijms242015286.
6
CWH43 Is a Novel Tumor Suppressor Gene with Negative Regulation of TTK in Colorectal Cancer.
Int J Mol Sci. 2023 Oct 17;24(20):15262. doi: 10.3390/ijms242015262.
7
Transcriptomic characterization revealed that METTL7A inhibits melanoma progression via the p53 signaling pathway and immunomodulatory pathway.
PeerJ. 2023 Aug 2;11:e15799. doi: 10.7717/peerj.15799. eCollection 2023.
8
Bioinformatic analysis of hub markers and immune cell infiltration characteristics of gastric cancer.
Front Immunol. 2023 Jun 9;14:1202529. doi: 10.3389/fimmu.2023.1202529. eCollection 2023.

References

1
Identification of potential biomarkers and metabolic pathways based on integration of metabolomic and transcriptomic data in the development of breast cancer.
Arch Gynecol Obstet. 2021 Jun;303(6):1599-1606. doi: 10.1007/s00404-021-06015-9. Epub 2021 Mar 31.
2
Multi-Approach Bioinformatics Analysis of Curated Omics Data Provides a Gene Expression Panorama for Multiple Cancer Types.
Front Genet. 2020 Nov 23;11:586602. doi: 10.3389/fgene.2020.586602. eCollection 2020.
3
SCARA5 is a Novel Biomarker in Colorectal Cancer by Comprehensive Analysis.
Clin Lab. 2020 Jul 1;66(7). doi: 10.7754/Clin.Lab.2019.191015.
4
Quantitative proteomic analysis identifies novel regulators of methotrexate resistance in choriocarcinoma.
Gynecol Oncol. 2020 Apr;157(1):268-279. doi: 10.1016/j.ygyno.2020.01.013. Epub 2020 Jan 17.
5
Genome-wide expression profiling in colorectal cancer focusing on lncRNAs in the adenoma-carcinoma transition.
BMC Cancer. 2019 Nov 6;19(1):1059. doi: 10.1186/s12885-019-6180-5.
6
Global burden of colorectal cancer: emerging trends, risk factors and prevention strategies.
Nat Rev Gastroenterol Hepatol. 2019 Dec;16(12):713-732. doi: 10.1038/s41575-019-0189-8. Epub 2019 Aug 27.
7
Transcriptomic Differences between Primary Colorectal Adenocarcinomas and Distant Metastases Reveal Metastatic Colorectal Cancer Subtypes.
Cancer Res. 2019 Aug 15;79(16):4227-4241. doi: 10.1158/0008-5472.CAN-18-3945. Epub 2019 Jun 25.
8
g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update).
Nucleic Acids Res. 2019 Jul 2;47(W1):W191-W198. doi: 10.1093/nar/gkz369.
9
Performance Characteristics of Fecal Immunochemical Tests for Colorectal Cancer and Advanced Adenomatous Polyps: A Systematic Review and Meta-analysis.
Ann Intern Med. 2019 Mar 5;170(5):319-329. doi: 10.7326/M18-2390. Epub 2019 Feb 26.
10
The molecular characteristics of colorectal cancer: Implications for diagnosis and therapy.
Oncol Lett. 2018 Jul;16(1):9-18. doi: 10.3892/ol.2018.8679. Epub 2018 May 9.