CCPred: Global and population-specific colorectal cancer prediction and metagenomic biomarker identification at different molecular levels using machine learning techniques.

Affiliations

Department of Computer Engineering, Faculty of Engineering, Abdullah Gul University, Kayseri, 38080, Turkey.

Department of Electrical and Computer Engineering, Faculty of Engineering, Abdullah Gul University, Kayseri, 38080, Turkey.

Publication information

Comput Biol Med. 2024 Nov;182:109098. doi: 10.1016/j.compbiomed.2024.109098. Epub 2024 Sep 17.

DOI:10.1016/j.compbiomed.2024.109098
PMID:39293338
Abstract

Colorectal cancer (CRC) is the third most common cancer worldwide and the second leading cause of cancer-related death. Recent research highlights the pivotal role of the gut microbiota in CRC development and progression. Understanding the complex interplay between disease development and metagenomic data is essential for CRC diagnosis and treatment. Current computational models employ machine learning to identify metagenomic biomarkers associated with CRC, yet their accuracy could be improved by taking a holistic view of biological knowledge. This study evaluates CRC-associated metagenomic data at the species, enzyme, and pathway levels through global and population-specific analyses. The analyses use relative abundance values from human gut microbiome sequencing data to build robust classification models for disease prediction and biomarker identification. For global CRC prediction and biomarker identification, the features identified by SelectKBest (SKB), Information Gain (IG), and Extreme Gradient Boosting (XGBoost) are combined. The population-based analysis includes within-population, leave-one-dataset-out (LODO), and cross-population approaches. Four classification algorithms are employed for CRC classification. Globally, Random Forest achieved an AUC of 0.83 on species data, 0.78 on enzyme data, and 0.76 on pathway data. On the global scale, potential taxonomic biomarkers include Ruthenibacterium lactatiformans; enzyme biomarkers include RNA 2',3'-cyclic 3'-phosphodiesterase; and pathway biomarkers include the pyruvate fermentation to acetone pathway. This study underscores the potential of machine learning models trained on metagenomic data for improved disease prediction and biomarker discovery. The proposed model and associated files are available at https://github.com/TemizMus/CCPRED.
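The feature-combination and LODO steps described in the abstract could be sketched roughly as follows. This is an illustrative sketch, not the authors' code (see the repository linked above for that). Several choices are assumptions, because the abstract does not specify them: the three selections are combined by union, SelectKBest uses the ANOVA F-score, mutual information stands in for Information Gain, and scikit-learn's GradientBoostingClassifier stands in for XGBoost so the example needs only one library. The data, cohort names, and the helper names `top_k`, `combined_features`, and `lodo_auc` are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut


def top_k(scores, k):
    """Return the indices of the k largest scores as a set."""
    return set(np.argsort(scores)[::-1][:k].tolist())


def combined_features(X, y, k=10, seed=0):
    """Union of the top-k features from three rankers (assumed combination rule)."""
    # SelectKBest with the ANOVA F-score (score function is an assumption).
    skb = SelectKBest(f_classif, k=k).fit(X, y)
    skb_idx = set(np.flatnonzero(skb.get_support()).tolist())
    # Mutual information as a stand-in for Information Gain.
    ig_idx = top_k(mutual_info_classif(X, y, random_state=seed), k)
    # Gradient-boosting importances as a stand-in for XGBoost importances.
    booster = GradientBoostingClassifier(random_state=seed).fit(X, y)
    gb_idx = top_k(booster.feature_importances_, k)
    return sorted(skb_idx | ig_idx | gb_idx)


def lodo_auc(X, y, groups, k=10, seed=0):
    """Leave-one-dataset-out AUC per held-out cohort.

    Features are selected on the training cohorts only, so the held-out
    cohort never influences which features are kept.
    """
    aucs = {}
    for train, test in LeaveOneGroupOut().split(X, y, groups):
        cols = combined_features(X[train], y[train], k=k, seed=seed)
        clf = RandomForestClassifier(n_estimators=200, random_state=seed)
        clf.fit(X[train][:, cols], y[train])
        prob = clf.predict_proba(X[test][:, cols])[:, 1]
        aucs[str(groups[test][0])] = roc_auc_score(y[test], prob)
    return aucs


# Synthetic stand-in for a samples-by-features relative abundance table.
rng = np.random.default_rng(0)
n_samples, n_features = 240, 50
y = rng.integers(0, 2, size=n_samples)       # 1 = CRC, 0 = control (made up)
X = rng.random((n_samples, n_features))
X[:, :5] += 0.5 * y[:, None]                 # make five features informative
X = X / X.sum(axis=1, keepdims=True)         # rows sum to 1, like relative abundance
groups = np.repeat(np.array(["cohort_A", "cohort_B", "cohort_C"]), n_samples // 3)

selected = combined_features(X, y)
aucs = lodo_auc(X, y, groups)
print(f"{len(selected)} features selected globally")
for cohort, auc in aucs.items():
    print(f"held-out {cohort}: AUC = {auc:.2f}")
```

With real data, `X` would hold species, enzyme, or pathway relative abundances and `groups` would label the source dataset of each sample. The AUCs printed here come from synthetic data and say nothing about the values reported in the paper.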
