Prediction of tumor purity from gene expression data using machine learning.

Affiliations

School of Systems Biomedical Science, Soongsil University, Seoul, Korea.

Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea.

Publication information

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab163.

PMID: 33954576

Abstract

MOTIVATION

Bulk tumor samples used for high-throughput molecular profiling are often an admixture of cancer cells and non-cancerous cells, which include immune and stromal cells. The mixed composition can confound the analysis and affect the biological interpretation of the results, and thus, accurate prediction of tumor purity is critical. Although several methods have been proposed to predict tumor purity using high-throughput molecular data, there has been no comprehensive study on machine learning-based methods for the estimation of tumor purity.

RESULTS

We applied various machine learning models to estimate tumor purity. Overall, the models predicted the tumor purity accurately and showed a high correlation with well-established gold standard methods. In addition, we identified a small group of genes and demonstrated that they could predict tumor purity well. Finally, we confirmed that these genes were mainly involved in the immune system.

AVAILABILITY

The machine learning models constructed for this study are available at https://github.com/BonilKoo/ML_purity.
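
The general idea in the abstract (selecting a small group of genes and using them to predict purity) can be illustrated with a minimal sketch. This is not the authors' code or any of their models, which are in the repository linked above. It assumes simulated data in which bulk expression is a purity-weighted mix of two made-up profiles, and it swaps in a simple signed-average gene score with a one-variable least-squares fit. All function names, sample sizes, and noise levels are hypothetical.

```python
import random
import statistics


def simulate_samples(n_samples, n_genes, seed=0):
    """Simulate bulk expression as a purity-weighted mix of two made-up profiles."""
    rng = random.Random(seed)
    tumor = [rng.uniform(0.0, 10.0) for _ in range(n_genes)]  # hypothetical cancer-cell profile
    other = [rng.uniform(0.0, 10.0) for _ in range(n_genes)]  # hypothetical immune/stromal profile
    expression, purities = [], []
    for _ in range(n_samples):
        p = rng.uniform(0.1, 0.95)  # assumed range of true purity
        row = [p * t + (1.0 - p) * o + rng.gauss(0.0, 0.3) for t, o in zip(tumor, other)]
        expression.append(row)
        purities.append(p)
    return expression, purities


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def select_genes(expression, purities, k):
    """Pick the k genes most correlated with purity, plus the sign of each correlation."""
    scored = []
    for g in range(len(expression[0])):
        r = pearson([row[g] for row in expression], purities)
        scored.append((abs(r), g, 1.0 if r >= 0 else -1.0))
    scored.sort(reverse=True)
    top = scored[:k]
    return [g for _, g, _ in top], [s for _, _, s in top]


def signature_score(row, genes, signs):
    """Signed average expression of the selected genes for one sample."""
    return statistics.fmean([s * row[g] for g, s in zip(genes, signs)])


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor; returns (slope, intercept)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


expression, purities = simulate_samples(n_samples=200, n_genes=50)
train_x, test_x = expression[:150], expression[150:]
train_y, test_y = purities[:150], purities[150:]

# Gene selection and fitting use the training samples only.
genes, signs = select_genes(train_x, train_y, k=5)
train_scores = [signature_score(row, genes, signs) for row in train_x]
slope, intercept = fit_line(train_scores, train_y)

# Evaluate on held-out samples by correlating predictions with the simulated purity.
predicted = [slope * signature_score(row, genes, signs) + intercept for row in test_x]
test_r = pearson(predicted, test_y)
print(f"Pearson r on held-out samples: {test_r:.3f}")
```

On real data the reference purity would come from an established estimation method rather than a simulation, and a study comparing models would normally evaluate several regressors with cross-validation instead of a single split.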

Similar articles

1
Prediction of tumor purity from gene expression data using machine learning.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab163.
2
Putative biomarkers for predicting tumor sample purity based on gene expression data.
BMC Genomics. 2019 Dec 27;20(1):1021. doi: 10.1186/s12864-019-6412-8.
3
PUREE: accurate pan-cancer tumor purity estimation from gene expression data.
Commun Biol. 2023 Apr 11;6(1):394. doi: 10.1038/s42003-023-04764-8.
4
RF_Purify: a novel tool for comprehensive analysis of tumor-purity in methylation array data based on random forest regression.
BMC Bioinformatics. 2019 Aug 16;20(1):428. doi: 10.1186/s12859-019-3014-z.
5
Impact of Tumor Purity on Immune Gene Expression and Clustering Analyses across Multiple Cancer Types.
Cancer Immunol Res. 2018 Jan;6(1):87-97. doi: 10.1158/2326-6066.CIR-17-0201. Epub 2017 Nov 15.
6
Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA.
BMC Cancer. 2019 Aug 23;19(1):832. doi: 10.1186/s12885-019-6003-8.
7
Towards multi-omics characterization of tumor heterogeneity: a comprehensive review of statistical and machine learning approaches.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa188.
8
Integrated transcriptomic-genomic tool Texomer profiles cancer tissues.
Nat Methods. 2019 May;16(5):401-404. doi: 10.1038/s41592-019-0388-9. Epub 2019 Apr 15.
9
A new method for constructing tumor specific gene co-expression networks based on samples with tumor purity heterogeneity.
Bioinformatics. 2018 Jul 1;34(13):i528-i536. doi: 10.1093/bioinformatics/bty280.
10
HelPredictor models single-cell transcriptome to predict human embryo lineage allocation.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab196.

Cited by

1
Comparative study of tools for copy number variation detection using next-generation sequencing data.
Sci Rep. 2025 Jul 1;15(1):22145. doi: 10.1038/s41598-025-06527-3.
2
MGME1 associates with poor prognosis and is vital for cell proliferation in lower-grade glioma.
Aging (Albany NY). 2023 May 8;15(9):3690-3714. doi: 10.18632/aging.204705.
3
ExosomePurity: tumour purity deconvolution in serum exosomes based on miRNA signatures.
Brief Bioinform. 2023 May 19;24(3). doi: 10.1093/bib/bbad119.
4
Assessment of MicroRNAs Associated with Tumor Purity by Random Forest Regression.
Biology (Basel). 2022 May 21;11(5):787. doi: 10.3390/biology11050787.