Semi-supervised learning improves gene expression-based prediction of cancer recurrence.

Affiliation

Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA.

Publication information

Bioinformatics. 2011 Nov 1;27(21):3017-23. doi: 10.1093/bioinformatics/btr502. Epub 2011 Sep 4.

DOI: 10.1093/bioinformatics/btr502
PMID: 21893520
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3198572/
Abstract

MOTIVATION

Gene expression profiling has shown great potential in outcome prediction for different types of cancer. Nevertheless, small sample size remains a bottleneck in obtaining robust and accurate classifiers. Traditional supervised learning techniques can only work with labeled data. Consequently, a large amount of microarray data that lacks sufficient follow-up information is disregarded. To fully leverage all of the precious data in public databases, we turned to a semi-supervised learning technique, low density separation (LDS).

RESULTS

Using the clinically important question of predicting recurrence risk in colorectal cancer patients, we demonstrated that (i) semi-supervised classification improved prediction accuracy compared with the state-of-the-art supervised method, the support vector machine (SVM), (ii) the performance gain increased with the number of unlabeled samples, (iii) unlabeled data from different institutes could be employed after appropriate processing and (iv) the LDS method is robust with regard to the number of input features. To test the general applicability of this semi-supervised method, we further applied LDS to human breast cancer datasets and also observed superior performance. Our results demonstrate the great potential of semi-supervised learning in gene expression-based outcome prediction for cancer patients.
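
Points (i) and (ii) can be illustrated with a toy sketch. The code below is not LDS and does not use the paper's data. It swaps in a much simpler stand-in, a nearest-centroid classifier refined by self-training on pseudo-labeled samples. The sample coordinates, the two-feature setup, and the names `fit_centroids` and `nearest` are all hypothetical. It only shows how unlabeled samples can pull a decision boundary toward the low-density gap between two clusters.

```python
from math import dist

# Hypothetical two-feature "expression profiles"; not data from the paper.
# Both labeled samples sit off-center relative to their clusters.
LABELED = [((1.0, 0.0), 0), ((5.0, 0.0), 1)]
UNLABELED = [
    (-1.0, 0.2), (-0.5, -0.3), (0.0, 0.1), (0.5, -0.2), (0.2, 0.4),  # cluster near x = 0
    (2.6, 0.1), (3.1, -0.2), (3.5, 0.3), (4.0, -0.1), (4.4, 0.2),    # cluster near x = 3.5
]
QUERY = (2.5, 0.0)  # lies at the edge of the second cluster


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def nearest(x, centroids):
    """Return the class label whose centroid is closest to x."""
    return min(centroids, key=lambda label: dist(x, centroids[label]))


def fit_centroids(labeled, unlabeled=(), max_iter=100):
    """Nearest-centroid model, optionally refined by self-training.

    A simplified stand-in for a semi-supervised learner (assumption: this
    is NOT low density separation). Labeled samples keep their labels;
    unlabeled samples are pseudo-labeled by the current centroids and the
    centroids are recomputed until the pseudo-labels stop changing.
    """
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    centroids = {y: centroid(xs) for y, xs in by_class.items()}

    pseudo = None
    for _ in range(max_iter):
        new_pseudo = [nearest(x, centroids) for x in unlabeled]
        if new_pseudo == pseudo:
            break  # pseudo-labels are stable, so the centroids are too
        pseudo = new_pseudo
        members = {y: list(xs) for y, xs in by_class.items()}
        for x, y in zip(unlabeled, pseudo):
            members[y].append(x)
        centroids = {y: centroid(xs) for y, xs in members.items()}
    return centroids


supervised = fit_centroids(LABELED)           # labeled samples only
semi = fit_centroids(LABELED, UNLABELED)      # labeled + unlabeled samples

print("supervised prediction:     ", nearest(QUERY, supervised))
print("semi-supervised prediction:", nearest(QUERY, semi))
```

In this toy setup the two labeled samples alone place the boundary inside the second cluster, so the query is assigned to class 0. Adding the ten unlabeled samples moves the boundary into the gap between the clusters, and the query is assigned to class 1.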

CONTACT

bing.zhang@vanderbilt.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
