Learning from small medical data-robust semi-supervised cancer prognosis classifier with Bayesian variational autoencoder.

Authors

Hsu Te-Cheng, Lin Che

Affiliations

Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan.

Graduate Institute of Communication Engineering, National Taiwan University, Taipei 10617, Taiwan.

Publication information

Bioinform Adv. 2023 Jan 9;3(1):vbac100. doi: 10.1093/bioadv/vbac100. eCollection 2023.

Abstract

MOTIVATION

Cancer is one of the world's leading causes of death, and its prognosis is hard to predict because of complicated biological interactions among heterogeneous data types. Numerous challenges, such as censoring, high dimensionality and small sample size, prevent researchers from using deep learning models for precise prediction.
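To make the censored follow-up challenge concrete, the sketch below shows one common way to turn survival records into a fixed-horizon classification label. It is an illustration only, not the authors' preprocessing: the function names, the record layout and the choice to measure follow-up in months are all assumptions. The point it shows is that a patient whose follow-up ends before the horizon without an event has no known label, which is what makes unlabeled data available for a semi-supervised method.

```python
from typing import List, Optional, Tuple

# Assumption: follow-up time is recorded in months, so five years is 60 months.
HORIZON_MONTHS = 60.0


def five_year_label(followup_months: float, event_observed: bool) -> Optional[int]:
    """Return 1 (event within horizon), 0 (event-free at horizon) or None (unknown)."""
    if event_observed and followup_months <= HORIZON_MONTHS:
        return 1  # the event of interest occurred within five years
    if followup_months >= HORIZON_MONTHS:
        return 0  # patient was followed to five years with no earlier event
    return None  # censored before five years: the outcome is unknown


def split_labeled_unlabeled(
    records: List[Tuple[str, float, bool]]
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Split (patient_id, followup_months, event_observed) records by label availability."""
    labeled, unlabeled = [], []
    for patient_id, months, event in records:
        label = five_year_label(months, event)
        if label is None:
            unlabeled.append(patient_id)  # usable only as unlabeled data
        else:
            labeled.append((patient_id, label))
    return labeled, unlabeled
```

With hypothetical records such as `("p1", 24, True)`, `("p2", 72, False)` and `("p3", 30, False)`, the first two receive labels 1 and 0 and the third is left unlabeled.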

RESULTS

We propose a robust Semi-supervised Cancer prognosis classifier with bAyesian variational autoeNcoder (SCAN) as a structured machine-learning framework for cancer prognosis prediction. SCAN incorporates semi-supervised learning for predicting 5-year disease-specific survival and overall survival in breast and non-small cell lung cancer (NSCLC) patients, respectively. SCAN achieved significantly better AUROC scores than all existing benchmarks (81.73% for breast cancer; 80.46% for NSCLC), including our previously proposed bimodal neural network classifiers (77.71% for breast cancer; 78.67% for NSCLC). Independent validation results showed that SCAN still achieved better AUROC scores (74.74% for breast; 72.80% for NSCLC) than the bimodal neural network classifiers (64.13% for breast; 67.07% for NSCLC). SCAN is general and can potentially be trained on more patient data. This lays the foundation for personalized medicine for early cancer risk screening.
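The results above are reported as AUROC, the area under the receiver operating characteristic curve. As a reminder of what that number measures, here is a minimal sketch that computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the toy inputs are hypothetical and this is not the authors' evaluation code.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which is fine for small cohorts; a rank-based formulation would be preferable for large ones.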

AVAILABILITY AND IMPLEMENTATION

The source code reproducing the main results is available on GitHub: https://gitfront.io/r/user-4316673/36e8714573f3fbfa0b24690af5d1a9d5ca159cf4/scan/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics Advances online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc94/9832968/e147697f0412/vbac100f1.jpg
