Learning from small medical data: robust semi-supervised cancer prognosis classifier with Bayesian variational autoencoder.

Author information

Hsu Te-Cheng, Lin Che

Affiliations

Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan.

Graduate Institute of Communication Engineering, National Taiwan University, Taipei 10617, Taiwan.

Publication information

Bioinform Adv. 2023 Jan 9;3(1):vbac100. doi: 10.1093/bioadv/vbac100. eCollection 2023.

DOI:10.1093/bioadv/vbac100

PMID:36698767

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9832968/

Abstract

MOTIVATION

Cancer is one of the world's leading causes of death, and its prognosis is hard to predict because of complicated biological interactions among heterogeneous data types. Numerous challenges, such as censoring, high dimensionality and small sample size, prevent researchers from using deep learning models for precise prediction.

RESULTS

We propose a robust Semi-supervised Cancer prognosis classifier with bAyesian variational autoeNcoder (SCAN) as a structured machine-learning framework for cancer prognosis prediction. SCAN incorporates semi-supervised learning for predicting 5-year disease-specific survival and overall survival in breast and non-small cell lung cancer (NSCLC) patients, respectively. SCAN achieved significantly better AUROC scores than all existing benchmarks (81.73% for breast cancer; 80.46% for NSCLC), including our previously proposed bimodal neural network classifiers (77.71% for breast cancer; 78.67% for NSCLC). Independent validation results showed that SCAN still achieved better AUROC scores (74.74% for breast cancer; 72.80% for NSCLC) than the bimodal neural network classifiers (64.13% for breast cancer; 67.07% for NSCLC). SCAN is general and can potentially be trained on more patient data. This lays the foundation for personalized medicine for early cancer risk screening.
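
To make the setup above concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common way of turning censored survival records into a 5-year classification task: patients censored before 5 years have an unknown outcome and can serve as the unlabeled portion of a semi-supervised training set. It also shows a plain pairwise computation of AUROC, the metric reported above. The function names, the use of months as the time unit, and the choice of "event within 5 years" as the positive class are all assumptions made for illustration.

```python
from typing import Optional, Sequence

FIVE_YEARS_IN_MONTHS = 60.0  # assumption: follow-up time is recorded in months


def five_year_label(time_months: float, event_observed: bool) -> Optional[int]:
    """Derive a 5-year outcome label from a (time, event) survival record.

    Returns 1 if the event occurred within 5 years, 0 if the patient was
    followed beyond 5 years without the event occurring in that window,
    and None if the patient was censored before 5 years (outcome unknown).
    """
    if event_observed and time_months <= FIVE_YEARS_IN_MONTHS:
        return 1
    if time_months > FIVE_YEARS_IN_MONTHS:
        return 0
    return None  # censored before 5 years: usable only as unlabeled data


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical cohort: (follow-up time in months, event observed?)
cohort = [(12.0, True), (80.0, False), (30.0, False), (70.0, True)]
labels = [five_year_label(t, e) for t, e in cohort]
labeled = [(rec, y) for rec, y in zip(cohort, labels) if y is not None]
unlabeled = [rec for rec, y in zip(cohort, labels) if y is None]
```

In this toy cohort the third patient, censored at 30 months, ends up in `unlabeled`. A purely supervised classifier would have to discard that record, whereas a semi-supervised model can still learn from its features. The pairwise AUROC loop is quadratic in the number of samples, which is acceptable for small cohorts; a rank-based implementation would be preferable for larger ones.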

AVAILABILITY AND IMPLEMENTATION

The source codes reproducing the main results are available on GitHub: https://gitfront.io/r/user-4316673/36e8714573f3fbfa0b24690af5d1a9d5ca159cf4/scan/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics Advances online.

Similar articles

Prediction of mutation effects using a deep temporal convolutional network.

Bioinformatics. 2020 Apr 1;36(7):2047-2052. doi: 10.1093/bioinformatics/btz873.

Prognostic outcome prediction by semi-supervised least squares classification.

Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa249.

Machine learning algorithms for outcome prediction in (chemo)radiotherapy: An empirical comparison of classifiers.

Med Phys. 2018 Jul;45(7):3449-3459. doi: 10.1002/mp.12967. Epub 2018 Jun 13.

Bayesian neural network with pretrained protein embedding enhances prediction accuracy of drug-protein interaction.

Bioinformatics. 2021 Oct 25;37(20):3428-3435. doi: 10.1093/bioinformatics/btab346.

Generative Adversarial Networks for Robust Breast Cancer Prognosis Prediction with Limited Data Size.

Annu Int Conf IEEE Eng Med Biol Soc. 2020 Jul;2020:5669-5672. doi: 10.1109/EMBC44109.2020.9175736.

Semi-supervised automatic seizure detection using personalized anomaly detecting variational autoencoder with behind-the-ear EEG.

Comput Methods Programs Biomed. 2022 Jan;213:106542. doi: 10.1016/j.cmpb.2021.106542. Epub 2021 Nov 17.

MetaKTSP: a meta-analytic top scoring pair method for robust cross-study validation of omics prediction analysis.

Bioinformatics. 2016 Jul 1;32(13):1966-73. doi: 10.1093/bioinformatics/btw115. Epub 2016 Mar 2.

Semi-supervised learning improves gene expression-based prediction of cancer recurrence.

Bioinformatics. 2011 Nov 1;27(21):3017-23. doi: 10.1093/bioinformatics/btr502. Epub 2011 Sep 4.

Training with Small Medical Data: Robust Bayesian Neural Networks for Colon Cancer Overall Survival Prediction.

Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov;2021:2030-2033. doi: 10.1109/EMBC46164.2021.9630698.

Cited by

Role of Generative Artificial Intelligence in Personalized Medicine: A Systematic Review.

Cureus. 2025 Apr 15;17(4):e82310. doi: 10.7759/cureus.82310. eCollection 2025 Apr.

Cross-attention enables deep learning on limited omics-imaging-clinical data of 130 lung cancer patients.

Cell Rep Methods. 2024 Jul 15;4(7):100817. doi: 10.1016/j.crmeth.2024.100817. Epub 2024 Jul 8.

A review of multi-omics data integration through deep learning approaches for disease diagnosis, prognosis, and treatment.

Front Genet. 2023 Jul 20;14:1199087. doi: 10.3389/fgene.2023.1199087. eCollection 2023.
