Generation of synthetic whole-slide image tiles of tumours from RNA-sequencing data via cascaded diffusion models.

Author information

Carrillo-Perez Francisco, Pizurica Marija, Zheng Yuanning, Nandi Tarak Nath, Madduri Ravi, Shen Jeanne, Gevaert Olivier

Affiliations

Stanford Center for Biomedical Informatics Research (BMIR), Stanford University, School of Medicine, Stanford, CA, USA.

Internet technology and Data science Lab (IDLab), Ghent University, Ghent, Belgium.

Publication information

Nat Biomed Eng. 2025 Mar;9(3):320-332. doi: 10.1038/s41551-024-01193-8. Epub 2024 Mar 21.

DOI:10.1038/s41551-024-01193-8

PMID:38514775

Abstract

Training machine-learning models with synthetically generated data can alleviate the problem of data scarcity when acquiring diverse and sufficiently large datasets is costly and challenging. Here we show that cascaded diffusion models can be used to synthesize realistic whole-slide image tiles from latent representations of RNA-sequencing data from human tumours. Alterations in gene expression affected the composition of cell types in the generated synthetic image tiles, which accurately preserved the distribution of cell types and maintained the cell fraction observed in bulk RNA-sequencing data, as we show for lung adenocarcinoma, kidney renal papillary cell carcinoma, cervical squamous cell carcinoma, colon adenocarcinoma and glioblastoma. Machine-learning models pretrained with the generated synthetic data performed better than models trained from scratch. Synthetic data may accelerate the development of machine-learning models in scarce-data settings and allow for the imputation of missing data modalities.
