A visual-omics foundation model to bridge histopathology image with transcriptomics.
Authors
Chen Weiqing, Zhang Pengzhi, Tran Tu N, Xiao Yiwei, Li Shengyu, Shah Vrutant V, Cheng Hao, Brannan Kristopher W, Youker Keith, Li Lai, Fang Longhou, Yang Yu, Le Nhat-Tu, Abe Jun-Ichi, Chen Shu-Hsia, Ma Qin, Chen Ken, Song Qianqian, Cooke John P, Wang Guangyu
Affiliations
Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, 77030, USA.
Department of Physiology, Biophysics & Systems Biology, Weill Cornell Graduate School of Medical Science, Cornell University, New York, NY, 10065, USA.
Publication information
Res Sq. 2025 Apr 16:rs.3.rs-5183775. doi: 10.21203/rs.3.rs-5183775/v1.
Artificial intelligence has revolutionized computational biology. Recent developments in omics technologies, including single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST), provide detailed genomic data alongside tissue histology. However, current computational models focus on either omics or image analysis and lack integration of the two. To address this, we developed OmiCLIP, a visual-omics foundation model linking hematoxylin and eosin (H&E) images and transcriptomics using tissue patches from Visium data. We transformed transcriptomic data into "sentences" by concatenating the top-expressed gene symbols from each patch. We curated a dataset of 2.2 million paired tissue images and transcriptomic profiles across 32 organs to train OmiCLIP to integrate histology and transcriptomics. Building on OmiCLIP, our Loki platform offers five key functions: tissue alignment, annotation via bulk RNA-seq or marker genes, cell type decomposition, image-transcriptomics retrieval, and ST gene expression prediction from H&E images. Compared with 22 state-of-the-art models on 5 simulated, 19 public, and 4 in-house experimental datasets, Loki demonstrated consistent accuracy and robustness.
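The abstract's key preprocessing step, turning a spot's transcriptomic profile into a text "sentence" of top-expressed gene symbols, can be sketched as follows. This is a minimal illustration, not the authors' code: the input format (a mapping from gene symbol to expression value) and the `top_k` cutoff are assumptions for the example.

```python
def expression_to_sentence(expr, top_k=50):
    """Rank genes by expression (descending) and join the top_k symbols
    into a space-separated "sentence", as described in the abstract.

    expr: dict mapping gene symbol -> expression value (hypothetical format).
    """
    ranked = sorted(expr, key=expr.get, reverse=True)
    return " ".join(ranked[:top_k])

# Toy spot-level profile (illustrative values only)
spot = {"COL1A1": 9.2, "ACTB": 8.7, "MYH7": 5.1, "GAPDH": 7.9}
print(expression_to_sentence(spot, top_k=3))  # COL1A1 ACTB GAPDH
```

Such sentences can then be fed to a CLIP-style text encoder and contrasted against embeddings of the paired H&E image patch during training.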