Chen Shuxiao, Zhu Bokai, Huang Sijia, Hickey John W, Lin Kevin Z, Snyder Michael, Greenleaf William J, Nolan Garry P, Zhang Nancy R, Ma Zongming
bioRxiv. 2023 Jan 16:2023.01.12.523851. doi: 10.1101/2023.01.12.523851.
Single-cell sequencing methods have enabled the profiling of multiple types of molecular readouts at cellular resolution, and recent developments in spatial barcoding, in situ hybridization, and in situ sequencing allow such molecular readouts to retain their spatial context. Since no technology can provide complete characterization across all layers of biological modalities within the same cell, there is a pervasive need for computational cross-modal integration (also called diagonal integration) of single-cell and spatial omics data. For current methods, the feasibility of cross-modal integration relies on the existence of highly correlated, a priori "linked" features. When such linked features are few or uninformative, a scenario that we call "weak linkage", existing methods fail. We developed MaxFuse, a cross-modal data integration method that, through iterative co-embedding, data smoothing, and cell matching, leverages all information in each modality to obtain high-quality integration. MaxFuse is modality-agnostic and, in comprehensive benchmarks on single-cell and spatial ground-truth multiome datasets, demonstrates high robustness and accuracy in the weak linkage scenario. A prototypical example of weak linkage is the integration of spatial proteomic data with single-cell sequencing data. In two example analyses of this type, we demonstrate how MaxFuse enables the spatial consolidation of proteomic, transcriptomic, and epigenomic information at single-cell resolution on the same tissue section.
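The loop of co-embedding, smoothing, and matching named in the abstract can be illustrated with a toy sketch. This is not the MaxFuse implementation: the abstract gives no algorithmic detail, so every choice here is an assumption made for illustration. Those assumptions are classical CCA for co-embedding, k-nearest-neighbour averaging for smoothing (with neighbours found on all features of a modality), one-to-one linear assignment for matching, equal cell counts in both modalities, and simulated data with a single noisy linked feature. All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def pairwise_sq_dist(X, Y):
    """Squared Euclidean distance between every row of X and every row of Y."""
    return ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)


def smooth(values, reference, k=5, weight=0.5):
    """Shrink each cell's `values` toward the mean over its k nearest
    neighbours, where neighbours are found in `reference` (assumed here
    to be all features of that modality)."""
    d = pairwise_sq_dist(reference, reference)
    np.fill_diagonal(d, np.inf)          # a cell is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest cells
    return weight * values + (1 - weight) * values[nn].mean(axis=1)


def match(Xa, Xb):
    """One-to-one matching; returns cols with cols[i] = B cell paired to A cell i."""
    _, cols = linear_sum_assignment(pairwise_sq_dist(Xa, Xb))
    return cols


def cca_coembed(A, B_paired, dim):
    """Classical CCA scores for row-paired matrices (whitening via SVD)."""
    A = A - A.mean(axis=0)
    B = B_paired - B_paired.mean(axis=0)
    Ua, _, _ = np.linalg.svd(A, full_matrices=False)
    Ub, _, _ = np.linalg.svd(B, full_matrices=False)
    U, _, Vt = np.linalg.svd(Ua.T @ Ub)
    return Ua @ U[:, :dim], Ub @ Vt.T[:, :dim]


def toy_integrate(A, B, link_a, link_b, n_iter=3, dim=3):
    # Initial matching uses only the (smoothed) linked features.
    pairing = match(smooth(link_a, A), smooth(link_b, B))
    for _ in range(n_iter):
        # Co-embed all features using the current pairing as pivots.
        Za, Zb_paired = cca_coembed(A, B[pairing], dim)
        Zb = np.empty_like(Zb_paired)
        Zb[pairing] = Zb_paired          # restore B's original cell order
        # Re-match on the smoothed joint embedding.
        pairing = match(smooth(Za, A), smooth(Zb, B))
    return pairing


# Simulated data (assumption): a 3-dimensional latent state drives both modalities.
rng = np.random.default_rng(0)
n = 60
Z = rng.normal(size=(n, 3))
A = Z @ rng.normal(size=(3, 8)) + 0.3 * rng.normal(size=(n, 8))
B_true = Z @ rng.normal(size=(3, 6)) + 0.3 * rng.normal(size=(n, 6))
link_a = Z[:, :1] + 0.5 * rng.normal(size=(n, 1))       # single weak linked feature
link_b_true = Z[:, :1] + 0.5 * rng.normal(size=(n, 1))
shuffle = rng.permutation(n)                             # hide the true pairing
B, link_b = B_true[shuffle], link_b_true[shuffle]

pairing = toy_integrate(A, B, link_a, link_b)
truth = np.argsort(shuffle)                              # inverse permutation
accuracy = float((pairing == truth).mean())
print(f"fraction of cells matched to their true partner: {accuracy:.2f}")
```

The point of the sketch is the structure of the iteration: a matching obtained from a few weak linked features seeds a co-embedding that uses all features, and that embedding in turn yields the next matching. Whether this toy version improves over the initial matching depends on the simulated noise levels, and it says nothing about the performance reported for MaxFuse.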