Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, USA.
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA.
Genome Biol. 2022 May 9;23(1):112. doi: 10.1186/s13059-022-02679-x.
Integrating single-cell multiomics profiles generated by different single-cell technologies from the same biological sample remains challenging. Previous approaches based on shared features have provided only approximate solutions. Here, we present a novel mathematical solution named bi-order canonical correlation analysis (bi-CCA), which extends the widely used CCA approach to iteratively align both the rows and the columns of the data matrices. Bi-CCA is generally applicable to combinations of any two single-cell modalities. Validation on co-assayed ground-truth data and applications to a CAR-NK study and a fetal muscle atlas demonstrate its ability to generate accurate multimodal co-embeddings and to discover cellular identity.
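For orientation, the classical CCA that bi-CCA extends can be sketched in a few lines of NumPy. This is a minimal illustration of standard CCA only, not the bi-CCA procedure: it assumes the two matrices share paired rows (the same samples in both views), and the function name `classical_cca`, the ridge term `reg`, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

def classical_cca(X, Y, k=2, reg=1e-6):
    """Classical CCA for two views with paired rows (illustrative sketch).

    Returns projection matrices Wx (p x k) and Wy (q x k) and the
    top-k canonical correlations.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center each feature
    Yc = Y - Y.mean(axis=0)

    # Sample covariances; a small ridge term keeps them invertible.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    Wx = Sxx_is @ U[:, :k]
    Wy = Syy_is @ Vt.T[:, :k]
    return Wx, Wy, s[:k]

# Synthetic example: two noisy views driven by a shared 2-D latent signal.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))
X = Z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))
Y = Z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))

Wx, Wy, corr = classical_cca(X, Y, k=2)

# Co-embedding: both views projected into the shared canonical space.
X_emb = (X - X.mean(axis=0)) @ Wx
Y_emb = (Y - Y.mean(axis=0)) @ Wy
```

Here `X_emb` and `Y_emb` place both views in a common low-dimensional space, which is the sense in which CCA yields a co-embedding. The sketch depends on row correspondence between the two matrices; it does not perform the iterative row-and-column alignment described for bi-CCA.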