School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30308, USA.
Department of Electrical Engineering and Information Systems, Graduate School of Engineering, The University of Tokyo, Tokyo, Japan.
Genome Biol. 2022 Jun 27;23(1):139. doi: 10.1186/s13059-022-02706-x.
Integrating scRNA-seq and scATAC-seq data obtained from different batches is challenging. Existing methods tend to use a pre-defined gene activity matrix to convert the scATAC-seq data into scRNA-seq data. This pre-defined matrix is often of low quality and does not reflect the dataset-specific relationship between the two data modalities. We propose scDART, a deep learning framework that integrates scRNA-seq and scATAC-seq data and simultaneously learns the cross-modality relationships. In particular, scDART is designed to preserve cell trajectories in continuous cell populations and can be applied to trajectory inference on the integrated data.
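To make the baseline concrete, the following is a minimal sketch of how a fixed, pre-defined gene activity matrix is typically used: a cells-by-regions accessibility matrix is multiplied by a regions-by-genes assignment matrix to obtain pseudo-expression scores. It illustrates only the conventional conversion that the abstract criticizes, not scDART itself. The function name `gene_activity_scores`, the variable names, and the toy values are assumptions made for illustration.

```python
def gene_activity_scores(atac_counts, region_to_gene):
    """Project a cells x regions matrix onto genes with a fixed assignment.

    atac_counts: list of rows, one per cell, one column per region.
    region_to_gene: list of rows, one per region, one column per gene;
        entry [r][g] is 1 if region r is assigned to gene g, else 0.
    Returns a cells x genes matrix (plain nested lists).
    """
    n_genes = len(region_to_gene[0])
    return [
        [
            sum(cell[r] * region_to_gene[r][g] for r in range(len(cell)))
            for g in range(n_genes)
        ]
        for cell in atac_counts
    ]


# Toy data (made up): 3 cells, 4 regions, 2 genes.
atac_counts = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]
# Regions 0-1 are assigned to gene 0, regions 2-3 to gene 1.
region_to_gene = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
]

pseudo_rna = gene_activity_scores(atac_counts, region_to_gene)
```

Because `region_to_gene` is fixed in advance, the same assignment is applied to every dataset, which is the limitation that motivates learning the cross-modality relationship from the data instead.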