Department of Automation, Shanghai Jiao Tong University, China.
FORE Systems Professor of Computational Biology and Machine Learning at CMU, USA.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab142.
Time-course gene-expression data have been widely used to infer regulatory and signaling relationships between genes. Most of the widely used methods for such analysis were developed for bulk expression data. Single-cell RNA-Seq (scRNA-Seq) data offer several advantages, including the large number of expression profiles available and the ability to focus on individual cells rather than averages. However, the data also raise new computational challenges. Using a novel encoding for scRNA-Seq expression data, we develop deep learning methods for interaction prediction from time-course data. Our methods use a supervised framework that represents the data as a 3D tensor and trains convolutional and recurrent neural networks to predict interactions. We tested our time-course deep learning (TDL) models on five different time-series scRNA-Seq datasets. As we show, TDL can accurately identify causal and regulatory gene-gene interactions and can also be used to assign new functions to genes. TDL improves on prior methods for these tasks and can be applied generally to new time-series scRNA-Seq data.
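The abstract does not spell out the encoding, so the following is only a minimal sketch of one way a gene pair's time-course scRNA-Seq data could be represented as a 3D tensor: one normalized 2D co-expression histogram per time point, stacked along a time axis. The function name `encode_gene_pair`, the bin count, the log transform, and the simulated data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def encode_gene_pair(expr_a, expr_b, time_labels, bins=8):
    """Hypothetical encoding: stack one 2D co-expression histogram per
    time point into a tensor of shape (bins, bins, n_time_points)."""
    # Log-transform counts to compress the dynamic range (an assumption).
    a = np.log1p(np.asarray(expr_a, dtype=float))
    b = np.log1p(np.asarray(expr_b, dtype=float))
    t = np.asarray(time_labels)

    # Share bin edges across time points so the slices are comparable.
    edges_a = np.linspace(a.min(), a.max() + 1e-9, bins + 1)
    edges_b = np.linspace(b.min(), b.max() + 1e-9, bins + 1)

    slices = []
    for tp in np.unique(t):
        mask = t == tp
        hist, _, _ = np.histogram2d(a[mask], b[mask], bins=[edges_a, edges_b])
        # Normalize each slice to an empirical joint distribution.
        slices.append(hist / max(hist.sum(), 1.0))
    return np.stack(slices, axis=-1)


# Simulated data for illustration only: 300 cells over 5 time points.
rng = np.random.default_rng(0)
n_cells, n_times = 300, 5
time_labels = np.repeat(np.arange(n_times), n_cells // n_times)
expr_a = rng.poisson(5, size=n_cells)
expr_b = rng.poisson(3, size=n_cells)

tensor = encode_gene_pair(expr_a, expr_b, time_labels)
print(tensor.shape)  # (8, 8, 5)
```

A tensor of this shape could then be passed to a convolutional or recurrent network as the input for one candidate gene pair, with the known interaction status of that pair as the supervised label.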