Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
Cellular and Molecular Biology Program, University of Michigan, Ann Arbor, MI, USA.
Nat Biotechnol. 2023 Mar;41(3):387-398. doi: 10.1038/s41587-022-01476-y. Epub 2022 Oct 13.
Multi-omic single-cell datasets, in which multiple molecular modalities are profiled within the same cell, offer an opportunity to understand the temporal relationship between epigenome and transcriptome. To realize this potential, we developed MultiVelo, a differential equation model of gene expression that extends the RNA velocity framework to incorporate epigenomic data. MultiVelo uses a probabilistic latent variable model to estimate the switch time and rate parameters of chromatin accessibility and gene expression and improves the accuracy of cell fate prediction compared to velocity estimates from RNA only. Application to multi-omic single-cell datasets from brain, skin and blood cells reveals two distinct classes of genes distinguished by whether chromatin closes before or after transcription ceases. We also find four types of cell states: two states in which epigenome and transcriptome are coupled and two distinct decoupled states. Finally, we identify time lags between transcription factor expression and binding site accessibility and between disease-associated SNP accessibility and expression of the linked genes. MultiVelo is available on PyPI, Bioconda and GitHub (https://github.com/welch-lab/MultiVelo).
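To make the idea of a differential equation model with switch times concrete, the following is a minimal illustrative sketch, not MultiVelo's implementation or API. It assumes a three-variable cascade in which chromatin accessibility c drives unspliced RNA u, which is spliced into s, with each switch modeled as a step change in a rate. The function name `simulate`, the equation forms, and all parameter values are hypothetical choices made for illustration; the abstract specifies none of them.

```python
def simulate(t_close, t_off, t_on=2.0, t_end=20.0, dt=0.01,
             alpha_c=0.5, alpha=1.0, beta=0.6, gamma=0.4):
    """Forward-Euler integration of a toy chromatin -> unspliced -> spliced model.

    Assumed (hypothetical) equations:
        dc/dt = alpha_c * (k_c - c)   k_c = 1 while chromatin is opening, else 0
        du/dt = a(t) * c - beta * u   a(t) = alpha while transcription is on, else 0
        ds/dt = beta * u - gamma * s
    t_close is the chromatin closing time; t_on and t_off bound transcription.
    Returns a list of (time, c, u, s) tuples.
    """
    c = u = s = 0.0
    trajectory = []
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        k_c = 1.0 if t < t_close else 0.0          # chromatin switch
        a = alpha if t_on <= t < t_off else 0.0    # transcription switch
        # Derivatives are evaluated at the current state before updating.
        dc = alpha_c * (k_c - c)
        du = a * c - beta * u
        ds = beta * u - gamma * s
        c += dt * dc
        u += dt * du
        s += dt * ds
        trajectory.append((t + dt, c, u, s))
    return trajectory


# Two hypothetical genes, mirroring the two gene classes in the abstract:
# chromatin closes before transcription ceases, or after it.
closes_first = simulate(t_close=8.0, t_off=12.0)
closes_last = simulate(t_close=12.0, t_off=8.0)
```

In this toy setting the only difference between the two gene classes is the order of `t_close` and `t_off`. Fitting such switch times and rates to real data is the inference problem the abstract attributes to the probabilistic latent variable model, and it is not attempted here.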