Carilli Maria, Gorin Gennady, Choi Yongin, Chari Tara, Pachter Lior
Division of Biology and Biological Engineering, California Institute of Technology.
Division of Chemistry and Chemical Engineering, California Institute of Technology.
bioRxiv. 2023 May 2:2023.01.13.523995. doi: 10.1101/2023.01.13.523995.
We motivate and present a method that combines an existing variational autoencoder framework with biophysically motivated, bivariate models for nascent and mature RNA distributions. While previous approaches to integrating bimodal data via the variational autoencoder framework ignore the causal relationship between measurements, this method models the biophysical processes that give rise to the observations. We demonstrate through simulated benchmarking that it captures cell type structure in a low-dimensional space and accurately recapitulates parameter values and copy number distributions. On biological data, it provides a scalable route for identifying the biophysical mechanisms underlying gene expression. This analytical approach outlines a generalizable strategy for treating multimodal datasets generated by high-throughput, single-cell genomic assays.