Department of Mathematics, Colgate University, Hamilton, NY, USA.
Department of Biological Sciences, University of South Carolina, Columbia, South Carolina, USA.
Biometrics. 2023 Jun;79(2):1559-1572. doi: 10.1111/biom.13701. Epub 2022 Jun 22.
With recent advances in technologies for profiling multi-omics data at the single-cell level, integrative multi-omics data analysis has become increasingly popular. Information such as methylation changes, chromatin accessibility, and gene expression is now often collected jointly in a single-cell experiment. In biomedical studies, it is often of interest to study the associations between data types and to examine how these associations change with other factors, such as cell type and gene regulatory components. However, because each data type usually has a distinct marginal distribution, joint analysis of these changes in association using multi-omics data is statistically challenging. In this paper, we propose a flexible copula-based framework that models covariate-dependent correlation structures independently of the marginals. In addition, the proposed approach can jointly combine a wide variety of univariate marginal distributions, discrete or continuous, including the class of zero-inflated distributions. The performance of the proposed framework is demonstrated through a series of simulation studies. Finally, the framework is applied to experimental data to investigate the dynamic relationship between single-cell RNA sequencing, chromatin accessibility, and DNA methylation in different germ layers during mouse gastrulation.
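To make the central idea concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a bivariate Gaussian copula whose correlation depends on a covariate through a tanh link, with a zero-inflated Poisson margin (standing in for expression counts) and a Beta margin (standing in for a methylation rate). The link function, the choice of margins, and all names (`rho_of_x`, `zip_ppf`, `simulate`, and the parameters `b0`, `b1`, `pi0`, `lam`, `a`, `b`) are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np
from scipy import stats


def rho_of_x(x, b0, b1):
    """Hypothetical link: map a linear predictor to a correlation in (-1, 1)."""
    return np.tanh(b0 + b1 * x)


def zip_ppf(u, pi0, lam):
    """Quantile function of a zero-inflated Poisson with extra zero mass pi0."""
    u = np.asarray(u, dtype=float)
    # Below the structural-zero mass the quantile is 0; otherwise rescale u
    # onto the Poisson component and invert the Poisson CDF.
    v = np.clip((u - pi0) / (1.0 - pi0), 0.0, 1.0)
    return np.where(u <= pi0, 0.0, stats.poisson.ppf(v, lam))


def simulate(x, b0, b1, pi0, lam, a, b, rng):
    """Draw (y1, y2) whose dependence varies with x but whose margins do not."""
    rho = rho_of_x(x, b0, b1)
    # Correlated standard normals with a per-observation correlation rho(x).
    z1 = rng.standard_normal(x.shape)
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(x.shape)
    # Probability integral transform to uniforms, then to the chosen margins.
    u1, u2 = stats.norm.cdf(z1), stats.norm.cdf(z2)
    y1 = zip_ppf(u1, pi0, lam)      # discrete, zero-inflated margin
    y2 = stats.beta.ppf(u2, a, b)   # continuous margin on (0, 1)
    return y1, y2


# Two hypothetical cell types coded 0 and 1, with opposite-signed dependence.
rng = np.random.default_rng(0)
x = np.repeat([0.0, 1.0], 10_000)
y1, y2 = simulate(x, b0=-0.8, b1=1.6, pi0=0.3, lam=3.0, a=2.0, b=5.0, rng=rng)
for group in (0.0, 1.0):
    mask = x == group
    r = stats.spearmanr(y1[mask], y2[mask])[0]
    print(f"x={group:.0f}: Spearman correlation = {r:.2f}")
```

Because the margins are applied only after the uniforms are generated, changing `b0` or `b1` alters the dependence between `y1` and `y2` without touching either marginal distribution, which is the separation the copula framework relies on.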