Huang Penghui, Cai Manqi, Lu Xinghua, McKennan Chris, Wang Jiebiao
Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA.
Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
bioRxiv. 2023 Mar 16:2023.03.15.532820. doi: 10.1101/2023.03.15.532820.
Bulk transcriptomics in tissue samples reflects the average expression levels across different cell types and is highly influenced by cellular fractions. As such, it is critical to estimate cellular fractions to both deconfound differential expression analyses and infer cell type-specific differential expression. Since experimentally counting cells is infeasible in most tissues and studies, cellular deconvolution methods have been developed as an alternative. However, existing methods are designed for tissues consisting of clearly distinguishable cell types and have difficulties estimating highly correlated or rare cell types. To address this challenge, we propose Hierarchical Deconvolution (HiDecon) that uses single-cell RNA sequencing references and a hierarchical cell type tree, which models the similarities among cell types and cell differentiation relationships, to estimate cellular fractions in bulk data. By coordinating cell fractions across layers of the hierarchical tree, cellular fraction information is passed up and down the tree, which helps correct estimation biases by pooling information across related cell types. The flexible hierarchical tree structure also enables estimating rare cell fractions by splitting the tree to higher resolutions. Through simulations and real data applications with the ground truth of measured cellular fractions, we demonstrate that HiDecon significantly outperforms existing methods and accurately estimates cellular fractions.
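The mixture model behind reference-based deconvolution, and the parent-child consistency that a cell type tree imposes, can be illustrated with a small sketch. This is not the HiDecon algorithm: it uses plain non-negative least squares solved by projected gradient descent, and it only sums fine-layer fractions into a coarser layer. The signature values, cell type names, tree, and all function names are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: plain non-negative least squares plus tree
# aggregation. This is NOT the HiDecon algorithm; all names and numbers
# below are hypothetical.

def mix(signatures, fractions):
    """Bulk expression as a fraction-weighted average of cell type signatures."""
    return [sum(s * f for s, f in zip(row, fractions)) for row in signatures]

def estimate_fractions(signatures, bulk, steps=5000, lr=0.05):
    """Projected gradient descent for min ||S p - b||^2 subject to p >= 0,
    followed by rescaling so the fractions sum to one."""
    n_types = len(signatures[0])
    p = [1.0 / n_types] * n_types
    for _ in range(steps):
        residual = [m - b for m, b in zip(mix(signatures, p), bulk)]
        grad = [2.0 * sum(row[k] * r for row, r in zip(signatures, residual))
                for k in range(n_types)]
        # Gradient step, then projection onto the non-negative orthant.
        p = [max(0.0, pk - lr * gk) for pk, gk in zip(p, grad)]
    total = sum(p)
    return [pk / total for pk in p]

def aggregate(fractions, names, tree):
    """Sum child fractions to get each parent's fraction in the coarser layer."""
    by_name = dict(zip(names, fractions))
    return {parent: sum(by_name[child] for child in children)
            for parent, children in tree.items()}

# Hypothetical reference: rows are genes, columns are cell types.
# The two T cell columns are deliberately similar (highly correlated types).
cell_types = ["CD4 T", "CD8 T", "Monocyte"]
signatures = [
    [1.0, 0.9, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
    [0.5, 0.1, 0.3],
]

# Hypothetical two-layer tree: coarse types and their fine-layer children.
tree = {"T cell": ["CD4 T", "CD8 T"], "Myeloid": ["Monocyte"]}

# Simulate a noise-free bulk sample from known fractions, then recover them.
true_fractions = [0.5, 0.3, 0.2]
bulk = mix(signatures, true_fractions)
estimated = estimate_fractions(signatures, bulk)
coarse = aggregate(estimated, cell_types, tree)

print([round(f, 3) for f in estimated])
print({name: round(f, 3) for name, f in coarse.items()})
```

In this noise-free toy case the fine-layer estimates are recovered almost exactly, so the coarse layer is simply their sum. With real, noisy bulk data the correlated T cell subtypes would be hard to separate, while their coarse-layer total would be more stable; that gap is what motivates passing information between layers of the tree.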