Zhong Yan, Cui Siwei, Yang Yongjian, Cai James J
School of Statistics, KLATASDS-MOE, East China Normal University, Shanghai, China.
Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA.
Bioinformatics. 2024 Jul 17;40(7). doi: 10.1093/bioinformatics/btae457.
Understanding single-cell expression variability (scEV), or gene expression noise, among cells of the same type and state is crucial for delineating population-level cellular function. While epigenetic mechanisms are widely implicated in gene expression regulation, a definitive link between chromatin accessibility and scEV remains elusive. Recent advances in single-cell techniques enable the study of single-cell multiomics data in which scATAC-seq and scRNA-seq are measured simultaneously within individual cells, presenting an unprecedented opportunity to address this gap.
This paper introduces an innovative testing pipeline to investigate the association between chromatin accessibility and scEV. Given single-cell multiomics data with paired scATAC-seq and scRNA-seq measurements, the pipeline compares how well scATAC-seq data predict gene expression levels for highly variable genes (HVGs) versus non-highly variable genes (non-HVGs). Applying this pipeline to paired scATAC-seq and scRNA-seq data from human hematopoietic stem and progenitor cells, we observed significantly better prediction performance of scATAC-seq data for HVGs than for non-HVGs. Notably, there was substantial overlap between well-predicted genes and HVGs. The pathways enriched among well-predicted genes are highly pertinent to cell type-specific functions. Our findings support the notion that scEV largely stems from cell-to-cell variability in chromatin accessibility, providing compelling evidence for the epigenetic regulation of scEV and offering promising avenues for investigating gene regulation mechanisms at the single-cell level.
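The comparison at the core of the pipeline can be illustrated with a minimal sketch on synthetic data. This is not the authors' implementation: the abstract does not name the prediction model or the statistical test, so the ridge regression, the cross-validated Pearson correlation score, the variance-based HVG selection, and the permutation test below are all assumptions chosen for illustration, as are the function names `ridge_cv_correlation` and `permutation_test`. For simplicity, every gene is predicted from the same set of simulated peaks.

```python
import numpy as np

def ridge_cv_correlation(X, y, alpha=1.0, n_folds=5, seed=0):
    """Cross-validated Pearson correlation between ridge predictions and y.
    (Ridge regression is an assumed stand-in for the prediction model.)"""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    folds = np.array_split(rng.permutation(n), n_folds)
    preds = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        mu_x, mu_y = X[train].mean(axis=0), y[train].mean()
        Xc = X[train] - mu_x
        # Closed-form ridge solution on centered training data.
        w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ (y[train] - mu_y))
        preds[test_idx] = (X[test_idx] - mu_x) @ w + mu_y
    if preds.std() == 0 or y.std() == 0:
        return 0.0
    return float(np.corrcoef(preds, y)[0, 1])

def permutation_test(a, b, n_perm=1000, seed=0):
    """One-sided permutation p-value for mean(a) > mean(b) (assumed test)."""
    rng = np.random.default_rng(seed)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if perm[:len(a)].mean() - perm[len(a):].mean() >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Synthetic stand-in for paired data: rows are cells.
rng = np.random.default_rng(42)
n_cells, n_peaks, n_genes, n_driven = 200, 10, 40, 20

# Binary peak accessibility (scATAC-seq-like).
accessibility = (rng.random((n_cells, n_peaks)) < 0.3).astype(float)

# The first n_driven genes depend on accessibility; the rest are pure noise.
weights = rng.uniform(1.0, 2.0, size=(n_peaks, n_driven))
expression = np.empty((n_cells, n_genes))
expression[:, :n_driven] = accessibility @ weights + rng.normal(0, 0.5, (n_cells, n_driven))
expression[:, n_driven:] = rng.normal(0, 0.5, (n_cells, n_genes - n_driven))

# Assumed HVG definition for this toy: the top genes by expression variance.
n_hvg = 20
is_hvg = np.zeros(n_genes, dtype=bool)
is_hvg[np.argsort(expression.var(axis=0))[-n_hvg:]] = True

# Per-gene prediction score, then HVG versus non-HVG comparison.
scores = np.array([ridge_cv_correlation(accessibility, expression[:, g])
                   for g in range(n_genes)])
hvg_scores, non_hvg_scores = scores[is_hvg], scores[~is_hvg]
p_value = permutation_test(hvg_scores, non_hvg_scores)

print("mean score, HVGs:    ", hvg_scores.mean())
print("mean score, non-HVGs:", non_hvg_scores.mean())
print("permutation p-value: ", p_value)
```

In this toy setup the highly variable genes are, by construction, the ones driven by accessibility, so they are predicted better than the noise-only genes. On real data the HVG definition, the peak-to-gene assignment, and the model would all need to follow the choices documented in the paper's repository.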
The source code and data used in this paper can be found at https://github.com/SiweiCui/EpigeneticControlOfSingle-CellExpressionVariability.
Supplementary data are available at Bioinformatics online.