Sharma Rachesh, Pandey Neetesh, Mongia Aanchal, Mishra Shreya, Majumdar Angshul, Kumar Vibhor
Department of Electronic and Communication Engineering, Indraprastha Institute of Information Technology Delhi, Okhla Industrial Estate, Phase-III, New Delhi 110020, India.
Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, Okhla Industrial Estate, Phase-III, New Delhi 110020, India.
NAR Genom Bioinform. 2020 Nov 19;2(4):lqaa091. doi: 10.1093/nargab/lqaa091. eCollection 2020 Dec.
The advent of single-cell open-chromatin profiling technology has facilitated the analysis of heterogeneity in the activity of regulatory regions at single-cell resolution. However, stochasticity and the low amount of relevant DNA available cause a high drop-out rate and noise in single-cell open-chromatin profiles. We introduce a robust method called forest of imputation trees (FITs) to recover original signals from highly sparse and noisy single-cell open-chromatin profiles. FITs builds multiple imputation trees to avoid bias during the restoration of read-count matrices. It resolves the challenging issue of recovering open-chromatin signals without blurring out information at genomic sites with cell-type-specific activity. Besides visualization and classification, FITs-based imputation also improved accuracy in detecting enhancers, calculating pathway enrichment scores and predicting chromatin interactions. FITs is generalized for wider applicability, especially for highly sparse read-count matrices. The superiority of FITs in recovering signals of minority cells also makes it highly useful for single-cell open-chromatin profiles from samples. The software is freely available at https://reggenlab.github.io/FITs/.