Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China.
Department of Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China.
Brief Bioinform. 2023 Sep 22;24(6). doi: 10.1093/bib/bbad343.
Single-cell multiomics techniques are widely used to detect the key signatures of cells. These methods have achieved single-molecule resolution and can even reveal spatial localization, providing insight into genomic, epigenomic and transcriptomic heterogeneity in individual cells. However, they have also given rise to new computational challenges in data processing. Here, we describe the Single-cell Single-molecule multiple Omics Pipeline (ScSmOP), a universal pipeline for barcode-indexed single-cell single-molecule multiomics data analysis. ScSmOP uses C to implement spaced-seed hash table-based algorithms that identify barcodes in both ligation-based and synthesis-based barcoding data; barcode identification is followed by data mapping and deconvolution. We demonstrate that data processing is highly reproducible between ScSmOP and published pipelines in comprehensive analyses of single-cell omics data (scRNA-seq, scATAC-seq, scARC-seq), single-molecule chromatin interaction data (ChIA-Drop, SPRITE, RD-SPRITE), single-cell single-molecule chromatin interaction data (scSPRITE) and spatial transcriptomic data from various cell types and species. Additionally, ScSmOP processes data more rapidly and is a versatile, efficient, easy-to-use and robust pipeline for single-cell single-molecule multiomics data analysis.
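To make the idea of spaced-seed hash table-based barcode identification concrete, the following is a minimal sketch in Python. It is not taken from ScSmOP, whose implementation is in C and may differ in every detail. The whitelist, the two complementary seed masks, the one-mismatch tolerance and all function names are illustrative assumptions. The sketch relies on a simple observation: if a read barcode differs from a whitelist barcode at one position, at least one of two complementary masks avoids that position, so the masked key still matches exactly and a hash lookup retrieves the candidate.

```python
from collections import defaultdict

# Hypothetical whitelist of 8-nt barcodes (illustrative, not from ScSmOP).
WHITELIST = ["ACGTACGT", "TTGCAAGC", "GGATCCTA"]

# Two complementary spaced seeds: even positions and odd positions.
# A single mismatch can disturb only one of them.
MASKS = [(0, 2, 4, 6), (1, 3, 5, 7)]


def masked_key(seq, mask):
    """Extract the characters of seq at the positions listed in mask."""
    return "".join(seq[i] for i in mask)


def build_index(whitelist, masks):
    """Hash table mapping (mask id, masked key) to whitelist barcodes."""
    index = defaultdict(list)
    for barcode in whitelist:
        for mask_id, mask in enumerate(masks):
            index[(mask_id, masked_key(barcode, mask))].append(barcode)
    return index


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def identify(read_barcode, index, masks, length=8, max_mismatch=1):
    """Return the unique best whitelist barcode, or None if absent or ambiguous."""
    if len(read_barcode) != length:
        return None
    # Collect candidates that share at least one exact spaced-seed key.
    candidates = set()
    for mask_id, mask in enumerate(masks):
        key = (mask_id, masked_key(read_barcode, mask))
        candidates.update(index.get(key, ()))
    # Verify candidates by full Hamming distance.
    scored = sorted((hamming(b, read_barcode), b) for b in candidates)
    scored = [s for s in scored if s[0] <= max_mismatch]
    if not scored:
        return None
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # tie between two barcodes: treat as ambiguous
    return scored[0][1]


INDEX = build_index(WHITELIST, MASKS)
```

With this index, `identify("ACGTACGA", INDEX, MASKS)` recovers `"ACGTACGT"` despite the mismatch at the last base, while a read with two mismatches that hit both masks, such as `"ACGTACAA"`, is rejected. Tolerating more mismatches would require more masks, since each extra error can disturb one more seed.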