Yu Zhenhua, Du Fang, Song Lijuan
School of Information Engineering, Ningxia University, Yinchuan, China.
Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan, China.
Front Genet. 2022 Jan 27;13:823941. doi: 10.3389/fgene.2022.823941. eCollection 2022.
Single-cell DNA sequencing (scDNA-seq) enables high-resolution profiling of genetic diversity among single cells and is especially useful for deciphering intra-tumor heterogeneity and the evolutionary history of tumors. Technical issues such as allele dropout, false-positive errors, and doublets make scDNA-seq data incomplete and error-prone, which makes accurate inference of tumor clonal architecture a severe challenge. To address these issues, we introduce a new computational method called SCClone for inferring subclones from single nucleotide variation (SNV) data of single cells. Specifically, SCClone uses a probabilistic mixture model for binary data to cluster single cells into distinct subclones. To decipher the underlying clonal composition accurately, a novel model selection scheme based on inter-cluster variance is employed to find the optimal number of subclones. Extensive evaluations on various simulated datasets suggest that SCClone is robust to the different kinds of technical noise in scDNA-seq data and outperforms state-of-the-art methods in inferring clonal composition. Further evaluations on three real scDNA-seq datasets show that SCClone can effectively recover the underlying subclones from severely corrupted data. The SCClone software is freely available at https://github.com/qasimyu/scclone.
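To make the idea of a mixture model for binary data concrete, the following is a minimal sketch of a generic Bernoulli mixture fitted with expectation-maximization on a cell-by-site 0/1 mutation matrix. It is not SCClone's implementation: the function name `fit_bernoulli_mixture`, its parameters, and the toy matrix are all assumptions made for illustration, and the inter-cluster-variance model selection scheme is not reproduced, so the number of clusters is fixed by hand.

```python
import numpy as np


def fit_bernoulli_mixture(X, n_clusters, n_iter=100, seed=0, eps=1e-9):
    """Cluster the binary rows of X with EM for a Bernoulli mixture.

    Hypothetical helper for illustration only; not part of SCClone.
    Returns hard labels, per-cluster mutation probabilities, mixing weights.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_cells, n_sites = X.shape
    weights = np.full(n_clusters, 1.0 / n_clusters)
    # Random initial mutation probabilities, kept away from 0 and 1.
    theta = rng.uniform(0.25, 0.75, size=(n_clusters, n_sites))

    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each cell,
        # computed in log space for numerical stability.
        log_lik = X @ np.log(theta).T + (1.0 - X) @ np.log(1.0 - theta).T
        log_post = log_lik + np.log(weights)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate mixing weights and mutation probabilities.
        nk = resp.sum(axis=0)
        weights = np.clip(nk / n_cells, eps, None)
        weights /= weights.sum()
        theta = np.clip((resp.T @ X) / (nk[:, None] + eps), eps, 1.0 - eps)

    return resp.argmax(axis=1), theta, weights


# Toy data (assumed): two subclones of 10 cells each over 8 SNV sites.
clone_a = np.array([1, 1, 1, 1, 0, 0, 0, 0])
clone_b = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X = np.vstack([np.tile(clone_a, (10, 1)), np.tile(clone_b, (10, 1))])
# Two mutations flipped to 0 to mimic dropout-style false negatives.
X[0, 0] = 0
X[12, 5] = 0

labels, theta, weights = fit_bernoulli_mixture(X, n_clusters=2)
print(labels)
```

Each row of `theta` can be read as the estimated probability of observing a mutation at each site for cells of that cluster. A fuller treatment would also model doublets and choose the number of clusters from the data, which this sketch leaves out.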