Akimov Yevhen, Bulanova Daria, Timonen Sanna, Wennerberg Krister, Aittokallio Tero
Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland.
Biotech Research and Innovation Centre (BRIC) and Novo Nordisk Foundation Center for Stem Cell Biology (DanStem), University of Copenhagen, Copenhagen, Denmark.
Mol Syst Biol. 2020 Mar;16(3):e9195. doi: 10.15252/msb.20199195.
Cellular DNA barcoding has become a popular approach to study heterogeneity of cell populations and to identify clones with differential response to cellular stimuli. However, there is a lack of reliable methods for statistical inference of differentially responding clones. Here, we used mixtures of DNA-barcoded cell pools to generate a realistic benchmark read count dataset for modelling a range of outcomes of clone-tracing experiments. By accounting for the statistical properties intrinsic to the DNA barcode read count data, we implemented an improved algorithm that results in a significantly lower false-positive rate, compared to current RNA-seq data analysis algorithms, especially when detecting differentially responding clones in experiments with strong selection pressure. Building on the reliable statistical methodology, we illustrate how multidimensional phenotypic profiling enables one to deconvolute phenotypically distinct clonal subpopulations within a cancer cell line. The mixture control dataset and our analysis results provide a foundation for benchmarking and improving algorithms for clone-tracing experiments.