Department of Genetics, Yale University, New Haven, CT, USA.
Computational Biology & Bioinformatics Program, Yale University, New Haven, CT, USA.
Nat Biotechnol. 2021 May;39(5):619-629. doi: 10.1038/s41587-020-00803-5. Epub 2021 Feb 8.
Current methods for comparing single-cell RNA sequencing datasets collected in multiple conditions focus on discrete regions of the transcriptional state space, such as clusters of cells. Here we quantify the effects of perturbations at the single-cell level using a continuous measure of the effect of a perturbation across the transcriptomic space. We describe this space as a manifold and develop a relative likelihood estimate of observing each cell in each of the experimental conditions using graph signal processing. This likelihood estimate can be used to identify cell populations specifically affected by a perturbation. We also develop vertex frequency clustering to extract populations of affected cells at the level of granularity that matches the perturbation response. The accuracy of our algorithm at identifying clusters of cells that are enriched or depleted in each condition is, on average, 57% higher than the next-best-performing algorithm tested. Gene signatures derived from these clusters are more accurate than those of six alternative algorithms in ground truth comparisons.
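The relative likelihood idea can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a symmetric k-nearest-neighbor graph as a stand-in for the manifold and swaps in a simple lazy random-walk average as the low-pass graph filter. The function names (`knn_graph`, `smooth`, `relative_likelihood`) and the parameters `k`, `steps`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math


def knn_graph(points, k):
    """Return a symmetric k-nearest-neighbor adjacency list over the points."""
    n = len(points)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        for j in order[:k]:
            neighbors[i].add(j)
            neighbors[j].add(i)  # symmetrize the graph
    return neighbors


def smooth(signal, neighbors, steps=10, alpha=0.5):
    """Low-pass filter a graph signal by repeated lazy random-walk averaging.

    Assumption: this simple diffusion stands in for a proper graph filter.
    """
    for _ in range(steps):
        signal = [
            (1 - alpha) * signal[i]
            + alpha * sum(signal[j] for j in neighbors[i]) / len(neighbors[i])
            for i in range(len(signal))
        ]
    return signal


def relative_likelihood(points, labels, k=5, steps=10):
    """Per-cell relative likelihood of each condition (values sum to 1 per cell)."""
    neighbors = knn_graph(points, k)
    conditions = sorted(set(labels))
    densities = []
    for c in conditions:
        count = labels.count(c)
        # Indicator signal, normalized so each condition carries total mass 1
        indicator = [1.0 / count if lab == c else 0.0 for lab in labels]
        densities.append(smooth(indicator, neighbors, steps))
    result = []
    for i in range(len(points)):
        total = sum(d[i] for d in densities)
        result.append({c: d[i] / total for c, d in zip(conditions, densities)})
    return result


# Toy data: 20 "cells" along a line, control on the left, treated on the right
cells = [(float(i), 0.0) for i in range(20)]
conditions = ["control"] * 10 + ["treated"] * 10
for i, row in enumerate(relative_likelihood(cells, conditions, k=2)):
    print(i, round(row["treated"], 3))
```

In this toy example the treated likelihood varies continuously along the line instead of jumping at a cluster boundary, which is the kind of per-cell, continuous measure the abstract describes.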