Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA.
Cell Rep Methods. 2022 Jan 24;2(1). doi: 10.1016/j.crmeth.2021.100135. Epub 2022 Jan 4.
Visualizing low-dimensional representations with scatterplots is a crucial step in analyzing single-cell genomic data. However, this visualization has two significant biases. The first arises when visualizing gene expression levels or cell identities: where points overlap, the scatterplot shows only the cells plotted last, and the cells plotted earlier are masked and unseen. The second arises when comparing cell-type compositions across samples: the scatterplot is biased by the unbalanced total number of cells across samples. We developed SCUBI, an unbiased method that visualizes the aggregated information of cells within non-overlapping squares to address the first bias and visualizes the differences of cell proportions across samples to address the second bias. We show that SCUBI presents a more faithful visual representation of the information in a real single-cell RNA sequencing (RNA-seq) dataset and has the potential to change how low-dimensional representations are visualized in single-cell genomic data.
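The two ideas in the abstract can be illustrated with a minimal sketch. This is not the SCUBI implementation; it only assumes that "aggregated information" can be taken as the mean value of the cells falling in each square, and that the cross-sample comparison subtracts per-square cell proportions. The function names (`square_bins`, `mean_per_square`, `proportion_difference`) and the `n_bins` parameter are hypothetical and chosen for illustration.

```python
import numpy as np

def square_bins(x, y, n_bins=20):
    """Assign each cell to one of n_bins x n_bins non-overlapping squares."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # One side length for both axes so that the bins are squares, not rectangles.
    side = max(x.max() - x.min(), y.max() - y.min()) / n_bins
    if side == 0:
        side = 1.0  # all cells share one coordinate; put them in a single square
    ix = np.minimum(((x - x.min()) / side).astype(int), n_bins - 1)
    iy = np.minimum(((y - y.min()) / side).astype(int), n_bins - 1)
    return ix, iy

def mean_per_square(x, y, values, n_bins=20):
    """Mean of `values` over all cells in each square (NaN for empty squares).

    Every cell contributes to its square, so plotting order cannot mask any cell.
    """
    ix, iy = square_bins(x, y, n_bins)
    total = np.zeros((n_bins, n_bins))
    count = np.zeros((n_bins, n_bins))
    np.add.at(total, (ix, iy), np.asarray(values, dtype=float))
    np.add.at(count, (ix, iy), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

def proportion_difference(x, y, sample, a, b, n_bins=20):
    """Per-square proportion of sample `a` cells minus that of sample `b` cells.

    Counts are divided by each sample's own total, so a sample with more
    cells overall does not dominate the comparison.
    """
    ix, iy = square_bins(x, y, n_bins)
    sample = np.asarray(sample)
    props = {}
    for s in (a, b):
        mask = sample == s
        count = np.zeros((n_bins, n_bins))
        np.add.at(count, (ix[mask], iy[mask]), 1)
        props[s] = count / mask.sum()
    return props[a] - props[b]
```

For example, `mean_per_square(umap_x, umap_y, gene_expression)` would return a grid that can be drawn as a heatmap in place of the scatterplot, where `umap_x`, `umap_y`, and `gene_expression` stand for per-cell arrays supplied by the user.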