European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.
Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK.
Genome Biol. 2021 Dec 6;22(1):333. doi: 10.1186/s13059-021-02548-z.
scRNA-seq datasets are increasingly used to identify gene panels that can be probed using alternative technologies, such as spatial transcriptomics, where choosing the best subset of genes is vital. Existing methods are limited by a reliance on pre-existing cell type labels or by difficulties in identifying markers of rare cells. We introduce an iterative approach, geneBasis, for selecting an optimal gene panel, where each newly added gene captures the maximum distance between the true manifold and the manifold constructed using the currently selected gene panel. Our approach outperforms existing strategies and can resolve cell types and subtle cell state differences.
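The iterative selection described above can be illustrated with a minimal sketch. This is not the authors' geneBasis implementation; it is a simplified greedy loop under stated assumptions: the "true manifold" is approximated by a k-nearest-neighbour graph built on all genes, the "selected" manifold by a k-nearest-neighbour graph built on the current panel, and each gene is scored by how much worse its expression is predicted from neighbours in the selected graph than in the true graph. The function names (`knn_indices`, `neighbour_error`, `select_panel`), the Euclidean metric, the default `k`, and the choice of the most variable gene as the starting gene are all hypothetical choices made for illustration.

```python
import numpy as np

def knn_indices(X, k):
    """Return, for each cell (row), the indices of its k nearest cells (Euclidean)."""
    sq = np.sum(X ** 2, axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared pairwise distances
    np.fill_diagonal(d, np.inf)                      # a cell is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def neighbour_error(X, nbrs):
    """Per-gene L2 distance between observed expression and the neighbour average."""
    predicted = X[nbrs].mean(axis=1)                 # shape: (cells, genes)
    return np.linalg.norm(X - predicted, axis=0)     # shape: (genes,)

def select_panel(X, n_genes, k=5):
    """Greedy panel selection on a cells-by-genes matrix X (assumed log-normalised)."""
    n_genes = min(n_genes, X.shape[1])
    # Reference error: neighbours taken from the graph built on all genes.
    true_err = neighbour_error(X, knn_indices(X, k))
    # Assumption: start from the most variable gene.
    selected = [int(np.argmax(X.var(axis=0)))]
    while len(selected) < n_genes:
        # Graph built only on the currently selected genes.
        nbrs = knn_indices(X[:, selected], k)
        # Genes whose expression the current panel reconstructs worst score highest.
        gain = neighbour_error(X, nbrs) - true_err
        gain[selected] = -np.inf                     # never pick the same gene twice
        selected.append(int(np.argmax(gain)))
    return selected

# Toy usage on random data (hypothetical input, for illustration only).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(60, 12))
panel = select_panel(X_toy, n_genes=4)
print(panel)
```

The dense pairwise distance matrix keeps the sketch short but scales quadratically with the number of cells; a real dataset would need an approximate nearest-neighbour search instead.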