Department of Bioengineering, Stanford University, Stanford, United States.
Department of Applied Physics, Stanford University, Stanford, United States.
eLife. 2019 Sep 16;8:e48994. doi: 10.7554/eLife.48994.
Single-cell RNA sequencing has spurred the development of computational methods that enable researchers to classify cell types, delineate developmental trajectories, and measure molecular responses to external perturbations. Many of these methods rely on their ability to detect genes whose cell-to-cell variations arise from the biological processes of interest rather than from transcriptional or technical noise. However, for datasets in which the biologically relevant differences between cells are subtle, identifying these genes is challenging. We present the self-assembling manifold (SAM) algorithm, an iterative soft feature selection strategy to quantify gene relevance and improve dimensionality reduction. We demonstrate its advantages over other state-of-the-art methods with experimental validation in identifying novel stem cell populations of Schistosoma mansoni, a prevalent parasite that infects hundreds of millions of people. Extending our analysis to a total of 56 datasets, we show that SAM is generalizable and consistently outperforms other methods in a variety of biological and quantitative benchmarks.
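To make "iterative soft feature selection" concrete, here is a minimal sketch of one way such a loop could look. It is not the authors' implementation: the dispersion score (variance over mean of neighbourhood-averaged expression), the Euclidean nearest-neighbour search, the SVD-based projection, and all names and defaults (`soft_feature_weights`, `knn_indices`, `k`, `n_pcs`, `n_iter`) are assumptions chosen for illustration. The input `X` is assumed to be a non-negative cells-by-genes expression matrix.

```python
import numpy as np


def knn_indices(Z, k):
    """Indices of the k nearest neighbours (excluding self) for each row of Z."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                    # a cell is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]


def soft_feature_weights(X, k=10, n_pcs=5, n_iter=10, seed=0):
    """Illustrative iterative soft feature weighting (hypothetical, simplified).

    X: non-negative array of shape (n_cells, n_genes).
    Returns (weights, Z): per-gene weights in [0, 1] and the final
    low-dimensional cell coordinates.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape

    # Assumption: start from a random neighbour graph.
    nbrs = np.stack(
        [rng.choice(n_cells, size=k, replace=False) for _ in range(n_cells)]
    )
    weights = np.ones(n_genes)
    Z = None

    for _ in range(n_iter):
        # Average each gene over every cell's current neighbourhood.
        smoothed = X[nbrs].mean(axis=1)  # shape (n_cells, n_genes)

        # Genes that still vary after averaging vary *along* the graph,
        # so they receive a high dispersion score.
        mean = smoothed.mean(axis=0)
        var = smoothed.var(axis=0)
        dispersion = var / (mean + 1e-12)

        # Soft selection: rescale scores to [0, 1] instead of thresholding.
        top = dispersion.max()
        weights = dispersion / top if top > 0 else np.ones(n_genes)

        # Project the weighted, centred data onto its leading components.
        Xw = (X - X.mean(axis=0)) * weights
        U, S, _ = np.linalg.svd(Xw, full_matrices=False)
        Z = U[:, :n_pcs] * S[:n_pcs]

        # Rebuild the neighbour graph in the new low-dimensional space.
        nbrs = knn_indices(Z, k)

    return weights, Z
```

Because the weights are continuous rather than a hard gene list, weakly informative genes are down-weighted instead of discarded, and each pass lets the refined graph and the refined weights inform one another.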