

Supervised dimensionality reduction for exploration of single-cell data by HSS-LDA.

Author information

Amouzgar Meelad, Glass David R, Baskar Reema, Averbukh Inna, Kimmey Samuel C, Tsai Albert G, Hartmann Felix J, Bendall Sean C

Affiliations

Department of Pathology, Stanford University, Stanford, CA, USA.

Immunology Graduate Program, Stanford University, Stanford, CA, USA.

Publication information

Patterns (N Y). 2022 Jun 24;3(8):100536. doi: 10.1016/j.patter.2022.100536. eCollection 2022 Aug 12.

Abstract

Single-cell technologies generate large, high-dimensional datasets encompassing a diversity of omics. Dimensionality reduction captures the structure and heterogeneity of the original dataset, creating low-dimensional visualizations that contribute to the human understanding of data. Existing algorithms are typically unsupervised, using measured features to generate manifolds, disregarding known biological labels such as cell type or experimental time point. We repurpose the classification algorithm, linear discriminant analysis (LDA), for supervised dimensionality reduction of single-cell data. LDA identifies linear combinations of predictors that optimally separate classes, enabling the study of specific aspects of cellular heterogeneity. We implement feature selection by hybrid subset selection (HSS) and demonstrate that this computationally efficient approach generates non-stochastic, interpretable axes amenable to diverse biological processes such as differentiation over time and cell cycle. We benchmark HSS-LDA against several popular dimensionality-reduction algorithms and illustrate its utility and versatility for the exploration of single-cell mass cytometry, transcriptomics, and chromatin accessibility data.
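To make the idea of "linear combinations of predictors that optimally separate classes" concrete, the following is a minimal sketch of the two-class, two-feature special case of LDA (Fisher's discriminant), written in plain Python. It is not the authors' HSS-LDA implementation and does not include hybrid subset selection. The function names and the toy marker intensities are hypothetical, chosen only for illustration.

```python
# Toy two-class Fisher LDA on two features.
# Illustrative only: hypothetical data and names, not the HSS-LDA code.

def mean(rows):
    """Per-feature mean of a list of 2-feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mu):
    """2x2 scatter matrix: sum of outer products of mean-centered rows."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Return w = Sw^{-1} (mu_b - mu_a), the discriminant axis."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_b[0] - mu_a[0], mu_b[1] - mu_a[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(rows, w):
    """Project each row onto the discriminant axis w."""
    return [r[0] * w[0] + r[1] * w[1] for r in rows]


# Hypothetical intensities of two markers for two labeled cell populations.
cells_a = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.1)]
cells_b = [(3.0, 0.5), (3.5, 0.8), (2.8, 0.4), (3.2, 0.9)]

w = fisher_direction(cells_a, cells_b)
proj_a = project(cells_a, w)
proj_b = project(cells_b, w)
print("axis weights:", w)
print("class separated on axis:", min(proj_b) > max(proj_a))
```

The weights in `w` can be read directly as the contribution of each marker to the separating axis, and the computation involves no random initialization, which is the sense in which such axes are interpretable and non-stochastic. With more than two classes or features, the same idea generalizes to an eigenvalue problem on the between-class and within-class scatter matrices.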


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0002/9403402/1562dcbb63f6/fx1.jpg
