A clustering method for small scRNA-seq data based on subspace and weighted distance.

Author information

Hunan Engineering & Technology Research Centre for Agricultural Big Data Analysis & Decision-Making, Hunan Agricultural University, Changsha, Hunan, China.

Hunan Agricultural University, College of Information and Intelligence, Changsha, Hunan, China.

Publication information

PeerJ. 2023 Jan 23;11:e14706. doi: 10.7717/peerj.14706. eCollection 2023.

Abstract

BACKGROUND

Identifying cell types with unsupervised methods is essential for scRNA-seq research. However, conventional similarity measures make single-cell data clustering challenging because the data are high-dimensional, noisy, and subject to high dropout.

METHODS

We proposed a clustering method for small scRNA-seq data based on subspace and weighted distance (SSWD). It follows the assumption that sets of gene subspaces composed of genes with similar density distributions can better distinguish cell groups. To accurately capture the intrinsic relationships among cells or genes, we proposed a new distance metric that combines Euclidean and Pearson distances through a weighting strategy. The relative Calinski-Harabasz (CH) index, rather than the CH index itself, was used to estimate the number of clusters because it is comparable across degrees of freedom.
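The idea of blending Euclidean and Pearson distances can be illustrated with a minimal sketch. The abstract does not specify the weighting strategy, so the simple convex combination below, along with the names `weighted_distance`, `weight`, and `euclid_scale`, is a hypothetical illustration and not the SSWD implementation.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two expression vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """1 - Pearson correlation, ranging from 0 (r = 1) to 2 (r = -1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        # Correlation is undefined for a constant vector.
        # Assumption: treat such a pair as uncorrelated (distance 1).
        return 1.0
    return 1.0 - cov / (norm_x * norm_y)


def weighted_distance(x, y, weight=0.5, euclid_scale=1.0):
    """Convex combination of scaled Euclidean and Pearson distances.

    weight       -- share given to the Euclidean term, in [0, 1] (assumed).
    euclid_scale -- divisor that brings the Euclidean term onto a scale
                    comparable to the Pearson term (assumed, e.g. the
                    largest pairwise Euclidean distance in the dataset).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if euclid_scale <= 0:
        raise ValueError("euclid_scale must be positive")
    euclid = euclidean_distance(x, y) / euclid_scale
    pearson = pearson_distance(x, y)
    return weight * euclid + (1.0 - weight) * pearson


# Two toy "cells" with the same expression pattern at different magnitudes.
cell_a = [1.0, 2.0, 3.0]
cell_b = [2.0, 4.0, 6.0]
print(weighted_distance(cell_a, cell_b, weight=0.5))
```

In this toy case the two vectors are perfectly correlated but differ in magnitude, so the Pearson term is close to zero while the Euclidean term is not. The weight decides how much each view contributes, which is the general motivation for combining the two measures.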

RESULTS

We compared SSWD with seven prevailing methods on eight publicly available scRNA-seq datasets. The experimental results show that SSWD achieves better clustering accuracy and better partitioning of cell groups. SSWD can be downloaded at https://github.com/ningzilan/SSWD.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec0/9879162/92249676dced/peerj-11-14706-g001.jpg
