Single-Cell Clustering Based on Shared Nearest Neighbor and Graph Partitioning.

Affiliations

Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China.

School of Computer Science and Engineering, Yulin Normal University, Yulin, 537000, Guangxi, China.

Publication information

Interdiscip Sci. 2020 Jun;12(2):117-130. doi: 10.1007/s12539-019-00357-4. Epub 2020 Feb 22.

DOI:10.1007/s12539-019-00357-4

PMID:32086753

Abstract

Clustering single-cell RNA sequencing (scRNA-seq) data enables the discovery of cell subtypes, which helps in understanding and analyzing disease processes. Determining edge weights is an essential component of graph-based clustering methods. Although several graph-based clustering algorithms for scRNA-seq data have been proposed, they are generally based on k-nearest neighbors (KNN) and shared nearest neighbors (SNN) and do not consider the structural information of the graph. Here, to improve clustering accuracy, we present a novel method for single-cell clustering, called structural shared nearest neighbor-Louvain (SSNN-Louvain), which integrates the structural information of the graph with module detection. In SSNN-Louvain, the weight of an edge is defined from the distance between a node and its shared nearest neighbors, together with the ratio of the number of shared nearest neighbors to the number of nearest neighbors, thus incorporating the structural information of the graph. A modified Louvain community detection algorithm is then proposed and applied to identify modules in the graph; each community essentially represents a cell subtype. The proposed method combines the advantages of the SNN graph and community detection without requiring any parameter to be tuned other than the number of neighbors. To test the performance of SSNN-Louvain, we compare it with five existing methods on 16 real datasets: nonnegative matrix factorization, single-cell interpretation via multi-kernel learning, SNN-Cliq, Seurat, and PhenoGraph. The experimental results show that our approach achieves the best average performance on these datasets.
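The ratio-based edge weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact SSNN weight formula, so the code below is an assumption: it uses only the ratio of shared nearest neighbors to the number of nearest neighbors k and omits the distance term. The function names `knn_sets` and `snn_ratio_weights` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_ratio_weights(points, k):
    """Weight edge (i, j) by |kNN(i) & kNN(j)| / k.

    Simplified stand-in for the SSNN weight: the paper's definition also
    uses distances to the shared neighbors, which are ignored here.
    """
    nn = knn_sets(points, k)
    weights = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(nn[i] & nn[j])
            if shared > 0:  # keep only pairs with at least one shared neighbor
                weights[(i, j)] = shared / k
    return weights


# Toy "cells": two well-separated groups of three points each.
cells = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
         (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
weights = snn_ratio_weights(cells, k=2)
```

On this toy input, edges appear only between points of the same group, so any community detection step run on the weighted graph would be expected to separate the two groups. A standard Louvain implementation could be applied to these weights as a follow-up step, but that would be the unmodified algorithm, not the modified Louvain variant the abstract describes.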

