Computer Science and Engineering, Tezpur University, Assam, 784028, India.
Computer Science, College of Engineering and Applied Science, University of Colorado, Colorado Springs, CO, 80933-7150, USA.
Comput Biol Med. 2021 Oct;137:104820. doi: 10.1016/j.compbiomed.2021.104820. Epub 2021 Sep 3.
Single-cell RNA sequencing (scRNA-seq) data analysis opens new possibilities for identifying novel cells, characterizing known cells in detail, and studying cell heterogeneity. The performance of most clustering methods developed specifically for scRNA-seq data depends heavily on user input. We propose a centrality-based clustering method named UICPC and compare its performance with 9 state-of-the-art clustering methods on 11 real-world scRNA-seq datasets to demonstrate its effectiveness and usefulness in discovering cell groups. Our method requires no user input. It does rely on threshold settings, but these were benchmarked through extensive experiments. We observe that most of the compared approaches perform poorly because of high heterogeneity and large dataset dimensions, whereas UICPC performs excellently in terms of NMI, Purity, and ARI. UICPC is available as an R package and can be downloaded from https://sites.google.com/view/hussinchowdhury/software.
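The abstract reports results in terms of NMI, Purity, and ARI without defining them. As an illustration only, the following is a minimal sketch of the standard textbook definitions of these three scores for comparing predicted cluster labels against known cell-type labels. It is written in Python for brevity and is not part of the UICPC R package; the function names `purity`, `nmi`, and `ari` are hypothetical, and normalizing NMI by the arithmetic mean of the two entropies is an assumption, since the abstract does not say which NMI variant was used.

```python
from collections import Counter
from math import comb, log


def purity(truth, pred):
    """Fraction of cells that belong to the majority true class of their cluster."""
    by_cluster = {}
    for t, p in zip(truth, pred):
        by_cluster.setdefault(p, Counter())[t] += 1
    return sum(max(c.values()) for c in by_cluster.values()) / len(truth)


def nmi(truth, pred):
    """Mutual information divided by the mean of the two label entropies.

    Assumption: arithmetic-mean normalization; other variants exist.
    """
    n = len(truth)
    joint = Counter(zip(truth, pred))
    ct, cp = Counter(truth), Counter(pred)
    mi = sum(
        nij / n * log(n * nij / (ct[t] * cp[p]))
        for (t, p), nij in joint.items()
    )
    h_t = -sum(c / n * log(c / n) for c in ct.values())
    h_p = -sum(c / n * log(c / n) for c in cp.values())
    denom = (h_t + h_p) / 2
    # Both partitions trivial (a single group each): treat as identical.
    return mi / denom if denom > 0 else 1.0


def ari(truth, pred):
    """Adjusted Rand index: pair-counting agreement corrected for chance."""
    n = len(truth)
    if n < 2:
        return 1.0  # no pairs to compare
    joint = Counter(zip(truth, pred))
    sum_ij = sum(comb(v, 2) for v in joint.values())
    sum_t = sum(comb(v, 2) for v in Counter(truth).values())
    sum_p = sum(comb(v, 2) for v in Counter(pred).values())
    expected = sum_t * sum_p / comb(n, 2)
    max_index = (sum_t + sum_p) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions trivial
    return (sum_ij - expected) / (max_index - expected)


if __name__ == "__main__":
    # Toy labels (made up for illustration): six cells, two true cell types.
    true_labels = [0, 0, 0, 1, 1, 1]
    cluster_labels = [0, 0, 1, 1, 1, 1]
    print("Purity:", purity(true_labels, cluster_labels))
    print("NMI:   ", nmi(true_labels, cluster_labels))
    print("ARI:   ", ari(true_labels, cluster_labels))
```

All three scores reach 1 for a clustering that matches the reference labels up to renaming. Purity alone can be inflated by producing many small clusters, which is one reason it is usually reported together with NMI and ARI.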