Wu Hao, Mao Disheng, Zhang Yuping, Chi Zhiyi, Stitzel Michael, Ouyang Zhengqing
Department of Statistics, University of Connecticut, 215 Glenbrook Rd., Storrs, CT 06269, USA.
The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA.
NAR Genom Bioinform. 2021 Jan 12;3(1):lqaa087. doi: 10.1093/nargab/lqaa087. eCollection 2021 Mar.
Traditional bulk RNA sequencing of human pancreatic islets mainly reflects the transcriptional response of the major cell types. Single-cell RNA sequencing (scRNA-seq) characterizes transcription in individual cells and thus makes it possible to detect cell types and subtypes. Because scRNA-seq data are heterogeneous, discovering cell types requires a clustering method that is both powerful and suited to the data. In this paper, we propose a new clustering framework that combines a graph-based model with several types of dissimilarity measures. We take the compositional nature of scRNA-seq data into account by applying log-ratio transformations. We demonstrate the practical merit of the proposed method by applying it to centered log-ratio-transformed scRNA-seq data from human pancreatic islets and by comparing it with existing single-cell clustering methods. The R package for the proposed method is available at https://github.com/Zhang-Data-Science-Research-Lab/LrSClust.
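As an illustration of the centered log-ratio (CLR) transformation mentioned above, the following is a minimal Python sketch, not the LrSClust implementation. For a cell with counts x_1, ..., x_D, the CLR maps each x_j to log(x_j) minus the mean of log(x_1), ..., log(x_D). The function names `clr` and `aitchison_distance`, the `pseudocount` parameter (an assumed way to handle zero counts), and the toy count matrix are all hypothetical. The Euclidean distance between CLR vectors is shown only as one example of a dissimilarity measure; the abstract does not state which measures the framework uses.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one cell's count vector.

    The pseudocount is an assumption of this sketch: it keeps the
    logarithm defined when a gene has zero counts.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log is equivalent to dividing each
    # component by the geometric mean before taking logs.
    return [v - mean_log for v in logs]


def aitchison_distance(a, b, pseudocount=1.0):
    """Euclidean distance between the CLR vectors of two cells."""
    ca = clr(a, pseudocount)
    cb = clr(b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))


# Toy data (made up for illustration): rows are cells, columns are genes.
cells = [
    [10, 0, 5, 85],
    [20, 0, 10, 170],
    [60, 30, 5, 5],
]

# Pairwise dissimilarity matrix, which could serve as edge weights
# when building a cell-to-cell graph for clustering.
dissimilarity = [
    [aitchison_distance(a, b) for b in cells] for a in cells
]
```

Each CLR vector sums to zero, and without a pseudocount the transform is unchanged when a cell's counts are multiplied by a constant, which is why it suits data where only relative abundances are informative.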