HGC: fast hierarchical clustering for large-scale single-cell data.

Author information

MOE Key Laboratory of Bioinformatics, Division of Bioinformatics, BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China.

School of Life Sciences, Tsinghua University, Beijing 100084, China.

Publication information

Bioinformatics. 2021 Nov 5;37(21):3964-3965. doi: 10.1093/bioinformatics/btab420.

Abstract

SUMMARY

Clustering is a key step in revealing heterogeneities in single-cell data. Most existing single-cell clustering methods output a fixed number of clusters without the hierarchical information. Classical hierarchical clustering (HC) provides dendrograms of cells, but cannot scale to large datasets due to high computational complexity. We present HGC, a fast Hierarchical Graph-based Clustering tool to address both problems. It combines the advantages of graph-based clustering and HC. On the shared nearest-neighbor graph of cells, HGC constructs the hierarchical tree with linear time complexity. Experiments showed that HGC enables multiresolution exploration of the biological hierarchy underlying the data, achieves state-of-the-art accuracy on benchmark data and can scale to large datasets.
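To make the idea of building a tree on a shared nearest-neighbor (SNN) graph concrete, the following is a minimal Python sketch, not the HGC implementation. The abstract does not state HGC's merge criterion or how it reaches linear time, so this sketch makes its own assumptions: a brute-force k-nearest-neighbor search, Jaccard overlap of neighbor sets as the SNN edge weight, and a greedy average-linkage merge over that graph. The function names `knn_sets`, `snn_graph` and `agglomerate` are hypothetical, and the sketch is far slower than linear time.

```python
from itertools import combinations
from math import dist


def knn_sets(points, k):
    """For each point, return the set holding itself and its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: dist(p, points[j]))
        neighbours.append(set(order[:k]) | {i})
    return neighbours


def snn_graph(points, k):
    """Assumed SNN weighting: Jaccard overlap of the two neighbour sets."""
    nbrs = knn_sets(points, k)
    weights = {}
    for i, j in combinations(range(len(points)), 2):
        shared = len(nbrs[i] & nbrs[j])
        if shared:  # keep only pairs that share at least one neighbour
            weights[(i, j)] = shared / len(nbrs[i] | nbrs[j])
    return weights


def agglomerate(n, weights):
    """Greedy average linkage on the graph (an assumption, not HGC's criterion).

    Repeatedly merges the two clusters with the highest mean edge weight
    between them and records each merge as (members_a, members_b).
    """
    clusters = {i: [i] for i in range(n)}
    merges = []
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for a, b in combinations(sorted(clusters), 2):
            total = sum(weights.get((min(u, v), max(u, v)), 0.0)
                        for u in clusters[a] for v in clusters[b])
            score = total / (len(clusters[a]) * len(clusters[b]))
            if score > best:
                best, best_pair = score, (a, b)
        a, b = best_pair
        merges.append((sorted(clusters[a]), sorted(clusters[b])))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Toy data: two well-separated groups of three "cells" in 2-D.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
merges = agglomerate(len(points), snn_graph(points, k=2))
for left, right in merges:
    print(left, "+", right)
```

Reading the merge list from last to first gives the coarsest split first, which is what allows the same tree to be cut at several resolutions. On the toy data the final merge joins the two groups of three points.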

AVAILABILITY AND IMPLEMENTATION

The R package of HGC is available at https://bioconductor.org/packages/HGC/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

