Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA.
Boston Children's Hospital, Boston, MA, 02115, USA.
BMC Bioinformatics. 2020 Apr 25;21(1):158. doi: 10.1186/s12859-020-3482-1.
With the rapid development of single-cell RNA sequencing technology, it is now possible to dissect cell-type composition at high resolution. A number of methods have been developed to identify rare cell types. However, existing methods do not scale to large datasets, which limits their utility. To overcome this limitation, we present a new software package, GiniClust3, an extension of GiniClust2 that is significantly faster and more memory-efficient than previous versions.
GiniClust3 takes only about 7 h to identify both common and rare cell clusters in a dataset that contains more than one million cells. Cell type mapping and perturbation analyses show that GiniClust3 robustly identifies cell clusters.
Taken together, these results suggest that GiniClust3 is a powerful tool for identifying both common and rare cell populations and that it can handle large datasets. GiniClust3 is implemented as an open-source Python package and is available at https://github.com/rdong08/GiniClust3.