Model-based deep embedding for constrained clustering analysis of single cell RNA-seq data.

Affiliations

Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA.

Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA.

Publication information

Nat Commun. 2021 Mar 25;12(1):1873. doi: 10.1038/s41467-021-22008-3.

Abstract

Clustering is a critical step in single-cell studies. Most existing methods perform unsupervised clustering and do not exploit prior domain knowledge. Given the high dimensionality and pervasive dropout events of scRNA-seq data, purely unsupervised methods may not produce biologically interpretable clusters, which complicates cell type assignment. In such cases, the user's only recourse is to tweak clustering parameters manually and repeatedly until acceptable clusters emerge. Consequently, the path to biologically meaningful clusters can be ad hoc and laborious. Here we report a principled clustering method, scDCC, that integrates domain knowledge into the clustering step. Experiments on various scRNA-seq datasets ranging from thousands to tens of thousands of cells show that scDCC can significantly improve clustering performance, facilitating the interpretability of clusters and downstream analyses such as cell type assignment.
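The abstract does not say how domain knowledge enters the clustering step. One common way to encode it in constrained clustering is as pairwise constraints: must-link pairs (two cells known to share a type) and cannot-link pairs (two cells known to differ). The sketch below is a minimal illustration under that assumption, not the scDCC implementation. The function name `pairwise_constraint_loss`, the toy assignment matrix, and the choice of log penalties are all hypothetical choices made for this example.

```python
import numpy as np

def pairwise_constraint_loss(q, must_link, cannot_link, eps=1e-10):
    """Penalty on soft cluster assignments that violate pairwise constraints.

    q           : (n_cells, n_clusters) array, each row sums to 1.
    must_link   : list of (i, j) cell index pairs assumed to share a cluster.
    cannot_link : list of (i, j) cell index pairs assumed to differ.
    """
    q = np.asarray(q, dtype=float)
    loss = 0.0
    if len(must_link) > 0:
        i, j = np.asarray(must_link).T
        # Probability that both cells of a pair fall in the same cluster.
        same = np.sum(q[i] * q[j], axis=1)
        loss += -np.mean(np.log(same + eps))
    if len(cannot_link) > 0:
        i, j = np.asarray(cannot_link).T
        same = np.sum(q[i] * q[j], axis=1)
        # Penalize cannot-link pairs that are likely to share a cluster.
        loss += -np.mean(np.log(1.0 - same + eps))
    return loss

# Toy soft assignments: cells 0 and 1 lean to cluster 0, cell 2 to cluster 1.
q = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.9]])

# Constraints that agree with the assignments give a small penalty.
satisfied = pairwise_constraint_loss(q, must_link=[(0, 1)], cannot_link=[(0, 2)])
# Constraints that contradict the assignments give a larger one.
violated = pairwise_constraint_loss(q, must_link=[(0, 2)], cannot_link=[(0, 1)])
print(satisfied, violated)
```

In a deep clustering setting, a term like this would typically be added to the main clustering objective so that gradient updates pull must-link cells together and push cannot-link cells apart in the embedding.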

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/06b4/7994574/e559c3f4b54d/41467_2021_22008_Fig1_HTML.jpg
