

Learning discriminative and structural samples for rare cell types with deep generative model.

Affiliation

School of Computer Science and Technology, Xidian University, Xi'an, 710071, China.

Publication information

Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac317.

Abstract

Cell types (subpopulations) serve as biomarkers for the diagnosis and therapy of complex diseases, and single-cell RNA sequencing (scRNA-seq) measures gene expression at the level of individual cells, paving the way for the identification of cell types. Although great effort has been devoted to this problem, identifying rare cell types in scRNA-seq data remains challenging because of the few-shot problem, the lack of interpretability, and the separation of sample generation from cell clustering. To address these issues, a novel deep generative model that leverages small samples of cells (scLDS2) is proposed. It precisely estimates the distribution of different cells and discriminates rare from non-rare cell types with adversarial learning. Specifically, to enhance the interpretability of samples, scLDS2 generates sparse fake samples of cells under an $\ell_1$-norm constraint, learning the relations among cells and thereby facilitating the identification of cell types. Furthermore, scLDS2 obtains cell types directly from the generated samples by learning a block structure with the nuclear norm, such that cells belonging to the same type are similar to each other. scLDS2 joins sample generation, classification of generated and real samples, and feature extraction into a unified generative framework, which transforms rare cell type detection into a classification problem and allows cell types to be identified through joint learning. Experimental results on 20 datasets demonstrate that scLDS2 significantly outperforms 17 state-of-the-art methods on various measures, with a 25.12% average improvement in adjusted Rand index, providing an effective strategy for scRNA-seq data with rare cell types. (The software is written in Python and is freely available for academic use at https://github.com/xkmaxidian/scLDS2.)
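
The two regularizers named in the abstract can be illustrated with a minimal NumPy sketch. This is not the scLDS2 implementation: the toy cell-cell relation matrix, the labels, the noise level, and all function names below are assumptions made for illustration only. The sketch shows why an ideal block-structured relation matrix (one block per cell type) has a small nuclear norm, and how the $\ell_1$ norm measures the total magnitude of the entries.

```python
import numpy as np

def l1_norm(Z):
    """Entrywise l1 norm: sum of absolute values; penalizing it promotes sparsity."""
    return np.abs(Z).sum()

def nuclear_norm(Z):
    """Nuclear norm: sum of singular values, a convex surrogate for matrix rank."""
    return np.linalg.svd(Z, compute_uv=False).sum()

def block_relation_matrix(labels):
    """Idealized relation matrix: entry (i, j) is 1 if cells i and j share a type."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

# Hypothetical toy data: 6 cells of a common type and 2 cells of a rare type.
labels = [0] * 6 + [1] * 2
Z_ideal = block_relation_matrix(labels)

# Perturb the ideal relations with Gaussian noise (assumed noise level 0.3).
rng = np.random.default_rng(0)
Z_noisy = Z_ideal + 0.3 * rng.standard_normal(Z_ideal.shape)

print("rank of ideal matrix:   ", np.linalg.matrix_rank(Z_ideal))
print("l1 norm of ideal matrix:", l1_norm(Z_ideal))
print("nuclear norm (ideal):   ", nuclear_norm(Z_ideal))
print("nuclear norm (noisy):   ", nuclear_norm(Z_noisy))
```

In this toy case the ideal matrix has one nonzero singular value per cell type (6 and 2), so its rank equals the number of types. Noise spreads energy across additional singular values and raises the nuclear norm, which is why penalizing the nuclear norm favors a block structure.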

