The Deep Generative Decoder: MAP estimation of representations improves modelling of single-cell RNA data.

Affiliations

Center for Health Data Science, University of Copenhagen, 2200 Copenhagen, Denmark.

Department of Computer Science, University of Copenhagen, 2100 Copenhagen, Denmark.

Publication information

Bioinformatics. 2023 Sep 2;39(9). doi: 10.1093/bioinformatics/btad497.

Abstract

MOTIVATION

Learning low-dimensional representations of single-cell transcriptomics has become instrumental to its downstream analysis. The state of the art is currently represented by neural network models, such as variational autoencoders, which use a variational approximation of the likelihood for inference.

RESULTS

Here we present the Deep Generative Decoder (DGD), a simple generative model that computes model parameters and representations directly via maximum a posteriori estimation. The DGD naturally handles complex parameterized latent distributions, unlike variational autoencoders, which typically use a fixed Gaussian distribution because of the complexity of adding other types. We first show its general functionality on a commonly used benchmark set, Fashion-MNIST. We then apply the model to multiple single-cell datasets. Here, the DGD learns low-dimensional, meaningful, and well-structured latent representations with sub-clustering beyond the provided labels. The advantages of this approach are its simplicity and its capability to provide representations of much smaller dimensionality than a comparable variational autoencoder.
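To make the idea of estimating representations directly by maximum a posteriori (MAP) more concrete, the following is a minimal sketch, not the scDGD implementation. It assumes a linear decoder in place of a neural network and a fixed standard-normal prior in place of the parameterized latent distribution described above; the function name `fit_map_decoder`, the synthetic data, and all hyperparameters are hypothetical choices for illustration.

```python
import numpy as np


def fit_map_decoder(x, latent_dim=2, steps=500, lr=0.005, seed=0):
    """Jointly fit per-sample representations z and a linear decoder w by MAP.

    Simplifying assumptions (not from the paper): linear decoder,
    unit-variance Gaussian likelihood, standard-normal priors on z and w.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = x.shape
    # No encoder: every sample owns a free representation vector.
    z = 0.01 * rng.standard_normal((n_samples, latent_dim))
    w = 0.01 * rng.standard_normal((latent_dim, n_features))
    losses = []
    for _ in range(steps):
        residual = z @ w - x
        # Negative log posterior, up to additive constants.
        loss = 0.5 * (np.sum(residual**2) + np.sum(z**2) + np.sum(w**2))
        losses.append(float(loss))
        # Gradients of the loss with respect to representations and decoder.
        grad_z = residual @ w.T + z
        grad_w = z.T @ residual + w
        # Plain gradient descent on both sets of parameters at once.
        z -= lr * grad_z
        w -= lr * grad_w
    return z, w, losses


# Synthetic data with a two-dimensional latent structure (illustrative only).
data_rng = np.random.default_rng(1)
x = data_rng.standard_normal((200, 2)) @ data_rng.standard_normal((2, 10))
x += 0.1 * data_rng.standard_normal(x.shape)

z, w, losses = fit_map_decoder(x)
print(z.shape, w.shape, losses[-1] < losses[0])
```

The point of the sketch is that the representations are ordinary parameters updated by gradient descent alongside the decoder, so swapping the prior term for a richer, learnable distribution only changes the loss and its gradient with respect to `z`.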

AVAILABILITY AND IMPLEMENTATION

scDGD is available as a Python package at https://github.com/Center-for-Health-Data-Science/scDGD. The remaining code is available at https://github.com/Center-for-Health-Data-Science/dgd.
