Suppr
超能文献

Deconvolution of autoencoders to learn biological regulatory modules from single cell mRNA sequencing data.

Affiliations

Centre for Genomic Medicine Rigshospitalet, University of Copenhagen, Copenhagen, Denmark.

Section for Cognitive Systems Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark.

Publication information

BMC Bioinformatics. 2019 Jul 8;20(1):379. doi: 10.1186/s12859-019-2952-9.

DOI:10.1186/s12859-019-2952-9

PMID:31286861

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6615267/

Abstract

BACKGROUND

Unsupervised machine learning methods (deep learning) have proven useful on noisy single-cell mRNA-sequencing (scRNA-seq) data, where the models generalize well despite the zero-inflation of the data. One class of neural networks, autoencoders, has been useful for denoising single-cell data, imputing missing values, and reducing dimensionality.

RESULTS

Here, we present a striking feature with the potential to greatly increase the usability of autoencoders: with specialized training, the autoencoder is able not only to generalize over the data, but also to tease apart biologically meaningful modules, which we found encoded in the representation layer of the network. From scRNA-seq data, our model can delineate the biologically meaningful modules that govern a dataset and indicate which modules are active in each single cell. Importantly, most of these modules can be explained by known biological functions, as provided by the Hallmark gene sets.
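To make the idea of reading module activity from the representation layer concrete, here is a toy forward pass through a very small autoencoder. Everything in it is an assumption for illustration only: the four-gene input, the two representation nodes, the hand-set weights `ENC` and `DEC`, the cell names, and the activity threshold are all hypothetical, and the weights are not trained as in the paper.

```python
def relu(x):
    """Rectified linear activation."""
    return x if x > 0 else 0.0


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def encode(expression, enc_weights):
    """Map a cell's gene expression to representation-layer activations."""
    return [relu(a) for a in matvec(enc_weights, expression)]


def decode(representation, dec_weights):
    """Map representation-layer activations back to gene space."""
    return matvec(dec_weights, representation)


def active_nodes(representation, threshold=0.0):
    """Indices of representation nodes whose activation exceeds the threshold."""
    return [i for i, a in enumerate(representation) if a > threshold]


# Hypothetical hand-set weights: node 0 responds to genes 0-1, node 1 to genes 2-3.
ENC = [[1.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 1.0]]
DEC = [[0.5, 0.0],
       [0.5, 0.0],
       [0.0, 0.5],
       [0.0, 0.5]]

# Hypothetical expression vectors for three cells (four genes each).
cells = {
    "cell_1": [2.0, 2.0, 0.0, 0.0],
    "cell_2": [0.0, 0.0, 3.0, 3.0],
    "cell_3": [1.0, 1.0, 1.0, 1.0],
}

results = {}
for name, expression in cells.items():
    rep = encode(expression, ENC)
    results[name] = {
        "representation": rep,
        "active": active_nodes(rep),
        "reconstruction": decode(rep, DEC),
    }

for name, res in results.items():
    print(name, res)
```

In this sketch each representation node stands in for one "module": a cell is said to use a module when the corresponding node is active, and the decoder shows that the input can be recovered from those activations.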

CONCLUSIONS

We find that tailored training of an autoencoder makes it possible to deconvolute biological modules inherent in the data, without any assumptions. Comparison with gene signatures of canonical pathways shows that the modules are directly interpretable. This discovery has important implications, as it makes it possible to outline the drivers behind a given effect in a cell. Compared with other dimensionality reduction methods, or with supervised classification models, our approach has two benefits: it handles the zero-inflated nature of scRNA-seq data well, and it validates that the model captures relevant information by establishing a link between input and decoded data. Looking ahead, our model combined with clustering methods can indicate which subtype a given single cell belongs to, as well as which biological functions determine that membership.
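One common way to compare a module's genes with a known gene signature is a hypergeometric overlap test. The sketch below is an assumption for illustration and not necessarily the comparison used in the paper; the function name `overlap_p_value` and the gene identifiers (`GENE_0` and so on) are hypothetical placeholders, not real Hallmark gene set members.

```python
from math import comb


def overlap_p_value(module_genes, reference_set, background):
    """Return (observed overlap, P(overlap >= observed)) under a hypergeometric null.

    The null model draws len(module) genes at random, without replacement,
    from the background and counts how many fall in the reference set.
    """
    background = set(background)
    module = set(module_genes) & background
    reference = set(reference_set) & background

    n_total, n_ref, n_mod = len(background), len(reference), len(module)
    observed = len(module & reference)

    # Sum the upper tail of the hypergeometric distribution.
    upper = min(n_ref, n_mod)
    tail = sum(
        comb(n_ref, i) * comb(n_total - n_ref, n_mod - i)
        for i in range(observed, upper + 1)
    )
    return observed, tail / comb(n_total, n_mod)


# Hypothetical background of ten genes and a three-gene reference signature.
background = [f"GENE_{i}" for i in range(10)]
reference_set = ["GENE_0", "GENE_1", "GENE_2"]

# A module that matches the signature exactly, and one that shares a single gene.
full_overlap, p_full = overlap_p_value(
    ["GENE_0", "GENE_1", "GENE_2"], reference_set, background
)
partial_overlap, p_partial = overlap_p_value(
    ["GENE_0", "GENE_5", "GENE_6"], reference_set, background
)

print(full_overlap, p_full)
print(partial_overlap, p_partial)
```

A small p-value suggests that the module shares more genes with the signature than expected by chance; with many modules and many gene sets, a multiple-testing correction would also be needed.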

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4451/6615267/97d9df7fed44/12859_2019_2952_Fig1_HTML.jpg

Similar articles

Deep structural clustering for single-cell RNA-seq data jointly through autoencoder and graph neural network.

Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac018.

scBGEDA: deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering.

Bioinformatics. 2023 Feb 14;39(2). doi: 10.1093/bioinformatics/btad075.

Interpretable Autoencoders Trained on Single Cell Sequencing Data Can Transfer Directly to Data from Unseen Tissues.

Cells. 2021 Dec 28;11(1):85. doi: 10.3390/cells11010085.

scSemiAAE: a semi-supervised clustering model for single-cell RNA-seq data.

BMC Bioinformatics. 2023 May 26;24(1):217. doi: 10.1186/s12859-023-05339-4.

Single-cell RNA-seq data analysis based on directed graph neural network.

Methods. 2023 Mar;211:48-60. doi: 10.1016/j.ymeth.2023.02.008. Epub 2023 Feb 16.

Single-cell RNA sequencing data analysis utilizing multi-type graph neural networks.

Comput Biol Med. 2024 Sep;179:108921. doi: 10.1016/j.compbiomed.2024.108921. Epub 2024 Jul 25.

Deep enhanced constraint clustering based on contrastive learning for scRNA-seq data.

Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad222.

scGMAAE: Gaussian mixture adversarial autoencoders for diversification analysis of scRNA-seq data.

Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac585.

Parameter tuning is a key part of dimensionality reduction via deep variational autoencoders for single cell RNA transcriptomics.

Pac Symp Biocomput. 2019;24:362-373.

Cited by

An autoencoder learning method for predicting breast cancer subtypes.

PLoS One. 2025 Jul 23;20(7):e0327773. doi: 10.1371/journal.pone.0327773. eCollection 2025.

CSI-GEP: A GPU-based unsupervised machine learning approach for recovering gene expression programs in atlas-scale single-cell RNA-seq data.

Cell Genom. 2025 Jan 8;5(1):100739. doi: 10.1016/j.xgen.2024.100739.

Interpretable deep learning in single-cell omics.

Bioinformatics. 2024 Jun 3;40(6). doi: 10.1093/bioinformatics/btae374.

Model-based evaluation of spatiotemporal data reduction methods with unknown ground truth through optimal visualization and interpretability metrics.

Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad455.

NCAE: data-driven representations using a deep network-coherent DNA methylation autoencoder identify robust disease and risk factor signatures.

Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad293.

A systematic review of biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data.

BMC Bioinformatics. 2023 May 15;24(1):198. doi: 10.1186/s12859-023-05262-8.

clusterMaker2: a major update to clusterMaker, a multi-algorithm clustering app for Cytoscape.

BMC Bioinformatics. 2023 Apr 5;24(1):134. doi: 10.1186/s12859-023-05225-z.

siVAE: interpretable deep generative models for single-cell transcriptomes.

Genome Biol. 2023 Feb 20;24(1):29. doi: 10.1186/s13059-023-02850-y.

Large-Scale Integrative Analysis of Soybean Transcriptome Using an Unsupervised Autoencoder Model.

Front Plant Sci. 2022 Mar 3;13:831204. doi: 10.3389/fpls.2022.831204. eCollection 2022.

An Overview of Algorithms and Associated Applications for Single Cell RNA-Seq Data Imputation.

Curr Genomics. 2021 Dec 30;22(5):319-327. doi: 10.2174/1389202921999200716104916.
