

Interpretable Autoencoders Trained on Single Cell Sequencing Data Can Transfer Directly to Data from Unseen Tissues.

Affiliations

Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, 2100 Copenhagen, Denmark.

Center for Surgical Science, Zealand University Hospital, Lykkebækvej 1, 4600 Koege, Denmark.

Publication information

Cells. 2021 Dec 28;11(1):85. doi: 10.3390/cells11010085.

Abstract

Autoencoders have been used to model single-cell mRNA-sequencing data with the purpose of denoising, visualization, data simulation, and dimensionality reduction. We, and others, have shown that autoencoders can be explainable models and interpreted in terms of biology. Here, we show that such autoencoders can generalize to the extent that they can transfer directly without additional training. In practice, we can extract biological modules, denoise, and classify data correctly from an autoencoder that was trained on a different dataset and with different cells (a foreign model). We deconvoluted the biological signal encoded in the bottleneck layer of scRNA-models using saliency maps and mapped salient features to biological pathways. Biological concepts could be associated with specific nodes and interpreted in relation to biological pathways. Even in this unsupervised framework, with no prior information about cell types or labels, the specific biological pathways deduced from the model were in line with findings in previous research. It was hypothesized that autoencoders could learn and represent meaningful biology; here, we show with a systematic experiment that this is true and even transcends the training data. This means that carefully trained autoencoders can be used to assist the interpretation of new unseen data.
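The saliency-map step mentioned in the abstract can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: it assumes a single-layer encoder with a tanh activation, and the gene names, weights, and expression values are all made up for illustration. For each bottleneck node it computes the mean absolute gradient of the node's activation with respect to each input gene, then lists the most salient genes, which could subsequently be mapped to pathways.

```python
import math


def encode(x, weights, biases):
    """Bottleneck activations z_j = tanh(w_j . x + b_j) for one cell."""
    return [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def saliency(cells, weights, biases, node):
    """Mean |d z_node / d x_gene| over all cells.

    For a tanh unit the gradient is analytic:
    d z / d x_g = (1 - z^2) * w_g.
    """
    n_genes = len(weights[node])
    totals = [0.0] * n_genes
    for x in cells:
        z = encode(x, weights, biases)[node]
        scale = 1.0 - z * z
        for g in range(n_genes):
            totals[g] += abs(scale * weights[node][g])
    return [t / len(cells) for t in totals]


def top_genes(scores, gene_names, k):
    """Names of the k genes with the highest saliency scores."""
    ranked = sorted(zip(gene_names, scores), key=lambda p: p[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical toy data: 4 genes, 2 bottleneck nodes, 3 cells.
gene_names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
weights = [
    [0.9, 0.0, -0.1, 0.2],   # node 0
    [0.0, -0.8, 0.7, 0.05],  # node 1
]
biases = [0.0, 0.1]
cells = [
    [1.0, 0.0, 0.5, 0.2],
    [0.2, 1.0, 0.1, 0.0],
    [0.0, 0.3, 0.9, 0.4],
]

for node in range(len(weights)):
    scores = saliency(cells, weights, biases, node)
    print(node, top_genes(scores, gene_names, 2))
```

In this single-layer toy the gene ranking for a node simply follows the absolute encoder weights. In a deeper encoder the gradient would depend on each cell's expression profile and would normally be obtained by automatic differentiation rather than by hand.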


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/81bd/8750521/834bb8155550/cells-11-00085-g001.jpg
