ResPAN: a powerful batch correction model for scRNA-seq data through residual adversarial networks.

Author affiliations

Department of Biostatistics, Yale School of Public Health, Yale University, New Haven, CT 06520, USA.

Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.

Publication information

Bioinformatics. 2022 Aug 10;38(16):3942-3949. doi: 10.1093/bioinformatics/btac427.

DOI:10.1093/bioinformatics/btac427

PMID:35771600

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9364370/

Abstract

MOTIVATION

With the advancement of technology, we can generate and access large-scale, high-dimensional and diverse genomics data, especially through single-cell RNA sequencing (scRNA-seq). However, integrative downstream analysis of multiple scRNA-seq datasets remains challenging due to batch effects.

RESULTS

In this article, we propose a light-structured deep learning framework called ResPAN for scRNA-seq data integration. ResPAN is based on a Wasserstein Generative Adversarial Network (WGAN) combined with random walk mutual nearest neighbor pairing and fully skip-connected autoencoders to reduce the differences among batches. We also discuss the limitations of existing methods and demonstrate the advantages of our model over seven other methods through extensive benchmarking studies on both simulated data under various scenarios and real datasets across different scales. Our model achieves leading performance on both batch correction and biological information conservation and remains scalable to datasets with over half a million cells.
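As a rough illustration of the pairing idea mentioned above, the following sketch finds plain mutual nearest neighbor (MNN) pairs between two batches with NumPy. It is a minimal sketch under stated assumptions, not the ResPAN implementation: it omits the random walk extension and the adversarial training entirely, and the function name `mutual_nearest_neighbors`, the parameter `k`, and the toy data are hypothetical choices made for this example.

```python
import numpy as np

def mutual_nearest_neighbors(batch_a, batch_b, k=2):
    """Return sorted (i, j) pairs where batch_a[i] and batch_b[j]
    each appear in the other's k-nearest-neighbor list.

    Illustrative only: plain MNN pairing, without the random walk
    extension described in the abstract.
    """
    # Pairwise squared Euclidean distances, shape (n_a, n_b).
    diff = batch_a[:, None, :] - batch_b[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # For each cell in A, indices of its k nearest cells in B.
    nn_ab = np.argsort(dist, axis=1)[:, :k]
    # For each cell in B, indices of its k nearest cells in A.
    nn_ba = np.argsort(dist, axis=0)[:k, :].T
    pairs = []
    for i in range(batch_a.shape[0]):
        for j in nn_ab[i]:
            # Keep the pair only if the neighbor relation is mutual.
            if i in nn_ba[j]:
                pairs.append((i, int(j)))
    return sorted(pairs)

# Toy data (hypothetical): two cells in batch A, three in batch B.
toy_a = np.array([[0.0, 0.0], [10.0, 10.0]])
toy_b = np.array([[0.5, 0.0], [10.0, 10.5], [100.0, 100.0]])
toy_pairs = mutual_nearest_neighbors(toy_a, toy_b, k=1)
```

In this toy case, the distant third cell of batch B has no mutual partner and is left unpaired, which is the behavior that makes MNN pairs useful as anchors between batches.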

AVAILABILITY AND IMPLEMENTATION

An open-source implementation of ResPAN and scripts to reproduce the results can be downloaded from: https://github.com/AprilYuge/ResPAN.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
