
A versatile and scalable single-cell data integration algorithm based on domain-adversarial and variational approximation.

Affiliation

School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China.

Publication information

Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab400.

DOI: 10.1093/bib/bbab400
PMID: 34585247

Abstract

Single-cell technologies provide new ways to profile the transcriptomic landscape, chromatin accessibility and spatial expression patterns of heterogeneous tissues at single-cell resolution. With the enormous number of single-cell datasets now being generated, a key analytic challenge is to integrate them to gain biological insight into cellular composition. Here, we developed DAVAE, a domain-adversarial and variational approximation method that can integrate multiple single-cell datasets across samples, technologies and modalities with a single strategy. DAVAE can also integrate paired ATAC and transcriptome profiles measured simultaneously from the same cell. Because it is trained with mini-batch stochastic gradient descent, it scales to large datasets and can be accelerated by GPUs. Results on seven real data integration applications demonstrated the effectiveness and scalability of DAVAE in batch-effect removal, transfer learning and cell-type prediction for multiple single-cell datasets across samples, technologies and modalities.

Availability: DAVAE is implemented in the toolkit package "scbean" on PyPI, and the source code is freely available at https://github.com/jhu99/scbean. All data and source code for reproducing the results of this paper are available at https://github.com/jhu99/davae_paper.
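
The abstract names three ingredients (a variational approximation, a domain-adversarial term and mini-batch stochastic gradient descent) but does not spell out the loss. The sketch below shows one common way these pieces fit together; it is not the scbean implementation. Everything in it is an assumption made for illustration: the squared-error reconstruction term, the standard-normal prior, the weight `lam`, and all function names are hypothetical.

```python
import math
import random

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I) for one cell."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def squared_error(x, x_hat):
    """Reconstruction error between an expression vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def domain_cross_entropy(domain_probs, domain):
    """Negative log-probability a domain (batch) classifier gives the true domain."""
    return -math.log(max(domain_probs[domain], 1e-12))

def encoder_objective(x, x_hat, mu, log_var, domain_probs, domain, lam=1.0):
    """Assumed per-cell loss minimised by the encoder/decoder (hypothetical form)."""
    # Variational part: reconstruction error plus KL regulariser (a negative ELBO).
    neg_elbo = squared_error(x, x_hat) + kl_to_standard_normal(mu, log_var)
    # Adversarial part: the encoder gains when the classifier cannot tell which
    # dataset a cell came from, so the classifier's loss is subtracted.
    return neg_elbo - lam * domain_cross_entropy(domain_probs, domain)

def minibatches(n_cells, batch_size, seed=0):
    """Yield shuffled index batches so each update touches only a few cells."""
    order = list(range(n_cells))
    random.Random(seed).shuffle(order)
    for start in range(0, n_cells, batch_size):
        yield order[start:start + batch_size]

if __name__ == "__main__":
    # One toy cell: perfect reconstruction, posterior equal to the prior,
    # and a classifier that is maximally unsure between two datasets.
    loss = encoder_objective([1.0, 2.0], [1.0, 2.0], [0.0], [0.0], [0.5, 0.5], 0)
    print(loss)
    print([len(batch) for batch in minibatches(10, 4)])
```

In this formulation the domain classifier would be trained separately to minimise `domain_cross_entropy`, while the encoder minimises `encoder_objective`; iterating over `minibatches` keeps memory use independent of the total number of cells, which is the property the abstract relies on for scalability.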
