Effective and scalable single-cell data alignment with non-linear canonical correlation analysis.

Affiliations

Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.

Department of Human Genetics and Department of Medicine, University of Chicago, Chicago, IL 60637, USA.

Publication information

Nucleic Acids Res. 2022 Feb 28;50(4):e21. doi: 10.1093/nar/gkab1147.

DOI:10.1093/nar/gkab1147

PMID:34871454

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC8887421/

Abstract

Data alignment is one of the first key steps in single-cell analysis for integrating multiple datasets and performing joint analysis across studies. Data alignment is challenging in extremely large datasets, however, as the majority of current single-cell data alignment methods are not computationally efficient. Here, we present VIPCCA, a computational framework based on non-linear canonical correlation analysis for effective and scalable single-cell data alignment. VIPCCA leverages both deep learning for effective single-cell data modeling and variational inference for scalable computation, thus enabling powerful data alignment across multiple samples, multiple data platforms, and multiple data types. VIPCCA is accurate for a range of alignment tasks, including alignment between single-cell RNAseq and ATACseq datasets, and can easily accommodate millions of cells, thereby providing researchers with unique opportunities to tackle challenges emerging from large-scale single-cell atlases.
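
As background for the abstract's central idea, the sketch below shows classical linear canonical correlation analysis (CCA) in its textbook form, with paired rows in the two input matrices. It is not the VIPCCA implementation: per the abstract, VIPCCA is a non-linear extension that relies on deep learning and variational inference, and the abstract does not describe how its views are constructed. The function name `linear_cca`, the ridge term `reg`, and the simulated data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def linear_cca(X, Y, n_components=2, reg=1e-6):
    """Classical linear CCA via whitening and SVD (illustrative only).

    X: (n, p) and Y: (n, q) hold two views with paired rows.
    Returns the projected views and their canonical correlations.
    """
    # Center each view so covariances are computed about the mean.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Within-view and cross-view covariances; `reg` is an assumed small
    # ridge term that keeps the within-view matrices invertible.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return (V / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the canonical
    # correlations; the singular vectors give the projection directions.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    A = Kx @ U[:, :n_components]
    B = Ky @ Vt.T[:, :n_components]
    return X @ A, Y @ B, s[:n_components]


# Simulated example: two views driven by a shared 2-D latent state.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
X = z @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(500, 10))
Y = z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))

zx, zy, corr = linear_cca(X, Y)
print(corr)  # canonical correlations of the two leading components
```

The whitening-plus-SVD route is one standard closed-form solution to linear CCA and is used here only for brevity; a non-linear method such as the one the abstract describes would replace the fixed linear projections with learned mappings.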

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf3f/8887421/137af6ffc65a/gkab1147fig1.jpg

Similar articles

Adversarial domain translation networks for integrating large-scale atlas-level single-cell datasets.

Nat Comput Sci. 2022 May;2(5):317-330. doi: 10.1038/s43588-022-00251-y. Epub 2022 May 30.

Alignment of single-cell trajectory trees with CAPITAL.

Nat Commun. 2022 Oct 14;13(1):5972. doi: 10.1038/s41467-022-33681-3.

Integrated analysis of multimodal single-cell data with structural similarity.

Nucleic Acids Res. 2022 Nov 28;50(21):e121. doi: 10.1093/nar/gkac781.

Multi-omics single-cell data integration and regulatory inference with graph-linked embedding.

Nat Biotechnol. 2022 Oct;40(10):1458-1466. doi: 10.1038/s41587-022-01284-4. Epub 2022 May 2.

Cloud accelerated alignment and assembly of full-length single-cell RNA-seq data using Falco.

BMC Genomics. 2019 Dec 30;20(Suppl 10):927. doi: 10.1186/s12864-019-6341-6.

A versatile and scalable single-cell data integration algorithm based on domain-adversarial and variational approximation.

Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab400.

Creation of a Single Cell RNASeq Meta-Atlas to Define Human Liver Immune Homeostasis.

Front Immunol. 2021 Jul 16;12:679521. doi: 10.3389/fimmu.2021.679521. eCollection 2021.

MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data.

Genome Biol. 2020 May 11;21(1):111. doi: 10.1186/s13059-020-02015-1.

scJoint integrates atlas-scale single-cell RNA-seq and ATAC-seq data with transfer learning.

Nat Biotechnol. 2022 May;40(5):703-710. doi: 10.1038/s41587-021-01161-6. Epub 2022 Jan 20.

Cited by

A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf355.

Inference of gene coexpression networks from single-cell transcriptome data based on variance decomposition analysis.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf309.

scMODAL: a general deep learning framework for comprehensive single-cell multi-omics data alignment with feature links.

Nat Commun. 2025 May 29;16(1):4994. doi: 10.1038/s41467-025-60333-z.

Exploring and mitigating shortcomings in single-cell differential expression analysis with a new statistical paradigm.

Genome Biol. 2025 Mar 17;26(1):58. doi: 10.1186/s13059-025-03525-6.

Pathway Activation Analysis for Pan-Cancer Personalized Characterization Based on Riemannian Manifold.

Int J Mol Sci. 2024 Apr 17;25(8):4411. doi: 10.3390/ijms25084411.

Scbean: a python library for single-cell multi-omics data analysis.

Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae053.

Challenges and opportunities to computationally deconvolve heterogeneous tissue with varying cell sizes using single-cell RNA-sequencing datasets.

Genome Biol. 2023 Dec 14;24(1):288. doi: 10.1186/s13059-023-03123-4.

A multi-view latent variable model reveals cellular heterogeneity in complex tissues for paired multimodal single-cell data.

Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btad005.

Identifying driver modules based on multi-omics biological networks in prostate cancer.

IET Syst Biol. 2022 Dec;16(6):187-200. doi: 10.1049/syb2.12050. Epub 2022 Aug 30.

Revealing the Key MSCs Niches and Pathogenic Genes in Influencing CEP Homeostasis: A Conjoint Analysis of Single-Cell and WGCNA.

Front Immunol. 2022 Jun 27;13:933721. doi: 10.3389/fimmu.2022.933721. eCollection 2022.
