Network embedding-based representation learning for single cell RNA-seq data.

Authors

Li Xiangyu, Chen Weizheng, Chen Yang, Zhang Xuegong, Gu Jin, Zhang Michael Q

Affiliations

MOE Key Laboratory of Bioinformatics, TNLIST Bioinformatics Division/Center for Synthetic & System Biology, Department of Automation, Tsinghua University, Beijing 100084, China.

Institute of Network Computing and Information System, Department of Computer Science, Peking University, Beijing 100871, China.

Publication information

Nucleic Acids Res. 2017 Nov 2;45(19):e166. doi: 10.1093/nar/gkx750.

DOI:10.1093/nar/gkx750

PMID:28977434

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5737094/

Abstract

Single cell RNA-seq (scRNA-seq) techniques can reveal valuable insights into cell-to-cell heterogeneity. Projecting high-dimensional data into a low-dimensional subspace is a powerful general strategy for mining such large datasets. However, scRNA-seq suffers from higher noise and lower coverage than traditional bulk RNA-seq, which introduces new computational difficulties. One major challenge is how to deal with frequent drop-out events. These events, usually caused by the stochastic burst effect in gene transcription and by technical failure of RNA transcript capture, often cause traditional dimension reduction methods to work inefficiently. To overcome this problem, we have developed a novel Single Cell Representation Learning (SCRL) method based on network embedding. This method can efficiently implement data-driven non-linear projection and incorporate prior biological knowledge (such as pathway information) to learn more meaningful low-dimensional representations for both cells and genes. Benchmark results show that SCRL outperforms other dimension reduction methods on several recent scRNA-seq datasets.
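The abstract does not spell out the SCRL algorithm, so the following is not the authors' method. It is only a minimal sketch of the general idea of placing cells and genes in one low-dimensional space by embedding a cell-gene network. As a stand-in technique it uses a spectral co-embedding: the expression matrix is treated as a weighted bipartite graph, normalized by node degree, and factorized with an SVD. The function name `embed_cell_gene_network`, the log scaling of edge weights, and the toy matrix (two hand-made cell groups with zeros standing in for drop-outs) are all assumptions made for illustration.

```python
import numpy as np


def embed_cell_gene_network(expr, dim=2):
    """Jointly embed cells and genes of a cells x genes expression matrix.

    Hypothetical helper: a spectral co-embedding of the bipartite
    cell-gene graph, used here as a stand-in for network embedding.
    """
    expr = np.asarray(expr, dtype=float)
    # Edge weights of the bipartite graph: log-scaled expression values.
    weights = np.log1p(expr)
    # Weighted degrees; the epsilon guards against all-zero rows or columns.
    cell_deg = weights.sum(axis=1) + 1e-12
    gene_deg = weights.sum(axis=0) + 1e-12
    # Degree-normalized bipartite adjacency matrix.
    normalized = weights / np.sqrt(cell_deg)[:, None] / np.sqrt(gene_deg)[None, :]
    u, _, vt = np.linalg.svd(normalized, full_matrices=False)
    # Skip the leading singular pair, which mostly encodes node degree.
    cell_emb = u[:, 1:dim + 1] / np.sqrt(cell_deg)[:, None]
    gene_emb = vt[1:dim + 1, :].T / np.sqrt(gene_deg)[:, None]
    return cell_emb, gene_emb


# Toy data (invented): cells 0-2 mainly express genes 0-2, cells 3-5 mainly
# express genes 3-5; zeros inside each group mimic drop-out events.
toy_expr = np.array([
    [9, 8, 0, 1, 0, 0],
    [7, 0, 9, 0, 1, 0],
    [0, 9, 8, 0, 0, 1],
    [1, 0, 0, 8, 9, 0],
    [0, 1, 0, 9, 0, 7],
    [0, 0, 1, 0, 8, 9],
])

cell_emb, gene_emb = embed_cell_gene_network(toy_expr, dim=2)
print("cell embedding:\n", cell_emb)
print("gene embedding:\n", gene_emb)
```

In this toy setting the first embedding coordinate separates the two cell groups, and each gene lands on the same side as the cells that express it, which is the kind of shared cell and gene representation the abstract describes. Incorporating prior knowledge such as pathway information would require additional edges between genes, which this sketch does not attempt.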

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8458/5737094/556882186f3d/gkx750fig1.jpg
