scGNN 2.0: a graph neural network tool for imputation and clustering of single-cell RNA-Seq data.

Affiliations

Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.

Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA.

Publication information

Bioinformatics. 2022 Nov 30;38(23):5322-5325. doi: 10.1093/bioinformatics/btac684.

DOI:10.1093/bioinformatics/btac684

PMID:36250784

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9710550/

Abstract

MOTIVATION

Gene expression imputation has been an essential step of the single-cell RNA-Seq data analysis workflow. Among several deep-learning methods, the debut of scGNN gained substantial recognition in 2021 for its superior performance and the ability to produce a cell-cell graph. However, the implementation of scGNN was relatively time-consuming and its performance could still be optimized.

RESULTS

The implementation of scGNN 2.0 is significantly faster than scGNN thanks to a simplified closed-loop architecture. Across all eight datasets, cell clustering performance increased by 85.02% on average in terms of adjusted Rand index, and the imputation median L1 error was reduced by 67.94% on average. With the built-in visualizations, users can quickly assess the imputation and cell clustering results, compare against benchmarks and interpret the cell-cell interaction. The expanded input and output formats also pave the way for custom workflows that integrate scGNN 2.0 with other scRNA-Seq toolkits on both Python and R platforms.
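For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how they might be computed. It is not taken from the scGNN 2.0 source code: the function names, the list-of-lists expression matrices, and the `masked_positions` argument are illustrative assumptions. In particular, the sketch assumes the median L1 error is the median absolute difference between held-out original values and their imputed counterparts, which may differ in detail from the evaluation protocol used by the authors.

```python
from collections import Counter
from math import comb
from statistics import median


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat cluster labelings (hypothetical helper)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    # Count co-clustered pairs from the contingency table and from its margins.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    total_pairs = comb(n, 2)
    # Expected number of agreeing pairs under random labeling with fixed margins.
    expected = sum_rows * sum_cols / total_pairs if total_pairs else 0.0
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings place every cell in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def median_l1_error(original, imputed, masked_positions):
    """Median absolute error over held-out entries (assumed evaluation scheme)."""
    errors = [abs(original[i][j] - imputed[i][j]) for i, j in masked_positions]
    if not errors:
        raise ValueError("masked_positions must not be empty")
    return median(errors)


if __name__ == "__main__":
    # Toy data: identical partitions with swapped label names.
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    # Toy 2x2 expression matrices with three held-out entries.
    original = [[1.0, 2.0], [3.0, 4.0]]
    imputed = [[1.5, 2.0], [2.0, 4.0]]
    print(median_l1_error(original, imputed, [(0, 0), (1, 0), (1, 1)]))
```

A higher adjusted Rand index indicates closer agreement with the reference cell labels, while a lower median L1 error indicates imputed values closer to the held-out originals, which is why the abstract reports an increase for the former and a reduction for the latter.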

AVAILABILITY AND IMPLEMENTATION

scGNN 2.0 is implemented in Python (as of version 3.8) with the source code available at https://github.com/OSU-BMBL/scGNN2.0.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
