Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA.
Bioinformatics. 2022 Nov 30;38(23):5322-5325. doi: 10.1093/bioinformatics/btac684.
Gene expression imputation is an essential step of the single-cell RNA-Seq data analysis workflow. Among several deep-learning methods, scGNN gained substantial recognition on its debut in 2021 for its superior performance and its ability to produce a cell-cell graph. However, the implementation of scGNN was relatively time-consuming, and its performance could still be improved.
The implementation of scGNN 2.0 is significantly faster than scGNN thanks to a simplified closed-loop architecture. Across all eight datasets, cell clustering performance, measured by the adjusted Rand index, increased by 85.02% on average, and the imputation median L1 error was reduced by 67.94% on average. With the built-in visualizations, users can quickly assess the imputation and cell clustering results, compare them against benchmarks and interpret cell-cell interactions. The expanded input and output formats also pave the way for custom workflows that integrate scGNN 2.0 with other scRNA-Seq toolkits on both Python and R platforms.
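For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how they can be computed. It is not taken from the scGNN 2.0 code base: the function names are hypothetical, and the median L1 error is assumed here to be the median absolute difference between original and imputed values over a set of held-out entries.

```python
from collections import Counter
from math import comb
from statistics import median


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat clusterings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: the clusterings agree trivially
    # Count cell pairs that share a cluster in both, in the reference,
    # and in the predicted clustering (contingency table and its margins).
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = pairs_true * pairs_pred / total_pairs
    max_index = (pairs_true + pairs_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both clusterings have one cluster
    return (pairs_joint - expected) / (max_index - expected)


def median_l1_error(original, imputed):
    """Median absolute difference over held-out entries (assumed definition)."""
    return median(abs(o - i) for o, i in zip(original, imputed))


# Toy values for illustration only; these are not data from the paper.
ari_same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
toy_error = median_l1_error([1.0, 2.0, 4.0], [1.5, 2.0, 1.0])
```

The adjusted Rand index does not depend on how clusters are named, so the two toy labelings above count as identical partitions. A higher adjusted Rand index and a lower median L1 error both indicate better performance.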
scGNN 2.0 is implemented in Python (as of version 3.8) with the source code available at https://github.com/OSU-BMBL/scGNN2.0.
Supplementary data are available at Bioinformatics online.