School of Artificial Intelligence, Jilin University, Jilin, China.
Department of Computer Science, City University of Hong Kong, Hong Kong SAR.
Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab368.
Single-cell RNA sequencing (scRNA-seq) technologies have been developed extensively to probe gene expression profiles at single-cell resolution. Deep imputation methods have been proposed to address the related computational challenges (e.g. the sparsity of gene expression in single-cell data). In particular, the neural architectures of these deep imputation models have been shown to be critical for performance. However, such architectures are difficult to design and tune for those without extensive knowledge of deep neural networks and scRNA-seq. Therefore, the Surrogate-assisted Evolutionary Deep Imputation Model (SEDIM) is proposed to automatically design the architectures of deep neural networks for imputing gene expression levels in scRNA-seq data without any manual tuning. Moreover, SEDIM constructs an offline surrogate model, which accelerates the architecture search. Comprehensive studies show that SEDIM significantly improves imputation and clustering performance compared with other benchmark methods. In addition, we extensively explore the performance of SEDIM in other contexts and platforms, including mass cytometry and metabolic profiling. Marker gene detection, gene ontology enrichment and pathological analysis are conducted to provide novel insights into cell-type identification and the underlying mechanisms. The source code is available at https://github.com/li-shaochuan/SEDIM.
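The general idea of a surrogate-assisted evolutionary architecture search can be illustrated with a minimal sketch. This is not the SEDIM implementation: the search space (three hidden-layer widths), the toy `expensive_loss` function standing in for training an imputation network, and the k-nearest-neighbour surrogate are all assumptions chosen for brevity. The sketch only shows the pattern the abstract describes, in which a surrogate fitted offline on a fixed set of evaluated architectures replaces costly evaluations during the evolutionary search.

```python
import random

# Hypothetical search space: an architecture is a tuple of hidden-layer widths.
WIDTHS = [32, 64, 128, 256, 512]
N_LAYERS = 3


def random_arch(rng):
    """Sample one architecture uniformly from the toy search space."""
    return tuple(rng.choice(WIDTHS) for _ in range(N_LAYERS))


def expensive_loss(arch):
    """Toy stand-in (assumption) for training an imputation network and
    measuring its validation error; lower is better."""
    target = (256, 64, 256)  # arbitrary optimum, for illustration only
    return sum(abs(a - t) for a, t in zip(arch, target)) / 100.0


def build_surrogate(archive, k=3):
    """Offline k-nearest-neighbour surrogate (assumption): predicts the loss
    of an unseen architecture from the k closest evaluated ones."""
    def predict(arch):
        nearest = sorted(
            archive,
            key=lambda item: sum(abs(a - b) for a, b in zip(arch, item[0])),
        )[:k]
        return sum(loss for _, loss in nearest) / len(nearest)
    return predict


def mutate(arch, rng):
    """Resample the width of one randomly chosen layer."""
    child = list(arch)
    child[rng.randrange(len(child))] = rng.choice(WIDTHS)
    return tuple(child)


def surrogate_search(seed=0, n_offline=30, pop_size=10, generations=20):
    rng = random.Random(seed)
    # 1. Offline phase: spend the expensive evaluations once, up front.
    archive = []
    for _ in range(n_offline):
        arch = random_arch(rng)
        archive.append((arch, expensive_loss(arch)))
    predict = build_surrogate(archive)
    # 2. Evolution: candidates are ranked by the surrogate only.
    population = [random_arch(rng) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        population = sorted(population + children, key=predict)[:pop_size]
    best = population[0]
    # 3. One final expensive evaluation of the selected architecture.
    return best, expensive_loss(best), n_offline + 1


if __name__ == "__main__":
    best, loss, n_evals = surrogate_search()
    print("selected architecture:", best)
    print("true loss:", loss, "expensive evaluations:", n_evals)
```

In this sketch the number of expensive evaluations is fixed at `n_offline + 1` regardless of population size or generation count, which is where the speed-up of an offline surrogate comes from. The quality of the selected architecture depends on how well the surrogate approximates the true loss.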