Xu Li, Xu Yin, Xue Tong, Zhang Xinyu, Li Jin
College of Computer Science and Technology, Harbin Engineering University, Harbin, China.
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China.
Front Genet. 2021 Sep 8;12:739677. doi: 10.3389/fgene.2021.739677. eCollection 2021.
The emergence of single-cell RNA sequencing (scRNA-seq) technology has made it possible to measure RNA levels at single-cell resolution and to study biological functions precisely. However, scRNA-seq data contain a large number of missing values, which affect downstream analysis. This paper presents AdImpute, an imputation method based on semi-supervised autoencoders. The method takes the output of another imputation method (DrImpute is used as an example) as the imputation weights of the autoencoder, and applies a cost function with these imputation weights to learn the latent information in the data and achieve more accurate imputation. In clustering experiments on simulated and real data sets, AdImpute is more accurate than four other publicly available scRNA-seq imputation methods, and it minimally modifies biologically silent genes. Overall, AdImpute is an accurate and robust imputation method.
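The abstract does not give the form of the cost function, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the reference imputation (for example, DrImpute output) is turned into a binary weight matrix: zero entries that the reference method filled in are treated as likely dropouts and excluded from the reconstruction error, while observed values and zeros left untouched by the reference method are kept. The function names `imputation_weights` and `weighted_mse` and the binary weighting scheme are assumptions made for illustration.

```python
import numpy as np


def imputation_weights(x, x_ref):
    """Build a weight matrix from a reference imputation (assumed scheme).

    x     : observed expression matrix (cells x genes), zeros may be dropouts
    x_ref : the same matrix after a reference imputer has been applied
    """
    x = np.asarray(x, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    observed = x > 0
    # Zeros that the reference imputer replaced with a positive value are
    # treated as likely dropouts and get weight 0 (assumption).
    likely_dropout = (~observed) & (x_ref > 0)
    w = np.ones_like(x, dtype=float)
    w[likely_dropout] = 0.0
    return w


def weighted_mse(x, x_hat, w):
    """Weighted mean squared reconstruction error over entries with w > 0."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return float(np.sum(w * (x_hat - x) ** 2) / max(np.sum(w), 1.0))


# Toy example: 2 cells x 2 genes.
x = np.array([[1.0, 0.0], [0.0, 2.0]])
x_ref = np.array([[1.0, 0.5], [0.0, 2.0]])  # reference imputer filled one zero
w = imputation_weights(x, x_ref)

# A reconstruction that differs only at the likely-dropout entry costs nothing.
x_hat = np.array([[1.0, 3.0], [0.0, 2.0]])
loss = weighted_mse(x, x_hat, w)
```

Under this assumed scheme, a zero that the reference method leaves at zero keeps full weight, so an autoencoder trained with such a loss would be pushed to reconstruct it as zero. That is one way the stated goal of minimally modifying biologically silent genes could be encouraged; the paper itself should be consulted for the actual weighting.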