Jin Ke, Li Bo, Yan Hong, Zhang Xiao-Fei
Department of Statistics, School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China.
Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Wuhan 430079, China.
Bioinformatics. 2022 Jun 13;38(12):3222-3230. doi: 10.1093/bioinformatics/btac300.
MOTIVATION: Single-cell RNA sequencing (scRNA-seq) technologies have proved revolutionary because they enable transcriptome profiling at single-cell resolution. However, excess zeros caused by various sources of technical noise, called dropouts, can mislead downstream analyses. Accurate imputation methods are therefore crucial for addressing the dropout problem.

RESULTS: In this article, we develop a new dropout imputation method for scRNA-seq data based on multi-objective optimization. Existing methods assume that the underlying data has a preconceived structure and impute the dropouts according to the information learned from that structure. In contrast, we assume that the data combines three types of latent structure: the horizontal structure (genes are similar to each other), the vertical structure (cells are similar to each other) and the low-rank structure. The combination weights and the latent structures are learned using multi-objective optimization, and the final result is the weighted average of the observed data and the imputation results learned from the three types of structure. Comprehensive downstream experiments show the superiority of our method in terms of recovery of true gene expression profiles, differential expression analysis, cell clustering and cell trajectory inference.

AVAILABILITY AND IMPLEMENTATION: The R package is available at https://github.com/Zhangxf-ccnu/scMOO and https://zenodo.org/record/5785195. The code to reproduce the downstream analyses in this article can be found at https://github.com/Zhangxf-ccnu/scMOO_experiments_codes and https://zenodo.org/record/5786211. The detailed list of data sets used in the present study is given in Supplementary Table S1 of the Supplementary materials.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
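
The final step described in RESULTS, a weighted average of the observed data and three structure-based imputations, can be illustrated with a minimal Python sketch. This is not the scMOO implementation: the function names are hypothetical, genes are assumed to be rows and cells columns, the three estimators are crude stand-ins (column means, row means and a rank-one approximation), and the weights are fixed by hand rather than learned by multi-objective optimization.

```python
def gene_based_estimate(x):
    """Stand-in for the horizontal structure: predict each entry from the
    other genes in the same cell (here simply the column mean)."""
    n_rows, n_cols = len(x), len(x[0])
    col_means = [sum(x[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    return [[col_means[j] for j in range(n_cols)] for _ in range(n_rows)]


def cell_based_estimate(x):
    """Stand-in for the vertical structure: predict each entry from the
    same gene in other cells (here simply the row mean)."""
    n_cols = len(x[0])
    return [[sum(row) / n_cols] * n_cols for row in x]


def rank_one_estimate(x):
    """Stand-in for the low-rank structure: the rank-one matrix built from
    row sums and column sums, scaled by the grand total."""
    n_rows, n_cols = len(x), len(x[0])
    row_sums = [sum(row) for row in x]
    col_sums = [sum(x[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_sums)
    if total == 0:
        return [[0.0] * n_cols for _ in range(n_rows)]
    return [[row_sums[i] * col_sums[j] / total for j in range(n_cols)]
            for i in range(n_rows)]


def combine_imputations(observed, weights):
    """Weighted average of the observed matrix and the three estimates.

    weights = (observed, gene-based, cell-based, rank-one); they are
    normalised to sum to one. In the paper the weights are learned;
    here they are supplied by the caller (an assumption of this sketch).
    """
    if len(weights) != 4:
        raise ValueError("expected four weights")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    norm = [w / total for w in weights]
    matrices = [
        observed,
        gene_based_estimate(observed),
        cell_based_estimate(observed),
        rank_one_estimate(observed),
    ]
    n_rows, n_cols = len(observed), len(observed[0])
    return [[sum(w * m[i][j] for w, m in zip(norm, matrices))
             for j in range(n_cols)] for i in range(n_rows)]


if __name__ == "__main__":
    # Toy 2 genes x 2 cells matrix with one zero that may be a dropout.
    toy = [[0.0, 2.0], [4.0, 6.0]]
    for row in combine_imputations(toy, (1, 1, 1, 1)):
        print(row)
```

Keeping the observed matrix as one of the averaged terms means that measured values still influence the output, which mirrors the abstract's description of the final result as a weighted average that includes the observed data.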