Shanghai Key Lab of Intelligent Information Processing, Shanghai, China.
School of Computer Science and Technology, Fudan University, Shanghai, China.
Bioinformatics. 2022 Jun 27;38(13):3377-3384. doi: 10.1093/bioinformatics/btac357.
Rapid advances in single-cell RNA sequencing technologies allow responses to external perturbations to be studied at the level of individual cells. However, in many cases it is hard to collect the perturbed cells, for example when the response of a cell type to a drug needs to be known before the drug is actually administered to a patient. In silico prediction could alleviate this problem and reduce cost. Although several tools have been developed, their prediction accuracy leaves much room for improvement.
In this article, we propose scPreGAN (Single-Cell data Prediction based on GAN), a deep generative model for predicting the response of single-cell expression to perturbation. scPreGAN integrates an autoencoder and a generative adversarial network: the former extracts information common to the unperturbed and perturbed data, and the latter predicts the perturbed data. Experiments on three real datasets show that scPreGAN outperforms three state-of-the-art methods. It can capture the complicated distribution of cell expression and generate predicted data with the same expression abundance as the real data.
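The data flow described above can be illustrated with a minimal, untrained sketch in plain Python. It is not the scPreGAN implementation: the layer sizes, the single linear layer per component, the one-generator-per-condition layout, and every name in the code (`encoder`, `gen_perturbed`, `disc_perturbed`, `predict_perturbed`, and so on) are assumptions made for illustration only. The sketch shows a shared encoder mapping an unperturbed cell into a latent space, a condition-specific generator decoding that latent code into a predicted perturbed profile, and a discriminator scoring the result.

```python
import math
import random

random.seed(0)

def linear(n_in, n_out):
    """Return a randomly initialised weight matrix (n_out rows, n_in columns)."""
    return [[random.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def apply(weights, x, activation=None):
    """Multiply a weight matrix by a vector, then apply an optional activation."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [activation(o) for o in out] if activation else out

def relu(v):
    return max(0.0, v)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Toy dimensions (hypothetical; real expression data has thousands of genes).
N_GENES, N_LATENT = 6, 3

# Shared encoder: maps cells from either condition into one latent space.
encoder = linear(N_GENES, N_LATENT)
# One generator (decoder) per condition; this layout is an assumption.
gen_control = linear(N_LATENT, N_GENES)
gen_perturbed = linear(N_LATENT, N_GENES)
# Discriminator for the perturbed condition: scores how "real" a profile looks.
disc_perturbed = linear(N_GENES, 1)

def reconstruct_control(x_control):
    """Autoencoder path: encode an unperturbed cell and decode it back."""
    z = apply(encoder, x_control, relu)
    return apply(gen_control, z, relu)

def predict_perturbed(x_control):
    """Prediction path: encode an unperturbed cell, decode as perturbed."""
    z = apply(encoder, x_control, relu)
    return apply(gen_perturbed, z, relu)

def discriminator_score(x):
    """Score in (0, 1); in adversarial training this would be pushed toward 1
    for real perturbed cells and toward 0 for generated ones."""
    return apply(disc_perturbed, x, sigmoid)[0]

# A made-up unperturbed expression profile (not real data).
x_control = [1.0, 0.0, 2.5, 0.3, 0.0, 1.2]
x_pred = predict_perturbed(x_control)
score = discriminator_score(x_pred)
```

Because the weights here are random, `x_pred` carries no biological meaning. In a trained model of this kind, reconstruction and adversarial losses would be minimised jointly so that the generated profiles become hard to distinguish from real perturbed cells.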
The implementation of scPreGAN is available via https://github.com/JaneJiayiDong/scPreGAN. To reproduce the results of this article, please visit https://github.com/JaneJiayiDong/scPreGAN-reproducibility.
Supplementary data are available at Bioinformatics online.