Yeh Chia-Hung, Chen Ze-Guang, Liou Cheng-Yue, Chen Mei-Juan
Department of Electrical Engineering, National Taiwan Normal University, Taipei 10610, Taiwan.
Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan.
Bioengineering (Basel). 2023 Aug 23;10(9):996. doi: 10.3390/bioengineering10090996.
Predicting cellular responses to perturbations is an unsolved problem in biology. Traditional approaches assume that different cell types respond similarly to perturbations. However, this assumption ignores the context of genome interactions in different cell types, which compromises prediction quality. More recently, deep learning models that discover gene-gene relationships can yield more accurate predictions of cellular responses. Yet the large differences in biological information between cell types make it difficult for deep learning models to encode data into a continuous low-dimensional feature space, so the features captured by the latent space may not be continuous. As a result, the mapping the model learns between the two condition spaces is valid only where the real reference data reside, and target cells are mapped incorrectly because they do not lie in the same domain as the reference data. In this paper, we propose an information-navigated variational autoencoder (INVAE), a deep neural network for cell perturbation response prediction. INVAE filters out information that is not conducive to predictive performance. From the remaining information, INVAE constructs a homogeneous space of control conditions and finds the mapping between the control condition space and the perturbation condition space. By embedding the target unit into the control space and then mapping it to the perturbation space, we can predict the perturbed state of the target unit. In comparisons with three other state-of-the-art methods on three real datasets, INVAE outperforms existing methods in predicting cell state after perturbation. Furthermore, we demonstrate that filtering out useless information not only improves prediction accuracy but also reveals similarities in how genes in different cell types are regulated following perturbation.
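As a rough illustration of the embed-then-map idea described in the abstract, the sketch below embeds cells into a shared latent space, estimates a control-to-perturbation shift from a reference cell type, and applies that shift to target control cells. This is not INVAE: it swaps in a linear PCA encoder and a mean latent shift as stand-ins, and it does not reproduce the information-filtering step or the learned mapping. The function names `fit_linear_encoder` and `predict_perturbed`, and all parameters, are hypothetical.

```python
import numpy as np


def fit_linear_encoder(x, n_latent):
    """Fit a PCA-style linear encoder/decoder pair.

    Assumption: this stands in for a trained autoencoder; it is not the
    network described in the abstract.
    """
    mean = x.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    components = vt[:n_latent]

    def encode(cells):
        return (cells - mean) @ components.T

    def decode(z):
        return z @ components + mean

    return encode, decode


def predict_perturbed(ref_ctrl, ref_pert, target_ctrl, n_latent=2):
    """Predict perturbed expression for target cells (cells x genes arrays).

    Steps: embed all cells, estimate the mean latent shift between the
    reference control and reference perturbed cells, add that shift to the
    embedded target control cells, and decode back to gene space.
    """
    all_cells = np.vstack([ref_ctrl, ref_pert, target_ctrl])
    encode, decode = fit_linear_encoder(all_cells, n_latent)
    shift = encode(ref_pert).mean(axis=0) - encode(ref_ctrl).mean(axis=0)
    return decode(encode(target_ctrl) + shift)
```

A simple mean shift assumes the perturbation acts the same way everywhere in latent space, which is the kind of assumption the abstract argues breaks down across dissimilar cell types; the sketch is only meant to make the control-space and perturbation-space vocabulary concrete.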