School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China.
Key Laboratory of Big Data Storage and Management, MIIT, Ministry of Industry and Information Technology, Xi'an 710129, Shaanxi, China.
Brief Funct Genomics. 2024 Mar 20;23(2):118-127. doi: 10.1093/bfgp/elac056.
Analysis of cell-cell communication (CCC) in the tumor microenvironment helps decipher the mechanisms underlying cancer progression and drug tolerance. Single-cell RNA-seq data are now available on a large scale, providing an unprecedented opportunity to predict cellular communication. Considerable progress has been made in inferring CCC from known interactions between molecules such as ligands, receptors and extracellular matrix components. However, this prior knowledge is incomplete and covers only a fraction of cellular communications, which leads to many false-positive and false-negative results. To address this, we propose an improved hierarchical variational autoencoder (HiVAE) based model that makes full use of single-cell RNA-seq data to estimate CCC automatically. Specifically, the HiVAE model learns latent representations of cells from two inputs, the known ligand-receptor genes and all genes in the single-cell RNA-seq data, and the two representations are then combined by cascade integration. Transfer entropy is then computed on the learned representations to measure the information flow between two cells, and this flow is interpreted as a directed communication relationship. Experiments are conducted on single-cell RNA-seq data from a human skin disease dataset and a melanoma dataset. The results show that the HiVAE model learns cell representations effectively and that transfer entropy can be used to estimate communication scores between cell types.
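To make the directed nature of transfer entropy concrete, the following is a minimal sketch of a plug-in estimator on two discrete sequences. It is not the authors' implementation: the abstract does not state the estimator, the history length, or how the learned HiVAE representations are turned into sequences. The function name `transfer_entropy`, the history length of 1, and the toy binary inputs are all assumptions made for illustration. The quantity estimated is TE(X -> Y) = sum over (y_next, y_prev, x_prev) of p(y_next, y_prev, x_prev) * log2[ p(y_next | y_prev, x_prev) / p(y_next | y_prev) ].

```python
import math
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits.

    Assumptions (not from the paper): both inputs are equal-length
    sequences of discrete symbols, and the history length is 1.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    if len(target) < 2:
        raise ValueError("need at least two time points")

    # Each triple is (target at t+1, target at t, source at t).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)

    count_full = Counter(triples)                                # (y_next, y_prev, x_prev)
    count_prev = Counter((yp, xp) for _, yp, xp in triples)      # (y_prev, x_prev)
    count_self = Counter((yn, yp) for yn, yp, _ in triples)      # (y_next, y_prev)
    count_y = Counter(yp for _, yp, _ in triples)                # y_prev

    te = 0.0
    for (yn, yp, xp), c in count_full.items():
        p_joint = c / n
        p_given_both = c / count_prev[(yp, xp)]          # p(y_next | y_prev, x_prev)
        p_given_self = count_self[(yn, yp)] / count_y[yp]  # p(y_next | y_prev)
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te


# Toy input: a binary de Bruijn cycle, so every 3-symbol window is equally frequent.
cycle = [0, 0, 0, 1, 0, 1, 1, 1]
s = cycle * 50 + cycle[:2]
x = s[1:]    # hypothetical "sender" signal
y = s[:-1]   # hypothetical "receiver" signal: y[t + 1] == x[t]

te_forward = transfer_entropy(x, y)  # x drives y, so this is 1 bit on this input
te_reverse = transfer_entropy(y, x)  # y adds nothing about x's future, so this is 0
print(te_forward, te_reverse)
```

On this toy input the score is asymmetric: the receiver copies the sender with a one-step lag, so information flows only from `x` to `y`. This asymmetry is what allows a transfer-entropy score to be read as a directed relationship. Applying the idea to continuous latent vectors would additionally require a discretization or a continuous estimator, which this sketch does not cover.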