Wang Kai, Li Yulong, Liu Fei, Luan Xiaoli, Wang Xinglong, Zhou Jingwen
Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University, 1800 Lihu Road, Wuxi, 214122, Jiangsu, China.
Science Center for Future Foods, Jiangnan University, 1800 Lihu Road, Wuxi, 214122, Jiangsu, China.
BMC Bioinformatics. 2025 Apr 18;26(1):108. doi: 10.1186/s12859-025-06116-1.
A gene regulatory network (GRN) is a graph-level representation that describes the regulatory relationships between transcription factors and target genes in cells. The reconstruction of GRNs can help investigate cellular dynamics, drug design, and metabolic systems, and the rapid development of single-cell RNA sequencing (scRNA-seq) technology provides important opportunities while posing significant challenges for reconstructing GRNs. A number of methods for inferring GRNs have been proposed in recent years based on traditional machine learning and deep learning algorithms. However, inferring the GRN from scRNA-seq data remains challenging owing to cellular heterogeneity, measurement noise, and data dropout.
In this study, we propose a deep learning model called graph representational learning GRN (GRLGRN) to infer latent regulatory dependencies between genes from a prior GRN and single-cell gene expression profiles. GRLGRN uses a graph transformer network to extract implicit links from the prior GRN, and encodes gene features using both the adjacency matrix of implicit links and the gene expression profile matrix. It then applies attention mechanisms to improve feature extraction and feeds the refined gene embeddings into an output module to infer gene regulatory relationships. To evaluate GRLGRN, we compared it with prevalent models and performed ablation experiments on seven cell-line datasets with three ground-truth networks. GRLGRN achieved the best AUROC on 78.6% of the datasets and the best AUPRC on 80.9% of them, with average improvements of 7.3% in AUROC and 30.7% in AUPRC. We also provide an interpretation discussion and a network visualization.
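The idea of extracting implicit links from a prior GRN can be illustrated with a minimal sketch. It follows the general graph transformer network recipe (softmax-weighted selection over candidate adjacency matrices, followed by matrix multiplication to compose two-hop links), not the authors' implementation. The toy three-gene network, the choice of candidate matrices (prior, its transpose, and the identity), and the selection scores are all assumptions made for illustration; in a trained model the scores would be learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_select(adjs, scores):
    """Convex combination of candidate adjacency matrices (one soft selection)."""
    weights = softmax(scores)
    n = len(adjs[0])
    return [[sum(w * a[i][j] for w, a in zip(weights, adjs)) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Hypothetical prior GRN over 3 genes: gene 0 regulates gene 1, gene 1 regulates gene 2.
prior = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

# Assumed candidate edge types: forward regulation, reverse direction, self-loops.
candidates = [prior, transpose(prior), identity]

# Hypothetical selection scores (learned parameters in a real model).
scores_first = [2.0, 0.0, 0.0]
scores_second = [2.0, 0.0, 0.0]

# Composing two soft selections yields weighted two-hop (implicit) links.
implicit = matmul(soft_select(candidates, scores_first),
                  soft_select(candidates, scores_second))

for row in implicit:
    print([round(v, 3) for v in row])
```

In this toy case the prior has no direct edge from gene 0 to gene 2, yet the composed matrix assigns that pair a positive weight, which is the sense in which such a link is "implicit".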
The experimental results and case studies show that GRLGRN performs well in predicting gene interactions and that it offers interpretability for the prediction task, for example by identifying hub genes in the network and uncovering implicit links.