Computer Network Information Center, Chinese Academy of Sciences, CAS Informatization Plaza No. 2 Dong Sheng Nan Lu, Haidian District, Beijing, 100083, China.
University of Chinese Academy of Sciences, No. 19A Yuquan Road, Shijingshan District, Beijing, 100049, China.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae361.
Constructing accurate gene regulatory networks (GRNs), which reflect the dynamic regulatory processes between genes, is critical to understanding diverse cellular processes and unveiling the complexity of biological systems. With the development of computer science, computational approaches have been applied to the GRN inference task. However, current methods struggle to make effective use of existing topological information and prior knowledge of gene regulatory relationships, which hinders the comprehensive understanding and accurate reconstruction of GRNs. In response, we propose MTLGRN, a novel graph neural network (GNN)-based multi-task learning framework for GRN reconstruction. Specifically, we first encode the gene promoter sequences and the gene biological features and concatenate the corresponding feature representations. We then construct a multi-task learning framework comprising three tasks: GRN reconstruction, gene knockout prediction, and gene expression matrix reconstruction. Through joint training, MTLGRN optimizes the gene latent representations by integrating gene knockout information, promoter characteristics, and other biological attributes. Extensive experiments show that MTLGRN outperforms state-of-the-art baselines on the GRN reconstruction task, indicating that it leverages biological knowledge effectively and captures gene regulatory relationships more comprehensively. MTLGRN is also a pioneering attempt to simulate gene knockouts on bulk data by incorporating gene knockout information.
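The joint training described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoders, the task heads, the loss functions, or the task weights, so everything below is an assumption made for illustration: the function names, the inner-product edge decoder, the linear heads with mean squared error, and the weights in `lambdas` are all hypothetical, and the concatenated vectors simply stand in for the latent representations that a GNN would produce.

```python
import math
import random


def concat_features(promoter_emb, bio_emb):
    """Concatenate per-gene promoter and biological feature vectors."""
    return [p + b for p, b in zip(promoter_emb, bio_emb)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for a single predicted edge probability."""
    return -(label * math.log(prob + eps)
             + (1 - label) * math.log(1 - prob + eps))


def grn_loss(z, edges):
    """Edge-prediction loss (assumed form): inner-product decoder + BCE.

    edges is a list of (regulator_index, target_index, label) triples.
    """
    return sum(bce(sigmoid(dot(z[i], z[j])), y) for i, j, y in edges) / len(edges)


def linear_head_mse(z, weights, targets):
    """MSE of a linear head (assumed form): one weight vector per output."""
    total, count = 0.0, 0
    for zg, tg in zip(z, targets):
        for w, t in zip(weights, tg):
            total += (dot(w, zg) - t) ** 2
            count += 1
    return total / count


def joint_loss(z, edges, w_ko, ko_targets, w_expr, expr_targets,
               lambdas=(1.0, 0.5, 0.5)):
    """Weighted sum of the three task losses; the weights are hypothetical."""
    l_grn = grn_loss(z, edges)
    l_ko = linear_head_mse(z, w_ko, ko_targets)
    l_expr = linear_head_mse(z, w_expr, expr_targets)
    total = lambdas[0] * l_grn + lambdas[1] * l_ko + lambdas[2] * l_expr
    return total, (l_grn, l_ko, l_expr)


# Toy data: 4 genes, 3-d promoter encoding, 2-d biological features, 5 samples.
random.seed(0)
n_genes, d_prom, d_bio, n_samples = 4, 3, 2, 5
promoter = [[random.gauss(0, 1) for _ in range(d_prom)] for _ in range(n_genes)]
bio = [[random.gauss(0, 1) for _ in range(d_bio)] for _ in range(n_genes)]
z = concat_features(promoter, bio)  # stand-in for GNN latent representations

edges = [(0, 1, 1), (1, 2, 0), (2, 3, 1)]  # (regulator, target, label)
d = d_prom + d_bio
w_ko = [[random.gauss(0, 0.1) for _ in range(d)]]  # one knockout-effect output
ko_targets = [[random.gauss(0, 1)] for _ in range(n_genes)]
w_expr = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(n_samples)]
expr_targets = [[random.gauss(0, 1) for _ in range(n_samples)]
                for _ in range(n_genes)]

total, parts = joint_loss(z, edges, w_ko, ko_targets, w_expr, expr_targets)
print("total:", total, "per-task:", parts)
```

In a real training loop the shared representations and the head parameters would be updated by gradient descent on `total`, which is how the auxiliary tasks can shape the representations used for GRN reconstruction. How MTLGRN actually balances its three tasks is not stated in the abstract.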