Double optimal transport for differential gene regulatory network inference with unpaired samples.

Author information

Li Mengyu, Zhu Bencong, Meng Cheng, Fan Xiaodan

Affiliations

Institute of Statistics and Big Data, Renmin University of China, Beijing 100872, China.

Department of Statistics, The Chinese University of Hong Kong, Hong Kong 999077, China.

Publication information

Bioinformatics. 2025 Aug 2;41(8). doi: 10.1093/bioinformatics/btaf352.

Abstract

MOTIVATION

Inferring differential gene regulatory networks (GRNs) between different conditions from gene expression profiles remains a significant challenge. Current GRN inference approaches are limited by either scalability in large networks or accuracy in high-dimensional scenarios. Furthermore, most existing methods require paired samples for comparative GRN analyses.

RESULTS

To overcome these challenges, we model gene regulation as a distribution transportation problem and propose double optimal transport (OT), an efficient and effective method that reconstructs differential GRNs from the perspective of OT theory and is applicable to unpaired samples. Double OT is a novel two-level OT framework. It first aligns unpaired samples by solving a partial OT problem at the sample level, and then infers GRNs from the aligned samples by solving a robust OT problem at the gene level. Comprehensive simulation studies demonstrate that double OT is more efficient and more accurate than state-of-the-art methods across networks of different scales. We also apply the proposed method to a gastric cancer dataset and identify the proto-oncogene MET as a central node in the gastric cancer GRN. The crucial role of MET in early oncogenesis and its potential as a therapeutic target further validate our approach and enhance our understanding of the regulatory mechanisms of gastric cancer.
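The sample-level alignment idea can be illustrated with a small sketch. This is not the authors' implementation and does not use the ot-grn library: it swaps in balanced entropic OT (Sinkhorn iterations) for the partial OT problem described above, and the gene-level robust OT step is not shown. All function names, the toy expression values, and the regularization strength `reg` are assumptions made for illustration only.

```python
import math


def squared_euclidean_cost(xs, ys):
    """Pairwise squared Euclidean distances between two lists of samples."""
    return [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]


def sinkhorn_plan(cost, reg=0.5, n_iter=200):
    """Entropic balanced OT with uniform marginals (a stand-in for partial OT)."""
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # uniform mass on source samples
    b = [1.0 / m] * m  # uniform mass on target samples
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def barycentric_projection(plan, ys):
    """Map each source sample to the plan-weighted mean of the target samples."""
    dim = len(ys[0])
    projected = []
    for row in plan:
        mass = sum(row)
        projected.append(
            [sum(w * y[k] for w, y in zip(row, ys)) / mass for k in range(dim)]
        )
    return projected


# Hypothetical toy data: 2 control and 4 case samples measured on 2 genes.
control = [[0.0, 0.0], [4.0, 4.0]]
case = [[0.2, 0.1], [-0.1, 0.3], [4.1, 3.8], [3.9, 4.2]]

plan = sinkhorn_plan(squared_euclidean_cost(control, case))
aligned = barycentric_projection(plan, case)  # one pseudo-paired case profile per control
```

Each row of `aligned` is a case-condition profile matched to the corresponding control sample, so the two conditions can then be compared gene by gene even though the original samples were unpaired. A partial OT formulation, as used in the paper, would additionally allow some samples to remain unmatched instead of forcing all mass to be transported.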

AVAILABILITY AND IMPLEMENTATION

A Python library that implements the proposed method is available at https://github.com/Mengyu8042/ot-grn.

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fea0/12342166/88f9966419a6/btaf352f1.jpg
