iLSGRN: inference of large-scale gene regulatory networks based on multi-model fusion.

Affiliations

School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China.

Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong 999077, China.

Publication information

Bioinformatics. 2023 Oct 3;39(10). doi: 10.1093/bioinformatics/btad619.

Abstract

MOTIVATION

Gene regulatory networks (GRNs) describe the interactions between genes and help reveal the biological mechanisms at work in the cell. Reconstructing GRNs from gene expression data has long been a central computational problem in systems biology. However, because large-scale GRNs are high-dimensional and non-linear, inferring them accurately and efficiently remains a challenging task.

RESULTS

In this article, we propose a new approach, iLSGRN, that reconstructs large-scale GRNs from steady-state and time-series gene expression data based on non-linear ordinary differential equations. First, a regulatory gene recognition algorithm calculates the Maximal Information Coefficient between genes and excludes redundant regulatory relationships, which reduces the dimensionality of the problem. Then, a feature fusion algorithm builds a model from the feature importances derived from XGBoost (eXtreme Gradient Boosting) and RF (Random Forest) models; this model can effectively train the non-linear ordinary differential equation model of GRNs and improves the accuracy and stability of the inference algorithm. Extensive experiments on datasets of different scales show that our method improves appreciably on state-of-the-art methods. Furthermore, we perform cross-validation experiments on real gene datasets to validate the robustness and effectiveness of the proposed method.
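
The two-stage idea described above (filter candidate regulators, then fuse importance scores from two models) can be illustrated with a minimal sketch. This is not the iLSGRN implementation: the abstract does not state the filtering rule or the fusion rule, so the dependence threshold, the normalise-then-weighted-average fusion, the function names, and all numbers below are assumptions made purely for illustration. In practice the dependence scores would come from a Maximal Information Coefficient computation and the importances from trained XGBoost and Random Forest models; here they are hard-coded placeholders.

```python
from typing import Dict, List


def filter_candidates(dependence: Dict[str, float], threshold: float) -> List[str]:
    """Keep regulators whose dependence score with the target gene
    reaches the threshold (a hypothetical stand-in for MIC-based filtering)."""
    return [gene for gene, score in dependence.items() if score >= threshold]


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        return {gene: 0.0 for gene in scores}
    return {gene: score / total for gene, score in scores.items()}


def fuse_importances(
    imp_a: Dict[str, float],
    imp_b: Dict[str, float],
    candidates: List[str],
    weight_a: float = 0.5,
) -> Dict[str, float]:
    """Fuse two importance vectors over the candidate regulators.

    Assumed rule: normalise each vector, then take a weighted average.
    """
    a = normalize({gene: imp_a.get(gene, 0.0) for gene in candidates})
    b = normalize({gene: imp_b.get(gene, 0.0) for gene in candidates})
    return {gene: weight_a * a[gene] + (1.0 - weight_a) * b[gene] for gene in candidates}


# Placeholder dependence scores between three regulators and one target gene.
dependence = {"G1": 0.8, "G2": 0.1, "G3": 0.5}
candidates = filter_candidates(dependence, threshold=0.3)

# Placeholder importances, standing in for two trained tree-ensemble models.
imp_model_a = {"G1": 3.0, "G2": 1.0, "G3": 1.0}
imp_model_b = {"G1": 0.2, "G2": 0.5, "G3": 0.3}

fused = fuse_importances(imp_model_a, imp_model_b, candidates)
ranking = sorted(fused, key=fused.get, reverse=True)
print(candidates, ranking)
```

Normalising before averaging keeps one model's importance scale from dominating the other; whether iLSGRN does this, or weights the models differently, should be checked against the repository linked under Availability and Implementation.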

AVAILABILITY AND IMPLEMENTATION

The proposed method is implemented in Python and is available at: https://github.com/lab319/iLSGRN.
