Time lagged information theoretic approaches to the reverse engineering of gene regulatory networks.

Affiliation

School of Computing, The University of Southern Mississippi, MS 39402, USA.

Publication information

BMC Bioinformatics. 2010 Oct 7;11 Suppl 6(Suppl 6):S19. doi: 10.1186/1471-2105-11-S6-S19.

DOI:10.1186/1471-2105-11-S6-S19

PMID:20946602

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3026366/

Abstract

BACKGROUND

A number of models and algorithms have been proposed for gene regulatory network (GRN) inference; however, none of them addresses the effect of the size of time-series microarray expression data, measured as the number of time points. In this paper, we study this problem by analyzing the behaviour of three algorithms based on information theory and dynamic Bayesian network (DBN) models. These algorithms were run on data sets of different sizes generated from synthetic networks. Experiments show that the inference accuracy of these algorithms saturates beyond a specific data size, because the pair-wise mutual information (MI) metric itself saturates; hence there is a theoretical limit on the inference accuracy of information-theory-based schemes that depends on the number of time points of microarray data used to infer GRNs. This suggests that MI might not be the best metric for GRN inference algorithms. To circumvent the limitations of the MI metric, we introduce a new method of computing time lags between any pair of genes and present the pair-wise time lagged Mutual Information (TLMI) and time lagged Conditional Mutual Information (TLCMI) metrics. We then use these new metrics to propose novel GRN inference schemes that provide higher inference accuracy as measured by precision and recall.
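The idea of a pair-wise time lagged MI can be illustrated with a minimal Python sketch. The abstract does not describe how the authors compute time lags, so everything here is an assumption made for illustration: expression values are taken to be already discretized (binary), MI is estimated with a simple plug-in (frequency count) estimator, and the hypothetical helper `best_lag` simply picks the lag with the highest MI. It is not the paper's method.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of MI (in bits) between two discrete sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("sequences must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), expressed with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def time_lagged_mi(regulator, target, lag):
    """MI between the regulator at time t and the target at time t + lag."""
    if len(regulator) != len(target):
        raise ValueError("series must have the same number of time points")
    if lag < 0 or lag >= len(regulator):
        raise ValueError("lag must satisfy 0 <= lag < number of time points")
    if lag == 0:
        return mutual_information(regulator, target)
    return mutual_information(regulator[:-lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Hypothetical lag choice: the lag in 0..max_lag with the highest MI."""
    scores = {lag: time_lagged_mi(regulator, target, lag)
              for lag in range(max_lag + 1)}
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Toy data (invented): the target copies the regulator two time points later.
regulator = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
target = [0, 0] + regulator[:-2]
print(best_lag(regulator, target, max_lag=3))  # -> (2, 1.0)
```

In this toy case the zero-lag MI is below 1 bit while the lag-2 MI reaches the full 1 bit, which shows why aligning the two series by a lag can expose a dependency that plain MI understates.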

RESULTS

Beyond a certain number of time points (i.e., a specific size) of microarray data, the performance of the algorithms, measured in terms of the recall-to-precision ratio, saturated because the calculated pair-wise MI metric saturates with increasing data size. The proposed algorithms were compared to existing approaches on four different biological networks. The resulting networks were evaluated using the benchmark precision and recall metrics, and the results favour our approach.
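For reference, precision and recall for an inferred network can be computed by comparing predicted edges with the edges of a known network. The sketch below assumes directed edges represented as `(regulator, target)` tuples; the function name and the gene names are hypothetical, and the convention of returning 0.0 for an empty edge set is a choice made here, not something stated in the source.

```python
def precision_recall(predicted_edges, true_edges):
    """Precision and recall of predicted directed edges against a reference."""
    predicted = set(predicted_edges)
    truth = set(true_edges)
    true_positives = len(predicted & truth)
    # Convention assumed here: an empty set yields 0.0 instead of an error.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Invented example network with four true edges and three predictions.
true_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
predicted_edges = [("A", "B"), ("B", "C"), ("D", "A")]
precision, recall = precision_recall(predicted_edges, true_edges)
print(precision, recall)
```

Here two of the three predicted edges are correct (precision 2/3) and two of the four true edges are recovered (recall 1/2).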

CONCLUSIONS

To alleviate the effects of data size on information-theory-based GRN inference algorithms, novel time-lag-based information theoretic approaches to inferring gene regulatory networks have been proposed. The results show that the time lags of regulatory effects between any pair of genes play an important role in GRN inference schemes.
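The abstract names a time lagged conditional MI (TLCMI) metric without defining it. One plausible reading, sketched below purely as an assumption, is the conditional mutual information I(X at t - lag_xy; Y at t | Z at t - lag_zy), which drops to zero when the dependency of Y on X is fully explained by an intermediate gene Z. The function names, the lag parameters, and the plug-in estimator on discretized data are all hypothetical choices for illustration.

```python
import math
from collections import Counter


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in bits for discrete sequences."""
    n = len(xs)
    if n == 0 or len(ys) != n or len(zs) != n:
        raise ValueError("sequences must be non-empty and of equal length")
    count_z = Counter(zs)
    count_xz = Counter(zip(xs, zs))
    count_yz = Counter(zip(ys, zs))
    count_xyz = Counter(zip(xs, ys, zs))
    cmi = 0.0
    for (x, y, z), c in count_xyz.items():
        # p(x,y,z) * log2(p(z) p(x,y,z) / (p(x,z) p(y,z))), using raw counts
        cmi += (c / n) * math.log2(
            c * count_z[z] / (count_xz[(x, z)] * count_yz[(y, z)]))
    return cmi


def time_lagged_cmi(x, y, z, lag_xy, lag_zy):
    """I(X[t - lag_xy]; Y[t] | Z[t - lag_zy]) over all valid time points t."""
    n = len(y)
    if len(x) != n or len(z) != n:
        raise ValueError("series must have the same number of time points")
    if lag_xy < 0 or lag_zy < 0:
        raise ValueError("lags must be non-negative")
    start = max(lag_xy, lag_zy)
    if start >= n:
        raise ValueError("lags leave no overlapping time points")
    xs = x[start - lag_xy:n - lag_xy]
    ys = y[start:]
    zs = z[start - lag_zy:n - lag_zy]
    return conditional_mutual_information(xs, ys, zs)


# Invented chain X -> Z -> Y: Z copies X one step later, Y copies Z one step later.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
z = [0] + x[:-1]
y = [0] + z[:-1]
print(time_lagged_cmi(x, y, z, lag_xy=2, lag_zy=1))  # -> 0.0
```

In this toy chain the lagged dependency between X and Y vanishes once Z is conditioned on, which is how a conditional metric could help separate direct from indirect regulation.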

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c96/3026366/87f2f2c0e1ea/1471-2105-11-S6-S19-1.jpg
