Time lagged information theoretic approaches to the reverse engineering of gene regulatory networks.

Affiliation

School of Computing, The University of Southern Mississippi, MS 39402, USA.

Publication information

BMC Bioinformatics. 2010 Oct 7;11 Suppl 6(Suppl 6):S19. doi: 10.1186/1471-2105-11-S6-S19.

Abstract

BACKGROUND

A number of models and algorithms have been proposed for gene regulatory network (GRN) inference; however, none of them addresses how the size of time-series microarray expression data, measured as the number of time points, affects inference. In this paper, we study this problem by analyzing the behaviour of three algorithms based on information theory and dynamic Bayesian network (DBN) models. These algorithms were run on data sets of different sizes generated by synthetic networks. Experiments show that the inference accuracy of these algorithms saturates beyond a specific data size, because the pair-wise mutual information (MI) metric itself saturates; hence there is a theoretical limit on the inference accuracy of information theory based schemes that depends on the number of time points of microarray data used to infer GRNs. This suggests that MI might not be the best metric for GRN inference algorithms. To circumvent the limitations of the MI metric, we introduce a new method of computing time lags between any pair of genes and present the pair-wise time lagged Mutual Information (TLMI) and time lagged Conditional Mutual Information (TLCMI) metrics. We then use these new metrics to propose novel GRN inference schemes that provide higher inference accuracy as measured by precision and recall.
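To make the idea of a time lagged MI concrete, the following is a minimal sketch, not the authors' implementation. It assumes expression values have already been discretized and that the lag is given as a fixed non-negative integer; the abstract does not describe how lags are computed, so the function names `mutual_information` and `time_lagged_mi` and the lag handling are illustrative assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of MI (in bits) between two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def time_lagged_mi(regulator, target, lag):
    """MI between regulator[t] and target[t + lag] (hypothetical helper).

    The lag is assumed to be known; how it is chosen is outside this sketch.
    """
    if lag < 0 or lag >= len(regulator):
        raise ValueError("lag must satisfy 0 <= lag < len(regulator)")
    if lag == 0:
        return mutual_information(regulator, target)
    # Align each regulator value with the target value `lag` steps later.
    return mutual_information(regulator[:-lag], target[lag:])


# Toy data: the target copies the regulator two time points later.
regulator = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
target = [0, 0] + regulator[:-2]
```

On this toy data the lag-2 estimate is higher than the lag-0 estimate, which is the effect a time lagged metric is meant to capture: the dependence between the two genes is only visible once the series are shifted against each other.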

RESULTS

Beyond a certain number of time points (i.e., a specific size) of microarray data, the performance of the algorithms, measured in terms of the recall-to-precision ratio, saturated because the calculated pair-wise MI metric saturated with increasing data size. The proposed algorithms were compared to existing approaches on four different biological networks. The resulting networks were evaluated using the benchmark precision and recall metrics, and the results favour our approach.
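As an illustration of the evaluation metrics, here is a minimal sketch of precision and recall for an inferred network. It assumes a network is represented as a set of directed (regulator, target) edges; the function name `precision_recall` and the example gene names are hypothetical and not taken from the paper.

```python
def precision_recall(inferred_edges, true_edges):
    """Return (precision, recall) of inferred directed edges against a reference network."""
    inferred = set(inferred_edges)
    truth = set(true_edges)
    true_positives = len(inferred & truth)
    # Precision: fraction of inferred edges that are in the reference network.
    precision = true_positives / len(inferred) if inferred else 0.0
    # Recall: fraction of reference edges that were recovered.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical reference and inferred networks over genes A-D.
true_edges = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")}
inferred_edges = {("A", "B"), ("B", "C"), ("D", "A")}
precision, recall = precision_recall(inferred_edges, true_edges)
```

Here two of the three inferred edges are correct and two of the four reference edges are recovered, so precision is 2/3 and recall is 1/2.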

CONCLUSIONS

To alleviate the effects of data size on information theory based GRN inference algorithms, we have proposed novel time lag based information theoretic approaches to inferring gene regulatory networks. The results show that the time lags of regulatory effects between any pair of genes play an important role in GRN inference schemes.
