Almudevar Anthony
Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Avenue, Rochester, NY 14642, USA.
EURASIP J Bioinform Syst Biol. 2009;2009(1):878013. doi: 10.1155/2009/878013. Epub 2010 Mar 4.
Reconstruction of gene regulatory networks from experimental data usually relies on statistical evidence, which requires choosing a statistical threshold that defines a significant biological effect. Approaches to this problem in the literature range from rigorous multiple testing procedures to ad hoc P-value cut-offs. However, when the data imply graphical structure, it should be possible to exploit this feature in the threshold selection process. In this article we propose a procedure based on this principle. Using coding theory, we devise a measure of graphical structure that responds to features such as highly connected nodes or chain structure. The measure for a particular graph can be compared with that of a random graph, and structure inferred on that basis. By varying the statistical threshold, the maximum deviation from random structure can be estimated, and the threshold is then chosen accordingly. A global test for graph structure follows naturally.
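The threshold selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the article's procedure: the abstract does not specify the coding-theoretic measure, so the sketch substitutes a simple stand-in, the number of bits needed to encode the degree sequence under its own empirical distribution. The random-graph reference (graphs with the same number of nodes and edges, edges placed uniformly), the z-score used as the deviation, and all function names (`edges_below`, `degree_code_length`, `structure_deviation`, `select_threshold`) are assumptions made for illustration only.

```python
import math
import random
import statistics
from collections import Counter


def edges_below(pvals, threshold):
    """Undirected edges (i, j), i < j, whose p-value falls below the threshold."""
    n = len(pvals)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if pvals[i][j] < threshold]


def degree_code_length(n, edges):
    """Stand-in structure measure (an assumption, not the article's measure):
    bits to encode the degree sequence with a code matched to its empirical
    distribution. Hub-dominated graphs give short codes."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    counts = Counter(deg)
    return -sum(c * math.log2(c / n) for c in counts.values())


def random_edges(n, m, rng):
    """Sample m distinct edges uniformly from all pairs of n nodes."""
    all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return rng.sample(all_pairs, m)


def structure_deviation(n, edges, n_random, rng):
    """z-score of the observed code length against random graphs with the same
    edge count. Positive values mean the observed graph is more compressible,
    i.e. more structured, than the random reference."""
    observed = degree_code_length(n, edges)
    sims = [degree_code_length(n, random_edges(n, len(edges), rng))
            for _ in range(n_random)]
    sd = statistics.stdev(sims)
    return (statistics.mean(sims) - observed) / sd if sd > 0 else 0.0


def select_threshold(pvals, thresholds, n_random=200, seed=0):
    """Scan candidate thresholds and return the one with the largest deviation
    from random structure, together with all scores."""
    rng = random.Random(seed)
    n = len(pvals)
    scores = {}
    for t in thresholds:
        edges = edges_below(pvals, t)
        # An empty graph carries no structure to measure.
        scores[t] = structure_deviation(n, edges, n_random, rng) if edges else 0.0
    best = max(scores, key=scores.get)
    return best, scores
```

Here `pvals` is assumed to be a symmetric matrix of pairwise P-values (a list of lists), and `select_threshold(pvals, [0.01, 0.05, 0.1])` returns the candidate threshold with the largest score along with the score for each candidate. A z-score is only one way to express deviation from random structure; the article's own measure and its global test may differ.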