Just Winfried
Department of Mathematics, Ohio University, Athens, OH 45701, USA.
Ann N Y Acad Sci. 2007 Dec;1115:142-53. doi: 10.1196/annals.1407.008. Epub 2007 Oct 9.
Data sets used in the reverse engineering of biochemical networks usually contain relatively few high-dimensional data points, which in general makes the problem vastly underdetermined. It is therefore important to estimate the probability that a given algorithm will return a model of acceptable quality when run on a data set of small size but high dimension. We propose a mathematical framework for investigating such questions. We then demonstrate that, without assuming any prior biological knowledge, in general no theoretical distinction between the performance of different algorithms can be made. We also give an example of how expected algorithm performance can in principle be altered by utilizing certain features of the data collection protocol. We conclude with some examples of theorems that were proven within the proposed framework.