Van Leemput Koenraad, Van den Bulcke Tim, Dhollander Thomas, De Moor Bart, Marchal Kathleen, van Remortel Piet
ISLab (Intelligent Systems Lab), Universiteit Antwerpen, Antwerpen, Belgium.
Artif Life. 2008 Winter;14(1):49-63. doi: 10.1162/artl.2008.14.1.49.
The development of structure-learning algorithms for gene regulatory networks depends heavily on the availability of synthetic data sets that contain both the original network and associated expression data. This article reports the application of SynTReN, an existing network generator that samples topologies from existing biological networks and uses Michaelis-Menten and Hill enzyme kinetics to simulate gene interactions. By applying three well-known inference algorithms to SynTReN data sets, we illustrate how different aspects of the expression data affect the quality of the inferred network. The tested expression data parameters are network size, network topology, type and degree of noise, quantity of expression data, and interaction types between genes. The results show the power of synthetic data in revealing operational characteristics of inference algorithms that are unlikely to be discovered by means of biological microarray data only.
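As a rough illustration of the kinetics named in the abstract, the sketch below implements the textbook Hill activation and repression functions, with Michaelis-Menten as the special case of Hill coefficient 1, plus additive Gaussian noise. It is not SynTReN's actual implementation: the function names, the default parameters `k`, `n`, and `sigma`, and the choice of noise model are all assumptions made for this example.

```python
import random


def hill_activation(x, k=0.5, n=2.0):
    """Textbook Hill function: fraction of maximal rate for activator level x.

    k is the half-saturation constant and n the Hill coefficient
    (both hypothetical defaults, not values taken from SynTReN).
    """
    if x < 0 or k <= 0:
        raise ValueError("x must be non-negative and k must be positive")
    return x ** n / (k ** n + x ** n)


def hill_repression(x, k=0.5, n=2.0):
    """Complementary Hill function: the rate falls as repressor level x rises."""
    if x < 0 or k <= 0:
        raise ValueError("x must be non-negative and k must be positive")
    return k ** n / (k ** n + x ** n)


def michaelis_menten(x, k=0.5):
    """Michaelis-Menten kinetics, i.e. the Hill function with n = 1."""
    return hill_activation(x, k=k, n=1.0)


def noisy_expression(value, sigma=0.1, rng=None):
    """Add Gaussian noise to an expression value, clipped at zero.

    The additive Gaussian model is an assumption chosen for illustration.
    """
    rng = rng if rng is not None else random.Random(0)
    return max(0.0, value + rng.gauss(0.0, sigma))


if __name__ == "__main__":
    rng = random.Random(42)
    for level in (0.0, 0.25, 0.5, 1.0):
        clean = hill_activation(level)
        print(level, round(clean, 3), round(noisy_expression(clean, rng=rng), 3))
```

Varying `sigma` or `n` in such a sketch mirrors, in a very simplified way, two of the parameters the study varies: the degree of noise and the interaction type between genes.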