Allen Edward E, Norris James L, John David J, Thomas Stan J, Turkett William H, Fetrow Jacquelyn S
Department of Mathematics, Wake Forest University, Winston-Salem, NC 27109.
Department of Computer Science, Wake Forest University, Winston-Salem, NC 27109.
Proc IEEE Int Symp Bioinformatics Bioeng. 2010 May-Jun;2010:79-85. doi: 10.1109/BIBE.2010.21. Epub 2010 Jul 26.
Multiple approaches for reverse-engineering biological networks from time-series data have been proposed in the computational biology literature. These approaches can be classified by their underlying mathematical algorithms, such as Bayesian or algebraic techniques, and by their time paradigm, which includes next-state and co-temporal modeling. The types of biological relationships these algorithms discover, such as parent-child or sibling relationships, vary considerably. It is therefore important to understand the strengths and weaknesses of the various algorithms and time paradigms on actual experimental data. We assess how well the co-temporal implementations of three algorithms (continuous Bayesian, discrete Bayesian, and computational algebraic) can 1) identify two types of relationships, parent and sibling, between biological entities, 2) deal with sparse experimental time-course data, and 3) handle the experimental noise seen in replicate data sets. Using the shuffle index metric, we evaluate how well the resulting models match literature models in terms of sibling and parent relationships. The results indicate that all three co-temporal algorithms perform well, at a statistically significant level, at finding sibling relationships, but perform relatively poorly at finding parent relationships.
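To make the distinction between parent and sibling relationships concrete, the following is a minimal Python sketch under stated assumptions. It treats a model as a list of directed (parent, child) edges and assumes that two entities are siblings when they share at least one parent; this definition, the toy networks, and the helper names sibling_pairs and recovered_fraction are illustrative and are not taken from the paper. The plain overlap fraction used here is a simple stand-in and is not the shuffle index metric, whose definition is not given in this abstract.

```python
from itertools import combinations


def sibling_pairs(edges):
    """Return unordered pairs of nodes that share at least one parent.

    Assumption: "sibling" means "has a common parent" in the directed model.
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    pairs = set()
    for kids in children.values():
        # sorted() gives each unordered pair one canonical orientation
        pairs.update(combinations(sorted(kids), 2))
    return pairs


def recovered_fraction(inferred, reference):
    """Fraction of reference relationships that also appear in the inferred set."""
    if not reference:
        return 0.0
    return len(inferred & reference) / len(reference)


# Hypothetical toy networks as (parent, child) edges; not data from the paper.
literature = [("A", "B"), ("A", "C"), ("C", "D")]
inferred = [("D", "B"), ("D", "C"), ("A", "C")]

parent_score = recovered_fraction(set(inferred), set(literature))
sibling_score = recovered_fraction(sibling_pairs(inferred), sibling_pairs(literature))

print(f"parent relationships recovered:  {parent_score:.2f}")
print(f"sibling relationships recovered: {sibling_score:.2f}")
```

In this toy case the inferred model groups B and C under the wrong parent, so the sibling pair is still recovered while most parent edges are missed. This only illustrates how the two relationship types can diverge; it says nothing about why the algorithms in the study behave as reported.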