SRI International/Artificial Intelligence Center, 333 Ravenswood Ave, Menlo Park, 94025, USA.
BMC Bioinformatics. 2018 Feb 14;19(1):53. doi: 10.1186/s12859-018-2050-4.
Completion of genome-scale flux-balance models using computational reaction gap-filling is a widely used approach, but its accuracy is not well known.
We report computational experiments on reaction gap-filling in which we generated degraded versions of the EcoCyc-20.0-GEM model by randomly removing flux-carrying reactions from a model that exhibits growth. We gap-filled the degraded models and compared the resulting gap-filled models with the original model. Gap-filling was performed by the Pathway Tools MetaFlux software using its General Development Mode (GenDev) and its Fast Development Mode (FastDev). We explored 12 GenDev variants, formed by combining two solvers (SCIP and CPLEX) for the Mixed Integer Linear Programming (MILP) gap-filling problems, three different sets of linear constraints, and two MILP methods. We compared the 12 GenDev variants and FastDev (13 variants in total) according to accuracy, speed, and amount of information returned to the user.
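The experimental design above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not name the three constraint sets or the two MILP methods, so the labels below are hypothetical placeholders, and `degrade` is a hypothetical helper that stands in for the random removal of flux-carrying reactions (it does not call MetaFlux).

```python
import itertools
import random

SOLVERS = ("SCIP", "CPLEX")
# Hypothetical labels: the abstract does not name the constraint sets or methods.
CONSTRAINT_SETS = ("constraints_A", "constraints_B", "constraints_C")
MILP_METHODS = ("method_1", "method_2")


def gendev_variants():
    """Enumerate every solver x constraint-set x MILP-method combination."""
    return list(itertools.product(SOLVERS, CONSTRAINT_SETS, MILP_METHODS))


def degrade(flux_carrying, n_remove, seed=0):
    """Randomly remove n_remove reactions; return (remaining, removed) sets."""
    rng = random.Random(seed)
    # Sort first so the sample is reproducible for a given seed.
    removed = set(rng.sample(sorted(flux_carrying), n_remove))
    return set(flux_carrying) - removed, removed


# 12 GenDev variants plus FastDev gives the 13 variants that were compared.
all_variants = gendev_variants() + [("FastDev",)]

# Toy reaction identifiers, for illustration only.
remaining, removed = degrade({f"rxn{i}" for i in range(20)}, n_remove=5)
```

Each degraded model would then be handed to every variant in turn, and the reactions each variant adds back would be compared with `removed`.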
Performance varied widely among the 13 gap-filling variants. Although no variant was best in all dimensions, we found one variant that was fast, accurate, and returned more information to the user. Some gap-filling variants were inaccurate, producing solutions that were non-minimum or invalid (did not enable model growth). The best GenDev variant achieved a best average precision of 87% and a best average recall of 61%. FastDev achieved an average precision of 71% and an average recall of 59%. Thus, using the most accurate variant, approximately 13% of the gap-filled reactions were incorrect (were not the reactions removed from the model), and 39% of the removed reactions were not recovered, suggesting that curation is still an important aspect of metabolic-model development.
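The precision and recall figures can be read as set comparisons between the reactions removed from the model and the reactions a gap-filler adds back. The sketch below assumes the standard definitions, which match the parenthetical above but are not spelled out in the abstract; the function name and the reaction identifiers are hypothetical.

```python
def precision_recall(removed, added):
    """Score a gap-filling solution against the reactions that were removed.

    precision = fraction of added reactions that were among those removed
    recall    = fraction of removed reactions that were added back
    """
    removed, added = set(removed), set(added)
    true_positives = len(removed & added)
    precision = true_positives / len(added) if added else 0.0
    recall = true_positives / len(removed) if removed else 0.0
    return precision, recall


# Toy example: 5 reactions removed, 4 added back, 3 of which are correct.
removed = {"rxn1", "rxn2", "rxn3", "rxn4", "rxn5"}
added = {"rxn1", "rxn2", "rxn3", "rxn9"}
precision, recall = precision_recall(removed, added)
# precision is 3/4 = 0.75 and recall is 3/5 = 0.6
```

Under these definitions, a precision of 87% means about 13% of the added reactions were not among those removed, and a recall of 61% means about 39% of the removed reactions were not recovered.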