Efficient parameter estimation for RNA secondary structure prediction.

Authors

Andronescu Mirela, Condon Anne, Hoos Holger H, Mathews David H, Murphy Kevin P

Affiliation

Department of Computer Science, University of British Columbia, Vancouver BC V6T 1Z4, Canada.

Publication

Bioinformatics. 2007 Jul 1;23(13):i19-28. doi: 10.1093/bioinformatics/btm223.

Abstract

MOTIVATION

Accurate prediction of RNA secondary structure from the base sequence is an unsolved computational challenge. The accuracy of predictions made by free energy minimization is limited by the quality of the energy parameters in the underlying free energy model. The most widely used model, the Turner99 model, has hundreds of parameters, and so a robust parameter estimation scheme should efficiently handle large data sets with thousands of structures. Moreover, the estimation scheme should also be trained using available experimental free energy data in addition to structural data.

RESULTS

In this work, we present constraint generation (CG), the first computational approach to RNA free energy parameter estimation that can be efficiently trained on large sets of structural as well as thermodynamic data. Our CG approach employs a novel iterative scheme, whereby the energy values are first computed as the solution to a constrained optimization problem. Then the newly computed energy parameters are used to update the constraints on the optimization function, so as to better optimize the energy parameters in the next iteration. Using our method on biologically sound data, we obtain revised parameters for the Turner99 energy model. We show that by using our new parameters, we obtain significant improvements in prediction accuracy over current state-of-the-art methods.
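
The iterative scheme described above can be illustrated with a toy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: each candidate structure is reduced to a hypothetical feature-count vector, its energy is the dot product of those counts with the parameters, "prediction" is a search over a short candidate list rather than real free energy minimization, and the constrained optimization is replaced by subgradient descent on a penalty form that keeps the parameters close to their starting values. Thermodynamic data, which the abstract says CG can also use, is left out.

```python
# Toy sketch of a constraint-generation loop (illustrative assumptions only).

def energy(theta, feats):
    # Linear energy model: parameters dotted with hypothetical feature counts.
    return sum(t * f for t, f in zip(theta, feats))

def predict(theta, candidates):
    # Stand-in for free energy minimization: lowest-energy candidate wins.
    return min(candidates, key=lambda f: energy(theta, f))

def solve(theta0, constraints, margin=1.0, penalty=10.0, steps=2000, lr=0.01):
    # Penalty form of the constrained problem: stay close to theta0 while
    # pushing energy(known) + margin <= energy(wrong) for each constraint.
    theta = list(theta0)
    for _ in range(steps):
        grad = [2.0 * (t - t0) for t, t0 in zip(theta, theta0)]
        for known, wrong in constraints:
            if energy(theta, known) + margin > energy(theta, wrong):
                for i in range(len(theta)):
                    grad[i] += penalty * (known[i] - wrong[i])
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def constraint_generation(theta0, data, max_iters=20):
    theta, constraints = list(theta0), []
    for _ in range(max_iters):
        # Predict with the current parameters and collect new violations.
        new = []
        for known, candidates in data:
            guess = predict(theta, candidates)
            if guess != known and (known, guess) not in constraints:
                new.append((known, guess))
        if not new:
            break  # every known structure is already the predicted one
        constraints.extend(new)
        # Re-estimate the parameters under the enlarged constraint set.
        theta = solve(theta0, constraints)
    return theta, constraints

# Hypothetical data: one sequence, features = (stacked pairs, loops).
data = [((3, 1), [(3, 1), (1, 0), (0, 0)])]
theta, constraints = constraint_generation([-0.2, 1.0], data)
print(theta, constraints, predict(theta, data[0][1]))
```

With the starting parameters the toy predictor picks the candidate (1, 0) instead of the known structure (3, 1); one round of constraint generation adds the constraint separating the two, and the re-estimated parameters should make the known structure the lowest-energy candidate. A practical version would swap the penalty loop for a proper constrained solver and the candidate list for a dynamic-programming folding algorithm.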

AVAILABILITY

Our CG implementation is available at http://www.rnasoft.ca/CG/.
