Wilm Andreas, Linnenbrink Kornelia, Steger Gerhard
Heinrich-Heine-Universität Düsseldorf, Institut für Physikalische Biologie, Universitätsstr. 1, D-40225 Düsseldorf, Germany.
BMC Bioinformatics. 2008 Apr 28;9:219. doi: 10.1186/1471-2105-9-219.
Aligning homologous non-coding RNAs (ncRNAs) correctly in terms of both sequence and structure is an unresolved problem, owing to mathematical complexity and imperfect scoring functions. High-quality alignments, however, are a prerequisite for most consensus structure prediction approaches, homology searches, and tools for phylogeny inference. Automatically created ncRNA alignments often need manual correction, yet this manual refinement is tedious and error-prone.
We present an extended version of CONSTRUCT, a semi-automatic, graphical tool suitable for creating RNA alignments that are correct in terms of both consensus sequence and consensus structure. For this purpose, CONSTRUCT combines sequence alignment, thermodynamic data, and various measures of covariation. One important feature is that the user is guided during the alignment correction step by a consensus dotplot, which displays all thermodynamically optimal base pairs and the corresponding covariation. Once the initial alignment is corrected, optimal and suboptimal secondary structures as well as tertiary interactions can be predicted. We demonstrate CONSTRUCT's ability to guide the user in correcting an initial alignment, and show an example of optimal secondary consensus structure prediction on SECIS elements, which are very hard to align. Moreover, we use CONSTRUCT to predict tertiary interactions from sequences of the internal ribosome entry site of CrP-like viruses. In addition, we show that alignments specifically designed for benchmarking can easily be optimized using CONSTRUCT, although they share very little sequence identity.
CONSTRUCT's graphical interface allows easy alignment correction guided by predicted and known structural constraints. It combines several algorithms for the prediction of secondary consensus structure and even tertiary interactions. The CONSTRUCT package can be downloaded from the URL listed in the Availability and requirements section of this article.
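The abstract mentions "various measures of covariation" without naming them. As an illustration only, the sketch below computes mutual information between two alignment columns, one widely used covariation measure. It is not taken from CONSTRUCT and is not claimed to be the measure CONSTRUCT uses. The function name `mutual_information`, the gap symbol `-`, and the toy alignment are assumptions made for this example.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j, gap="-"):
    """Mutual information (in bits) between alignment columns i and j.

    Illustrative sketch only: sequences with a gap in either column are
    skipped, and no small-sample correction is applied.
    """
    # Collect the residue pairs observed in columns i and j.
    pairs = [(seq[i], seq[j]) for seq in alignment
             if seq[i] != gap and seq[j] != gap]
    n = len(pairs)
    if n == 0:
        return 0.0

    freq_i = Counter(a for a, _ in pairs)   # single-column counts, column i
    freq_j = Counter(b for _, b in pairs)   # single-column counts, column j
    freq_ij = Counter(pairs)                # joint counts

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 4 always form a complementary
# pair (G-C, C-G, A-U, U-A), while column 2 is fully conserved.
toy = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]

print(mutual_information(toy, 0, 4))  # 2.0 (perfect covariation)
print(mutual_information(toy, 0, 2))  # 0.0 (conserved column carries no signal)
```

A column pair that changes in a coordinated way scores high, while a conserved column scores zero even if it is paired. This is one reason covariation is typically combined with other evidence such as the thermodynamic data mentioned above.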