di Bernardo Diego, Down Thomas, Hubbard Tim
Telethon Institute of Genetics and Medicine, Via P Castellino 111, 80133 Naples, Italy.
Bioinformatics. 2003 Sep 1;19(13):1606-11. doi: 10.1093/bioinformatics/btg229.
Structured non-coding RNAs (ncRNAs) play important functional roles in the cell. No distinctive features common to all ncRNAs have yet been discovered, which makes it difficult to design computational tools that can detect novel ncRNAs in genomic sequence.
We devised an algorithm able to detect conserved secondary structures in both pairwise and multiple DNA sequence alignments with computational time proportional to the square of the sequence length. We implemented the algorithm for the case of pairwise and three-way alignments and tested it on ncRNAs obtained from public databases. On the test sets, the pairwise algorithm has a specificity greater than 97% with a sensitivity varying from 22.26% for Blast alignments to 56.35% for structural alignments. The three-way algorithm behaves similarly. Our algorithm is able to efficiently detect a conserved secondary structure in multiple alignments.
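The abstract does not describe how the algorithm works internally, so the following is only an illustrative sketch of one generic way to look for evidence of conserved secondary structure in a pairwise alignment in quadratic time. It is not the authors' method. It scans every pair of alignment columns and counts those where both sequences can form a base pair but with different nucleotides, a pattern often called a compensatory change. The function name `count_compensatory_pairs`, the `min_loop` parameter, and the inclusion of G-U wobble pairs are assumptions made for this example.

```python
# Illustrative sketch only: NOT the published algorithm, whose details
# the abstract does not give. All names and parameters are hypothetical.

# Watson-Crick pairs plus G-U wobble pairs (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_compensatory_pairs(seq_a: str, seq_b: str, min_loop: int = 3) -> int:
    """Count column pairs (i, j) of a pairwise alignment where both
    sequences can base-pair but with different nucleotides.

    The double loop over columns makes the running time proportional to
    the square of the alignment length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    # Treat DNA input as RNA so that T pairs like U.
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")

    n = len(a)
    count = 0
    for i in range(n):
        # Require at least `min_loop` unpaired columns between i and j.
        for j in range(i + min_loop + 1, n):
            pair_a = (a[i], a[j])
            pair_b = (b[i], b[j])
            # Gap characters never appear in PAIRS, so gapped columns are skipped.
            if pair_a in PAIRS and pair_b in PAIRS and pair_a != pair_b:
                count += 1
    return count


if __name__ == "__main__":
    # Toy alignment: column 1 changes G->C and column 8 changes C->G,
    # so the pairing between those two columns is preserved.
    print(count_compensatory_pairs("GGGAAAACCC", "GCGAAAACGC"))
```

A raw count like this would still need a background model or threshold before it could be used to classify an alignment; the abstract does not say how the published method makes that decision.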