Akmaev V R, Kelley S T, Stormo G D
Dept. of Molecular, Cellular and Developmental Biology, University of Colorado, Boulder 80309-0347, USA.
Proc Int Conf Intell Syst Mol Biol. 1999:10-7.
Methods based on the Mutual Information statistic (MI methods) predict structure by looking for statistical correlations between sequence positions in a set of aligned sequences. Although MI methods are often quite effective, they ignore the underlying phylogenetic relationships of the sequences they analyze. Thus, they cannot distinguish between correlations due to structural interactions and spurious correlations resulting from phylogenetic history. In this paper, we introduce a method analogous to MI that incorporates phylogenetic information. We show that this method accurately recovers the structures of well-known RNA molecules. We also demonstrate, with both real and simulated data, that this phylogenetically based method outperforms standard MI methods and improves the ability to distinguish interacting from non-interacting positions in RNA. This method is flexible and may be applied to the prediction of protein structure given the appropriate evolutionary model. Because this method incorporates phylogenetic data, it also has the potential to be improved with the addition of more accurate phylogenetic information, although we show that even approximate phylogenies are helpful.
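For readers unfamiliar with the baseline, the standard MI statistic between two alignment columns can be sketched as below. This is a minimal illustration of conventional MI only, not the phylogenetically informed method the paper introduces. The function name mutual_information and the four-sequence toy alignment are hypothetical and chosen for illustration, and the sketch assumes equal-length, gap-free sequences.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Standard MI (in bits) between columns i and j of an alignment.

    Assumption: `alignment` is a list of equal-length, gap-free strings.
    Every sequence is treated as an independent observation, which is
    exactly the simplification that ignores phylogenetic history.
    """
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)

    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        # p(a,b) / (p(a) * p(b)) simplifies to n_ab * n / (n_a * n_b)
        ratio = n_ab * n / (count_i[a] * count_j[b])
        mi += (n_ab / n) * log2(ratio)
    return mi


# Hypothetical toy alignment: columns 0 and 2 covary as Watson-Crick
# pairs (G-C, C-G, A-U, U-A); column 1 is invariant.
alignment = ["GAC", "CAG", "AAU", "UAA"]

print(mutual_information(alignment, 0, 2))  # covarying columns
print(mutual_information(alignment, 0, 1))  # invariant column
```

In this toy case the covarying pair scores 2 bits and the pair involving the invariant column scores 0. Because each sequence counts as an independent sample, closely related sequences that share a pair by common descent would raise the score in the same way as a genuine structural constraint, which is the limitation the abstract describes.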