Simossis V A, Kleinjung J, Heringa J
Bioinformatics Section, Faculty of Sciences, Vrije Universiteit, De Boelelaan 1081A, 1081 HV Amsterdam, The Netherlands.
Nucleic Acids Res. 2005 Feb 7;33(3):816-24. doi: 10.1093/nar/gki233. Print 2005.
We present a profile-profile multiple alignment strategy that uses database searching to collect homologues for each sequence in a given set, in order to enrich their available evolutionary information for the alignment. For each of the alignment sequences, the putative homologous sequences that score above a pre-defined threshold are incorporated into a position-specific pre-alignment profile. The enriched position-specific profile is used for standard progressive alignment, thereby more accurately describing the characteristic features of the given sequence set. We show that owing to the incorporation of the pre-alignment information into a standard progressive multiple alignment routine, the alignment quality between distant sequences increases significantly and outperforms state-of-the-art methods, such as T-COFFEE and MUSCLE. We also show that although entirely sequence-based, our novel strategy is better at aligning distant sequences when compared with a recent contact-based alignment method. Therefore, our pre-alignment profile strategy should be advantageous for applications that rely on high alignment accuracy such as local structure prediction, comparative modelling and threading.
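The pre-alignment step described above can be illustrated with a minimal sketch. It is not the authors' implementation; several details are assumptions made for illustration. The database hits are assumed to arrive already aligned to the query and gapped to its length. The profile is assumed to be a plain residue-frequency table per position. The threshold is taken as a lower bound on an arbitrary similarity score. The function names `build_prealignment_profile` and `column_score`, and the identity-based column score, are hypothetical.

```python
from collections import Counter

GAP = "-"


def build_prealignment_profile(query, hits, threshold):
    """Build a position-specific frequency profile for one query sequence.

    query     : ungapped sequence string.
    hits      : iterable of (aligned_seq, score) pairs; each aligned_seq is
                assumed to be gapped to the query length (hypothetical format).
    threshold : hits scoring at or below this value are discarded (assumption:
                higher score means more similar).
    Returns a list with one {residue: frequency} dict per query position.
    """
    kept = []
    for seq, score in hits:
        if len(seq) != len(query):
            raise ValueError("each hit must be aligned to the query length")
        if score > threshold:
            kept.append(seq)

    # The query itself always contributes, so no column is ever empty.
    rows = [query] + kept
    profile = []
    for i in range(len(query)):
        counts = Counter(row[i] for row in rows if row[i] != GAP)
        total = sum(counts.values())
        profile.append({res: n / total for res, n in counts.items()})
    return profile


def column_score(col_a, col_b):
    """Toy profile-profile column score: probability that two residues drawn
    from the two columns are identical. A real method would weight residue
    pairs with a substitution matrix instead of this identity scoring."""
    return sum(freq * col_b.get(res, 0.0) for res, freq in col_a.items())


# Usage: the third hit falls below the threshold and is ignored.
profile = build_prealignment_profile(
    "ACD",
    [("ACE", 0.9), ("A-D", 0.8), ("GGG", 0.1)],
    threshold=0.5,
)
```

In this sketch the enriched profile, rather than the single sequence, would be handed to the progressive alignment stage, with columns of two profiles compared by a score such as `column_score`.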