Parametric and non-parametric masking of randomness in sequence alignments can be improved and leads to better resolved trees.

Affiliation

Zoologisches Forschungsmuseum A. Koenig, Adenauerallee 160, 53113 Bonn, Germany.

Publication information

Front Zool. 2010 Mar 31;7:10. doi: 10.1186/1742-9994-7-10.

Abstract

BACKGROUND

Alignment masking, the technique of excluding alignment blocks prior to tree reconstruction, has been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well-defined methods to identify randomness in sequence alignments has prevented routine application of alignment masking. In this study, using different data sets and alignment methods, we compared the effects on tree reconstruction of the most commonly used profiling method (GBLOCKS), which uses a predefined set of rules in combination with alignment masking, with those of a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window. While the GBLOCKS approach excludes variable sections above a certain threshold, the choice of which is left arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and therefore more objective.
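To make the idea of Monte Carlo resampling within a sliding window concrete, the following is a minimal sketch for a single pair of aligned nucleotide sequences. It is not the ALISCORE implementation: the identity-based window score, the 95% cutoff, the window size, and the function names (window_identity, random_identity_cutoff, profile_pair) are all assumptions chosen for illustration. Each window is flagged +1 if its observed identity exceeds what randomly drawn sequences of the same composition reach by chance, and -1 otherwise.

```python
import random


def window_identity(a, b, start, size):
    """Fraction of identical, non-gap positions between a and b in one window."""
    pairs = zip(a[start:start + size], b[start:start + size])
    return sum(x == y and x != "-" for x, y in pairs) / size


def random_identity_cutoff(a, b, size, n_resamples, rng, quantile=0.95):
    """Monte Carlo cutoff: the identity that random windows, drawn from the
    residue composition of a and b, stay at or below in `quantile` of draws.
    Assumes each sequence contains at least one non-gap character."""
    pool_a = [c for c in a if c != "-"]
    pool_b = [c for c in b if c != "-"]
    scores = []
    for _ in range(n_resamples):
        ra = [rng.choice(pool_a) for _ in range(size)]
        rb = [rng.choice(pool_b) for _ in range(size)]
        scores.append(sum(x == y for x, y in zip(ra, rb)) / size)
    scores.sort()
    return scores[min(len(scores) - 1, int(quantile * len(scores)))]


def profile_pair(a, b, size=6, n_resamples=200, seed=0):
    """Slide a window along the pair; +1 marks windows more similar than
    expected by chance, -1 marks windows indistinguishable from random."""
    rng = random.Random(seed)
    cutoff = random_identity_cutoff(a, b, size, n_resamples, rng)
    return [1 if window_identity(a, b, s, size) > cutoff else -1
            for s in range(len(a) - size + 1)]
```

In a multiple sequence alignment, such pairwise profiles would have to be combined across sequence pairs before deciding which columns to exclude; how that is done here is left open, since the abstract does not specify it.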

RESULTS

ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling yields an even distribution of scores for randomly similar sequences, against which the randomness of the observed sequence similarity is assessed. When tested on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. In addition, ALISCORE is capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the greatest decrease in conflict for these settings.

CONCLUSIONS

Alignment masking improves the signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction. Given the robust performance of alignment profiling, alignment masking should routinely be used to improve tree reconstructions. Parametric methods of alignment profiling can easily be extended to more complex likelihood-based models of sequence evolution, which opens the possibility of further improvements.
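The masking step itself, excluding flagged columns before tree reconstruction, can be sketched in a few lines. This is an illustrative helper, not code from GBLOCKS or ALISCORE: the function name mask_alignment, the dictionary representation of the alignment, and the boolean keep list are assumptions. The keep list is taken as given and would come from whichever profiling method is used.

```python
def mask_alignment(alignment, keep):
    """Return a new alignment holding only the columns whose `keep` flag is True.

    alignment: dict mapping taxon name to its aligned sequence (equal lengths).
    keep: list of booleans, one per alignment column.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if lengths != {len(keep)}:
        raise ValueError("all sequences and the keep list must have equal length")
    return {name: "".join(c for c, k in zip(seq, keep) if k)
            for name, seq in alignment.items()}


# Hypothetical toy input: columns 3 and 5 were judged random and are dropped.
toy = {"taxon1": "ACGT-A", "taxon2": "ACTTGA"}
masked = mask_alignment(toy, [True, True, False, True, False, True])
```

The masked alignment keeps the original column order, so it can be passed to a tree reconstruction program in place of the unmasked one.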

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9516/2867768/7dab7286e05b/1742-9994-7-10-1.jpg
