PASTA with many application-aware optimization criteria for alignment-based phylogeny inference.

Affiliation

Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.

Publication information

Comput Biol Chem. 2022 Jun;98:107661. doi: 10.1016/j.compbiolchem.2022.107661. Epub 2022 Mar 14.

Abstract

Multiple sequence alignment (MSA) is a prerequisite for several analyses in bioinformatics, such as phylogeny estimation and protein structure prediction. PASTA (Practical Alignments using SATé and TrAnsitivity) is a state-of-the-art method for computing MSAs, well known for its accuracy and scalability. It iteratively co-estimates an MSA and a maximum likelihood (ML) phylogenetic tree, exploiting the close association between the accuracy of an MSA and that of the corresponding tree by refining each from the other over multiple iterations. Currently, PASTA uses the ML score as its optimization criterion. The ML score is a good criterion for phylogeny estimation, but it cannot be proven to be necessary and sufficient for producing an accurate phylogenetic tree. Integrating multiple application-aware objectives into PASTA, carefully chosen for their better association with tree accuracy, may therefore have a profound positive impact on its performance. This paper employs four application-aware objectives alongside the ML score to develop a multi-objective (MO) framework, PMAO, that leverages PASTA to generate a set of high-quality solutions that are considered equivalent in the context of the conflicting objectives under consideration. Our experimental analysis on a popular biological benchmark reveals that the tree-space generated by PMAO contains significantly better trees than those produced by stand-alone PASTA. To further help domain experts choose the most appropriate tree from the PMAO output (which contains a relatively large set of high-quality solutions), we have added a component to the PMAO framework that generates a smaller set of high-quality solutions. Finally, we have attempted to obtain a single high-quality solution without using any external evidence and have found that summarizing the few solutions detected through this component can serve this purpose to some extent.
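
The notion of solutions being "equivalent in the context of conflicting objectives" is commonly formalized as Pareto non-dominance. The abstract does not name the selection rule PMAO uses, so the following is only a minimal sketch of that general idea, not the authors' method. The function names, the two objectives, and the scores are hypothetical, and both objectives are assumed to be maximized.

```python
from typing import Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(scores: list[Sequence[float]]) -> list[int]:
    """Return the indices of solutions that no other solution dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical candidate trees scored as (ML score, second objective).
# These numbers are illustrative only and do not come from the paper.
candidates = [(-1000.0, 0.80), (-990.0, 0.75), (-1010.0, 0.70), (-995.0, 0.82)]
front = non_dominated(candidates)
print(front)
```

Here the second and fourth candidates remain: each is better than the other on one objective and worse on the other, so neither can be preferred without additional information.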

