Suppr超能文献

PartTree:一种从大量未比对序列构建近似树的算法。

PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences.

作者信息

Katoh Kazutaka, Toh Hiroyuki

机构信息

Digital Medicine Initiative, Kyushu University, Fukuoka 812-8582, Japan.

出版信息

Bioinformatics. 2007 Feb 1;23(3):372-4. doi: 10.1093/bioinformatics/btl592. Epub 2006 Nov 21.

Abstract

MOTIVATION

To construct a multiple sequence alignment (MSA) of a large number (> approximately 10,000) of sequences, the calculation of a guide tree with a complexity of O(N2) to O(N3), where N is the number of sequences, is the most time-consuming process.

RESULTS

To overcome this limitation, we have developed an approximate algorithm, PartTree, to construct a guide tree with an average time complexity of O(N log N). The new MSA method with the PartTree algorithm can align approximately 60,000 sequences in several minutes on a standard desktop computer. The loss of accuracy in MSA caused by this approximation was estimated to be several percent in benchmark tests using Pfam.

AVAILABILITY

The present algorithm has been implemented in the MAFFT sequence alignment package (http://align.bmr.kyushu-u.ac.jp/mafft/software/).

SUPPLEMENTARY INFORMATION

Supplementary information is available at Bioinformatics online.

摘要

动机

为了构建大量(>约10,000)序列的多序列比对(MSA),计算复杂度为O(N2)到O(N3)(其中N是序列数量)的引导树是最耗时的过程。

结果

为克服这一限制,我们开发了一种近似算法PartTree,用于构建平均时间复杂度为O(N log N)的引导树。采用PartTree算法的新MSA方法在标准台式计算机上几分钟内就能比对约60,000个序列。在使用Pfam进行的基准测试中,这种近似导致的MSA准确性损失估计为百分之几。

可用性

当前算法已在MAFFT序列比对软件包(http://align.bmr.kyushu-u.ac.jp/mafft/software/)中实现。

补充信息

补充信息可在《生物信息学》在线获取。

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验