Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University Poznan, Uniwersytetu Poznańskiego 6, 61-614, Poznan, Poland.
Tandy School of Computer Science, The University of Tulsa, 800 South Tucker Drive, Tulsa, OK, 74104, USA.
Genome Biol. 2019 Jul 25;20(1):144. doi: 10.1186/s13059-019-1755-7.
Alignment-free (AF) sequence comparison is attracting persistent interest, driven by data-intensive applications. Many AF procedures have therefore been proposed in recent years, but the lack of a clearly defined benchmarking consensus hampers assessment of their performance.
Here, we present a community resource (http://afproject.org) to establish standards for comparing alignment-free approaches across different areas of sequence-based research. We characterize 74 AF methods available in 24 software tools for five research applications, namely, protein sequence classification, gene tree inference, regulatory element detection, genome-based phylogenetic inference, and reconstruction of species trees under horizontal gene transfer and recombination events.
The interactive web service allows researchers to explore the performance of alignment-free tools relevant to their data types and analytical goals. It also allows method developers to assess their own algorithms and compare them with current state-of-the-art tools, accelerating the development of new, more accurate AF solutions.