Department of Computer Science, University of California Irvine, Irvine, CA, USA.
Bioinformatics. 2019 Dec 15;35(24):5363-5364. doi: 10.1093/bioinformatics/btz603.
BLAST creates local sequence alignments by first building a database of small k-letter sub-sequences called k-mers. Identical k-mers from different regions provide 'seeds' for longer local alignments. This seed-and-extend heuristic makes BLAST extremely fast and has led to its almost exclusive use despite the existence of more accurate, but slower, algorithms. In this paper, we introduce the Basic Local Alignment for Networks Tool (BLANT). BLANT is the analog of BLAST, but for networks: given an input graph, it samples small, induced, k-node sub-graphs called k-graphlets. Graphlets have been used to classify networks, quantify structure, align networks both locally and globally, identify topology-function relationships and build taxonomic trees without the use of sequences. Given an input network, BLANT produces millions of graphlet samples in seconds, orders of magnitude faster than existing methods. BLANT offers sampled graphlets in various forms: distributions of graphlets or their orbits; graphlet degree or graphlet orbit degree vectors, the latter being compatible with ORCA; or an index to be used as the basis for seed-and-extend local alignments. We demonstrate BLANT's usefulness by using its indexing mode to find functional similarity between yeast and human PPI networks.
BLANT is written in C and is available at https://github.com/waynebhayes/BLANT/releases.
Supplementary data are available at Bioinformatics online.
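To make the idea of sampling induced k-node sub-graphs concrete, the following is a minimal Python sketch. It is not BLANT's C implementation, and its sampling scheme is not taken from the paper. It assumes a simple node-expansion scheme: start at a random node, then repeatedly add a random neighbour of the current node set. This scheme does not sample graphlets uniformly. The function names (`sample_graphlet`, `signature`, `graphlet_distribution`) are hypothetical, and the sorted degree sequence is used as a graphlet label only because it distinguishes all connected graphlets for k ≤ 4; larger k would need a proper canonical form.

```python
import random
from collections import Counter


def sample_graphlet(adj, k, rng):
    """Sample one connected induced k-node sub-graph by node expansion.

    adj maps each node to the set of its neighbours (undirected, no self-loops).
    Returns a frozenset of k nodes, or None if the start node's connected
    component has fewer than k nodes.
    """
    start = rng.choice(sorted(adj))
    nodes = {start}
    frontier = set(adj[start]) - nodes
    while len(nodes) < k:
        if not frontier:
            return None  # component too small to hold a k-graphlet
        nxt = rng.choice(sorted(frontier))
        nodes.add(nxt)
        # Every frontier node is adjacent to the set, so the sample stays connected.
        frontier |= adj[nxt]
        frontier -= nodes
    return frozenset(nodes)


def signature(adj, nodes):
    """Sorted degree sequence of the sub-graph induced on `nodes`.

    Assumption of this sketch: used as a graphlet label for k <= 4 only.
    """
    return tuple(sorted(len(adj[u] & nodes) for u in nodes))


def graphlet_distribution(adj, k, n_samples, seed=0):
    """Count how often each graphlet label appears among n_samples samples."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        nodes = sample_graphlet(adj, k, rng)
        if nodes is not None:
            counts[signature(adj, nodes)] += 1
    return counts


if __name__ == "__main__":
    # Toy network: a triangle 0-1-2 with a tail edge 2-3.
    toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(graphlet_distribution(toy, 3, 1000))
```

On the toy network, the only labels that can appear for k = 3 are `(1, 1, 2)` (a path) and `(2, 2, 2)` (a triangle). The resulting counts correspond loosely to the "distribution of graphlets" output form mentioned in the abstract, subject to the sampling bias noted above.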