Li Weixi, Rehmeyer Cathryn J, Staben Chuck, Farman Mark L
Department of Biological Sciences, University of Kentucky, Lexington, 40546, USA.
Bioinformatics. 2005 May 1;21(9):2097-8. doi: 10.1093/bioinformatics/bti257. Epub 2005 Jan 25.
BLAST is a widely used alignment tool for detecting matches between a query sequence and entries in nucleotide sequence databases. Matches (high-scoring segment pairs, HSPs) are assigned a score based on alignment length and quality and, by default, are reported with the top-scoring matches listed first. For certain types of searches, however, this ordering is not optimal. This is particularly true when searching a genome sequence with a query that was derived from the same genome, or from a closely related one. If the genome is complex and the assembly is far from complete, correct matches are often relegated to low positions in the results, where they are easily overlooked. To rectify this problem, we developed TruMatch, a program that parses standard BLAST output and identifies HSPs involving query segments that have unique matches to the assembly. Candidates for bona fide matches between a query sequence and a genome assembly are listed at the top of the TruMatch output.
TruMatch is written in Perl and is freely available to non-commercial users for download at http://genome.kbrin.uky.edu/fungi_tel/TruMatch/
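The idea of promoting HSPs whose query segments match the assembly uniquely can be illustrated with a minimal sketch. This is not TruMatch's Perl implementation, and the abstract does not specify its exact criteria. The `HSP` record, the helper names `longest_unique_run` and `rank_hsps`, the `min_unique` threshold of 20 bases, and the contig names and scores are all assumptions made for illustration. The sketch treats a query stretch as "unique" when no other HSP covers it, and lists HSPs that contain a sufficiently long unique stretch ahead of the rest.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HSP:
    """Hypothetical, simplified HSP record (query coordinates are 1-based, inclusive)."""
    subject: str
    q_start: int
    q_end: int
    score: float


def longest_unique_run(hsp: HSP, hsps: List[HSP]) -> int:
    """Length of the longest stretch of hsp's query interval covered by no other HSP."""
    # Clip every other overlapping HSP to this HSP's query interval.
    others = sorted(
        (max(o.q_start, hsp.q_start), min(o.q_end, hsp.q_end))
        for o in hsps
        if o is not hsp and o.q_end >= hsp.q_start and o.q_start <= hsp.q_end
    )
    best = 0
    cursor = hsp.q_start  # first query position not yet covered by another HSP
    for start, end in others:
        if start > cursor:
            best = max(best, start - cursor)  # uncovered gap before this interval
        cursor = max(cursor, end + 1)
    if cursor <= hsp.q_end:
        best = max(best, hsp.q_end - cursor + 1)  # uncovered tail
    return best


def rank_hsps(hsps: List[HSP], min_unique: int = 20) -> List[Tuple[int, HSP]]:
    """Return (unique_length, hsp) pairs: HSPs with a unique segment first, then by score."""
    scored = [(longest_unique_run(h, hsps), h) for h in hsps]
    return sorted(scored, key=lambda t: (t[0] < min_unique, -t[1].score))


# Toy data: two high-scoring repeat hits share query bases 1-600, while a
# lower-scoring hit also covers bases 601-1000, which nothing else matches.
hsps = [
    HSP("contig7", 1, 600, 900.0),
    HSP("contig3", 1, 600, 880.0),
    HSP("contig12", 550, 1000, 700.0),
]
ranked = rank_hsps(hsps)
for unique_len, h in ranked:
    print(h.subject, h.q_start, h.q_end, h.score, unique_len)
```

In this toy input the `contig12` hit is listed first despite having the lowest score, because it is the only HSP containing query bases that match nowhere else. A real parser would also need to read BLAST output and would probably weigh alignment identity as well, neither of which is attempted here.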