Pearson, William R.
University of Virginia School of Medicine, Charlottesville, Virginia.
Curr Protoc Bioinformatics. 2016 Mar 24;53:3.9.1-3.9.25. doi: 10.1002/0471250953.bi0309s53.
The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal local and global similarity searches (ssearch36, ggsearch36) and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity. The FASTA programs can produce "BLAST-like" alignment and tabular output for ease of integration into existing analysis pipelines. They can also search small, representative databases and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including MySQL and PostgreSQL databases. The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons.
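As an illustration of the pipeline integration the abstract mentions, the following is a minimal Python sketch of how "BLAST-like" tabular output might be consumed downstream. It assumes the output follows the conventional 12-column, tab-separated BLAST tabular layout (query, subject, percent identity, alignment length, mismatches, gap openings, query start/end, subject start/end, E-value, bit score). The names `Hit`, `parse_blast_tabular`, and `significant_hits`, the 0.001 E-value cutoff, and the sample rows are hypothetical and are not taken from the FASTA package or from real search results.

```python
import io
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    """One row of 12-column tabular output (hypothetical container)."""
    query: str
    subject: str
    percent_identity: float
    alignment_length: int
    evalue: float
    bit_score: float


def parse_blast_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield a Hit for each data row, skipping blanks and '#' comment lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            # Assumption: rows with fewer than 12 columns are not hit rows.
            continue
        yield Hit(
            query=fields[0],
            subject=fields[1],
            percent_identity=float(fields[2]),
            alignment_length=int(fields[3]),
            evalue=float(fields[10]),
            bit_score=float(fields[11]),
        )


def significant_hits(lines: Iterable[str], max_evalue: float = 1e-3) -> List[Hit]:
    """Return hits whose E-value is at or below max_evalue (assumed cutoff)."""
    return [hit for hit in parse_blast_tabular(lines) if hit.evalue <= max_evalue]


# Made-up sample rows, for illustration only.
SAMPLE = (
    "# illustrative rows, not real search output\n"
    "query1\tsubjA\t45.2\t210\t100\t5\t1\t205\t3\t212\t2.1e-30\t118.0\n"
    "query1\tsubjB\t24.0\t80\t55\t3\t10\t85\t20\t99\t0.5\t28.1\n"
)

hits = significant_hits(io.StringIO(SAMPLE))
for hit in hits:
    print(hit.query, hit.subject, hit.evalue, hit.bit_score)
```

In practice the sample string would be replaced by an open file handle on the saved search output, and the cutoff chosen to suit the database size and the goals of the search.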