Landès C, Risler J L
Centre de Génétique Moléculaire du CNRS, Gif sur Yvette, France.
Comput Appl Biosci. 1994 Jul;10(4):453-4. doi: 10.1093/bioinformatics/10.4.453.
Fast sequence databank search algorithms generally use hash tables and look for exactly matching words. For proteins, sensitivity can be increased, at the expense of selectivity, by using a reduced amino acid alphabet. We propose here an alphabet reduced to 10 symbols, which we used in modified versions of the FASTP and SCAN programs. An application to the aminoacyl-tRNA synthetases shows that this technique may be useful in detecting distant relationships between proteins.
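The idea of hashing words over a reduced alphabet can be illustrated with a minimal Python sketch. The abstract does not list the 10 symbols, so the grouping `GROUPS` below is a hypothetical partition chosen only for illustration and is not the authors' alphabet. The function names (`reduce_sequence`, `index_words`, `shared_words`) and the word length `k=3` are likewise assumptions, and this is not the FASTP or SCAN implementation.

```python
from collections import defaultdict

# HYPOTHETICAL 10-group partition of the 20 amino acids (illustration only;
# NOT the alphabet proposed in the paper, which the abstract does not give).
GROUPS = ["AG", "ST", "C", "P", "DE", "NQ", "KR", "H", "ILMV", "FWY"]

# Map each residue to a one-character symbol '0'..'9' naming its group.
REDUCE = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet ('X' = unknown)."""
    return "".join(REDUCE.get(aa, "X") for aa in seq.upper())


def index_words(seq, k):
    """Hash table: word of length k -> list of start positions in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if "X" not in word:  # skip words containing unknown residues
            table[word].append(i)
    return table


def shared_words(query, target, k=3, reduced=False):
    """Return (query_pos, target_pos) pairs of exactly matching k-words."""
    if reduced:
        query, target = reduce_sequence(query), reduce_sequence(target)
    table = index_words(query, k)
    hits = []
    for j in range(len(target) - k + 1):
        for i in table.get(target[j:j + k], ()):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    q, t = "MKVLDE", "LRIMEE"  # toy sequences with no identical 3-residue word
    print(shared_words(q, t))                # [] with the full alphabet
    print(shared_words(q, t, reduced=True))  # [(0, 0), (1, 1), (2, 2), (3, 3)]
```

With the full 20-letter alphabet the two toy sequences share no word, while under the assumed grouping every word matches. This shows the gain in sensitivity, and also why selectivity drops: unrelated sequences will collide more often as well.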