Department of Bioinformatics, Genentech, Inc., South San Francisco, California, United States of America.
PLoS One. 2009 Dec 14;4(12):e8311. doi: 10.1371/journal.pone.0008311.
Advances in sequencing technologies have empowered recent efforts to identify polymorphisms and mutations on a global scale. The large number of variations and mutations found in these projects requires high-throughput tools to identify those most likely to affect function. Numerous computational tools exist for predicting which mutations are likely to be functional, but none specifically attempts to identify mutations that result in hyperactivation or gain-of-function. Here we present a modified version of the SIFT (Sorting Intolerant from Tolerant) algorithm that uses alignments of a protein with its homologous sequences to identify functional mutations based on evolutionary fitness. We show that this bi-directional SIFT (B-SIFT) is capable of identifying experimentally verified activating mutants from multiple datasets. B-SIFT analysis of large-scale cancer genotyping data identified potential activating mutations, for some of which we provide detailed supporting structural evidence. B-SIFT could prove to be a valuable tool for protein engineering efforts as well as for the identification of functional mutations in cancer.
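The abstract does not give the scoring formula, so the following is only a minimal sketch of how a bi-directional, alignment-based score might work. It assumes the score is the mutant residue's conservation score minus the wild-type residue's score at the same alignment column, and it substitutes simple normalized residue frequencies for the real SIFT probabilities, which are computed differently. The function names `column_scores` and `bidirectional_score` and the toy alignment are hypothetical.

```python
from collections import Counter


def column_scores(column):
    """Normalized residue scores for one alignment column.

    Assumption: each residue's count is divided by the count of the most
    common residue, so the most frequent residue scores 1.0. Gaps ("-")
    are ignored. This is a stand-in for SIFT's actual probabilities.
    """
    counts = Counter(r for r in column if r != "-")
    if not counts:
        return {}
    top = max(counts.values())
    return {res: n / top for res, n in counts.items()}


def bidirectional_score(alignment, position, wild_type, mutant):
    """Mutant score minus wild-type score at a 0-based alignment position.

    Under the assumed definition, a positive value means the mutant residue
    is more common among homologs than the wild-type residue (a candidate
    activating change), and a negative value means it is less common.
    Residues never seen in the column score 0.0.
    """
    column = [seq[position] for seq in alignment]
    scores = column_scores(column)
    return scores.get(mutant, 0.0) - scores.get(wild_type, 0.0)


# Toy alignment (hypothetical data): the first row is the query protein.
alignment = [
    "MKVLA",
    "MRVLA",
    "MRVIA",
    "MRVLG",
]

# K -> R at position 1 moves toward the residue favored by the homologs.
print(bidirectional_score(alignment, 1, "K", "R"))
# The reverse substitution moves away from it and gets the opposite sign.
print(bidirectional_score(alignment, 1, "R", "K"))
```

A one-directional score would only flag the second substitution as unfavorable. Taking the difference is what lets the first one stand out as a change toward the residue the homologs prefer.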