Program in Molecular and Translational Medicine, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111, USA.
Proteins. 2010 Jul;78(9):2058-74. doi: 10.1002/prot.22722.
Predicting the phenotypes of missense mutations uncovered by large-scale sequencing projects is an important goal in computational biology. High-confidence predictions can help focus experimental and association studies on the mutations most likely to be causally related to disease. As an aid in developing these methods further, we have derived a set of random mutations of the enzymatic domains of human cystathionine beta-synthase (CBS). This enzyme is a dimeric protein that catalyzes the condensation of serine and homocysteine to produce cystathionine. Yeast missing this enzyme cannot grow on medium lacking a source of cysteine, while transfection of functional human CBS into yeast strains missing the endogenous enzyme can successfully complement the missing gene. We used PCR mutagenesis with error-prone Taq polymerase to produce 948 colonies and compared cell growth in the presence or absence of a cysteine source as a measure of CBS function. We were able to infer the phenotypes of 204 single-site mutants, 79 of them deleterious and 125 neutral. This set was used to test the accuracy of six publicly available methods for predicting the phenotypes of missense mutations: SIFT, PolyPhen, PMut, SNPs3D, PhD-SNP, and nsSNPAnalyzer. The top methods are PolyPhen, SIFT, and nsSNPAnalyzer, which have similar performance. Using kernel discriminant functions, we found that the difference in position-specific scoring matrix (PSSM) values between the wild-type and mutant residues is more predictive than the wild-type PSSM score alone, and that the relative surface area in the biologically relevant complex is more predictive than that in the monomeric protein.
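Because the benchmark set is unbalanced (79 deleterious versus 125 neutral mutants), raw accuracy alone can flatter a predictor that favors the neutral class. The sketch below shows one common way such a comparison could be scored, using sensitivity, specificity, balanced accuracy, and the Matthews correlation coefficient. The abstract does not state which measures the study used, so the choice of metrics, the function name `confusion_metrics`, and the example confusion counts are all assumptions made for illustration; only the class sizes come from the text.

```python
import math

def confusion_metrics(tp, fn, tn, fp):
    """Summarize a binary deleterious/neutral prediction from confusion counts.

    tp: deleterious mutants predicted deleterious
    fn: deleterious mutants predicted neutral
    tn: neutral mutants predicted neutral
    fp: neutral mutants predicted deleterious
    Assumes both classes are non-empty.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the table is empty;
    # returning 0.0 in that case is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }

# Class sizes reported in the abstract.
N_DELETERIOUS = 79
N_NEUTRAL = 125

# Hypothetical predictor output (illustrative numbers, NOT results from the study).
tp, tn = 60, 100
metrics = confusion_metrics(tp, N_DELETERIOUS - tp, tn, N_NEUTRAL - tn)

# A trivial predictor that calls every mutant neutral.
all_neutral = confusion_metrics(0, N_DELETERIOUS, N_NEUTRAL, 0)

for name, m in (("hypothetical", metrics), ("all-neutral", all_neutral)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

The all-neutral baseline reaches an accuracy of 125/204 on this set while its balanced accuracy is 0.5 and its MCC is 0, which is why class-balanced measures are a reasonable choice when comparing methods on data like these.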