Department of Plant and Microbial Biology, University of California, Berkeley, California.
Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia.
Hum Mutat. 2019 Sep;40(9):1530-1545. doi: 10.1002/humu.23868. Epub 2019 Sep 3.
Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine. Computational predictions can lead to a better understanding of the mechanisms underlying genetic diseases, including cancer, but their adoption requires thorough and unbiased assessment. Cystathionine-beta-synthase (CBS) is an enzyme that catalyzes the first step of the transsulfuration pathway, the conversion of homocysteine to cystathionine; variants in CBS are associated with human hyperhomocysteinemia and homocystinuria. We have created a computational challenge under the CAGI framework to evaluate how well different methods can predict the phenotypic effect(s) of CBS single amino acid substitutions using a blinded experimental data set. CAGI participants were asked to predict yeast growth based on the identity of the mutations. The performance of the methods was evaluated using several metrics. The CBS challenge highlighted the difficulty of predicting the phenotype of an ex vivo system in a model organism when classification models were trained on human disease data. We also discuss the variations in difficulty of prediction for known benign and deleterious variants, as well as identify methodological and experimental constraints with lessons to be learned for future challenges.