Dias Mafalda, Orenbuch Rose, Marks Debora S, Frazer Jonathan
Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; University Pompeu Fabra, Barcelona, Spain.
Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
Am J Hum Genet. 2024 Dec 5;111(12):2589-2593. doi: 10.1016/j.ajhg.2024.10.011. Epub 2024 Nov 18.
There has been considerable progress in building models to predict the effect of missense substitutions in protein-coding genes, fueled in large part by progress in applying deep learning methods to sequence data. These models have the potential to enable clinical variant annotation on a large scale and hence increase the impact of patient sequencing in guiding diagnosis and treatment. To realize this potential, it is essential to provide reliable assessments of model performance, scope of applicability, and robustness. In response to this need, the ClinGen Sequence Variant Interpretation Working Group (Pejaver et al.) recently proposed a strategy for validating and calibrating in silico predictions in the context of guidelines for variant annotation. While this work marks an important step forward, the strategy presented still has important limitations. We propose core principles and recommendations to overcome these limitations and to enable both more reliable and more impactful use of variant effect prediction models in the future.