Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy.
Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy; Biomedical Translational Imaging Centre, Nova Scotia Health Authority and IWK Health Centre, Halifax, NS B3K 6R8, Canada.
Am J Hum Genet. 2021 Apr 1;108(4):682-695. doi: 10.1016/j.ajhg.2021.03.010. Epub 2021 Mar 23.
The increasing scope of genetic testing allowed by next-generation sequencing (NGS) has dramatically increased the number of genetic variants that must be interpreted as pathogenic or benign for adequate patient management. Still, the interpretation process often fails to deliver a clear classification, resulting in either variants of unknown significance (VUSs) or variants with conflicting interpretations of pathogenicity (CIP). These represent a major clinical problem because they do not provide useful information for decision-making, causing a large fraction of genetically determined disease to remain undertreated. We developed a machine learning (random forest)-based tool, RENOVO, that classifies variants as pathogenic or benign on the basis of publicly available information and provides a pathogenicity likelihood score (PLS). Using the same feature classes recommended by guidelines, we trained RENOVO on established pathogenic/benign variants in ClinVar (training set accuracy = 99%) and tested its performance on variants whose interpretation has changed over time (test set accuracy = 95%). We further validated the algorithm on additional datasets, including unreported variants validated either through expert consensus (ENIGMA) or through laboratory-based functional techniques (on BRCA1/2 and SCN5A). On all datasets, RENOVO outperformed existing automated interpretation tools. On the basis of these validation metrics, we assigned a defined PLS to all existing ClinVar VUSs, proposing a reclassification for 67% with >90% estimated precision. RENOVO provides a validated tool to reduce the fraction of uninterpreted or misinterpreted variants, tackling an area of unmet need in modern clinical genetics.
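The abstract reports both the fraction of VUSs that receive a proposed reclassification and the estimated precision of those proposals. The sketch below illustrates one way such figures could be derived from a score in [0, 1]. It is not RENOVO's implementation: the cutoffs `BENIGN_MAX` and `PATHOGENIC_MIN`, the function names, and the toy validation data are all hypothetical and chosen only for illustration.

```python
# Hypothetical cutoffs for illustration only; these are NOT the published RENOVO values.
BENIGN_MAX = 0.2
PATHOGENIC_MIN = 0.8


def call_from_pls(pls):
    """Map a pathogenicity likelihood score (PLS) in [0, 1] to a proposed call."""
    if not 0.0 <= pls <= 1.0:
        raise ValueError(f"PLS must lie in [0, 1], got {pls}")
    if pls >= PATHOGENIC_MIN:
        return "pathogenic"
    if pls <= BENIGN_MAX:
        return "benign"
    return "uncertain"  # score too close to the middle to propose a call


def coverage_and_precision(scored):
    """Summarise calls on a labelled validation set.

    `scored` is a list of (pls, true_label) pairs.
    Returns (coverage, precision):
      coverage  = fraction of variants that receive a definite call
      precision = fraction of definite calls that match the true label
    """
    if not scored:
        raise ValueError("validation set is empty")
    calls = [(call_from_pls(pls), truth) for pls, truth in scored]
    definite = [(call, truth) for call, truth in calls if call != "uncertain"]
    coverage = len(definite) / len(calls)
    if not definite:
        return coverage, float("nan")
    precision = sum(call == truth for call, truth in definite) / len(definite)
    return coverage, precision


# Toy, made-up validation data: (PLS, label established independently).
toy_validation = [
    (0.95, "pathogenic"),
    (0.05, "benign"),
    (0.50, "benign"),
    (0.85, "benign"),
    (0.10, "benign"),
]
toy_coverage, toy_precision = coverage_and_precision(toy_validation)
```

Moving the two cutoffs apart trades coverage for precision: fewer variants get a definite call, but a larger share of those calls is expected to be correct.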