Machine learning-based reclassification of germline variants of unknown significance: The RENOVO algorithm.

Author information

Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy.

Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Milan, Italy; Biomedical Translational Imaging Centre, Nova Scotia Health Authority and IWK Health Centre, Halifax, NS B3K 6R8, Canada.

Publication information

Am J Hum Genet. 2021 Apr 1;108(4):682-695. doi: 10.1016/j.ajhg.2021.03.010. Epub 2021 Mar 23.

DOI:10.1016/j.ajhg.2021.03.010

PMID:33761318

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8059374/

Abstract

The increasing scope of genetic testing allowed by next-generation sequencing (NGS) dramatically increased the number of genetic variants to be interpreted as pathogenic or benign for adequate patient management. Still, the interpretation process often fails to deliver a clear classification, resulting in either variants of unknown significance (VUSs) or variants with conflicting interpretation of pathogenicity (CIP); these represent a major clinical problem because they do not provide useful information for decision-making, causing a large fraction of genetically determined disease to remain undertreated. We developed a machine learning (random forest)-based tool, RENOVO, that classifies variants as pathogenic or benign on the basis of publicly available information and provides a pathogenicity likelihood score (PLS). Using the same feature classes recommended by guidelines, we trained RENOVO on established pathogenic/benign variants in ClinVar (training set accuracy = 99%) and tested its performance on variants whose interpretation has changed over time (test set accuracy = 95%). We further validated the algorithm on additional datasets including unreported variants validated either through expert consensus (ENIGMA) or laboratory-based functional techniques (on BRCA1/2 and SCN5A). On all datasets, RENOVO outperformed existing automated interpretation tools. On the basis of the above validation metrics, we assigned a defined PLS to all existing ClinVar VUSs, proposing a reclassification for 67% with >90% estimated precision. RENOVO provides a validated tool to reduce the fraction of uninterpreted or misinterpreted variants, tackling an area of unmet need in modern clinical genetics.
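The step from a pathogenicity likelihood score (PLS) to a proposed reclassification can be illustrated with a minimal sketch. It assumes a PLS in the range 0 to 1 and two cutoffs, `BENIGN_MAX` and `PATHOGENIC_MIN`, which are hypothetical values chosen for illustration and are not RENOVO's published thresholds. The function names and toy data are likewise assumptions. The sketch shows how variants with intermediate scores stay uncertain, and how precision and the fraction of variants reclassified could be estimated on variants with known labels.

```python
# Illustrative sketch only: cutoffs, names, and data are hypothetical,
# not taken from the RENOVO publication or its code.

BENIGN_MAX = 0.2       # hypothetical: PLS at or below this is called benign
PATHOGENIC_MIN = 0.8   # hypothetical: PLS at or above this is called pathogenic


def call_from_pls(pls, benign_max=BENIGN_MAX, pathogenic_min=PATHOGENIC_MIN):
    """Map a pathogenicity likelihood score to a three-way call."""
    if not 0.0 <= pls <= 1.0:
        raise ValueError("PLS is assumed to lie in [0, 1]")
    if pls >= pathogenic_min:
        return "pathogenic"
    if pls <= benign_max:
        return "benign"
    return "uncertain"


def summarize(scored):
    """Return (precision among called variants, fraction of variants called).

    `scored` is a list of (pls, true_label) pairs, where true_label is
    "pathogenic" or "benign". Variants called "uncertain" are excluded
    from the precision estimate but count toward the denominator of the
    fraction called.
    """
    calls = [(call_from_pls(pls), truth) for pls, truth in scored]
    decided = [(call, truth) for call, truth in calls if call != "uncertain"]
    if not decided:
        return None, 0.0
    correct = sum(1 for call, truth in decided if call == truth)
    return correct / len(decided), len(decided) / len(scored)


# Toy validation set with known labels (made up for this example).
toy = [
    (0.95, "pathogenic"),
    (0.88, "pathogenic"),
    (0.85, "benign"),
    (0.50, "pathogenic"),
    (0.45, "benign"),
    (0.15, "benign"),
    (0.05, "benign"),
    (0.10, "benign"),
]

precision, fraction_called = summarize(toy)
print(f"precision={precision:.3f}, fraction called={fraction_called:.2f}")
```

Moving the two cutoffs toward the extremes would typically raise precision while reducing the fraction of variants that receive a call, which is the trade-off behind reporting a reclassified fraction together with an estimated precision.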
