School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece.
Molecular Diagnostics Laboratory, INRaSTES, National Center for Scientific Research NCSR Demokritos, 15341 Athens, Greece.
Biomolecules. 2022 Oct 24;12(11):1552. doi: 10.3390/biom12111552.
Implementation of next-generation sequencing (NGS) for the genetic analysis of hereditary diseases has resulted in a vast number of genetic variants being identified daily, leading to inadequate variant interpretation and, consequently, a lack of useful clinical information for treatment decisions. Herein, we present MARGINAL 1.0.0, a machine learning (ML)-based software tool for the interpretation of rare and germline variants. MARGINAL classifies variants into three categories, namely (likely) pathogenic, of uncertain significance, and (likely) benign, implementing the criteria established by the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG-AMP). We first annotated variants using various sources. We then implemented the ACMG-AMP criteria automatically, and finally constructed the ML model for variant classification. To maximize accuracy, we compared the performance of eight different ML algorithms in a classification scheme based on a serial combination of two classifiers. The model showed high predictive ability, with maximum accuracy of 92% and 98%, recall of 92% and 98%, and specificity of 90% and 98% for the first and second classifiers, respectively. Our results indicate that using gene- and disease-specific automated ML software for clinical variant evaluation can minimize conflicting interpretations.
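The serial two-classifier scheme and the reported metrics can be illustrated with a minimal Python sketch. It is not the MARGINAL implementation. The abstract does not say what each stage decides, so the split used here (stage one flags variants of uncertain significance, stage two separates pathogenic from benign) is an assumption, and the `score` feature, the threshold rules, and all function names are hypothetical stand-ins for trained models.

```python
from typing import Callable, Dict, Sequence

Variant = Dict[str, float]  # hypothetical feature dictionary for one variant


def toy_is_uncertain(variant: Variant) -> bool:
    """Hypothetical stage-one rule: mid-range scores count as uncertain."""
    return 0.4 <= variant["score"] <= 0.6


def toy_is_pathogenic(variant: Variant) -> bool:
    """Hypothetical stage-two rule: high scores count as pathogenic."""
    return variant["score"] > 0.6


def cascade_classify(
    variant: Variant,
    is_uncertain: Callable[[Variant], bool],
    is_pathogenic: Callable[[Variant], bool],
) -> str:
    """Apply two binary classifiers in series to obtain one of three labels.

    Assumed stage order (not stated in the abstract): the second classifier
    only sees variants that the first one did not label as uncertain.
    """
    if is_uncertain(variant):
        return "uncertain significance"
    if is_pathogenic(variant):
        return "(likely) pathogenic"
    return "(likely) benign"


def binary_metrics(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Dict[str, float]:
    """Accuracy, recall (sensitivity), and specificity for one binary stage."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    nan = float("nan")  # returned when a ratio has an empty denominator
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "recall": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


if __name__ == "__main__":
    for score in (0.1, 0.5, 0.9):
        label = cascade_classify({"score": score}, toy_is_uncertain, toy_is_pathogenic)
        print(score, label)
    print(binary_metrics([True, True, False, False], [True, False, False, False]))
```

Each stage is scored separately, as in the abstract, so `binary_metrics` would be called once per classifier on that classifier's own true and predicted labels.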