Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, 431-3192, Japan.
Department of Pediatrics, Showa University School of Medicine, Tokyo, 142-8555, Japan.
Sci Rep. 2024 Oct 21;14(1):24746. doi: 10.1038/s41598-024-75020-0.
Variant annotations are crucial for the efficient identification of pathogenic variants. In this study, we retrospectively analyzed the utility of four annotation tools (allele frequency, ClinVar, SpliceAI, and PhenoMatcher) in identifying 271 pathogenic single nucleotide variants and small insertions/deletions (SNVs/small indels). Although variant filtering based on allele frequency is essential for narrowing down candidate variants, we found that 13 de novo pathogenic variants in autosomal dominant or X-linked dominant genes are registered in gnomAD v4.0 or 54KJPN with an allele frequency of less than 0.001%, suggesting that very rare variants in large cohort data can be pathogenic de novo variants. Notably, 38.4% of the candidate SNVs/small indels are registered in the ClinVar database as pathogenic or likely pathogenic, which highlights the significance of this database. SpliceAI can detect candidate variants affecting RNA splicing, and it led to the identification of four variants located 11 to 50 bp away from the exon-intron boundary. Prioritization of candidate genes by proband phenotype using the PhenoMatcher module revealed that approximately 95% of the candidate genes had a maximum PhenoMatch score ≥ 0.6, suggesting the utility of phenotype-based variant prioritization. Our results suggest that combining multiple annotation tools with appropriate evaluation can improve the diagnosis of rare diseases.
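As a rough illustration of how the four annotation types could be combined, the following Python sketch keeps rare variants and ranks them by how many lines of evidence support them. It is not the authors' pipeline: the `Variant` fields, the helper names, and the SpliceAI cutoff of 0.2 are hypothetical. The 0.001% allele frequency and the PhenoMatch score of 0.6 appear in the abstract as observations, and using them as cutoffs here is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

# All names and cutoffs are illustrative assumptions, not the study's method.
AF_CUTOFF = 1e-5           # 0.001% written as a fraction (assumed cutoff)
SPLICEAI_CUTOFF = 0.2      # assumed delta-score cutoff, not taken from the source
PHENOMATCH_CUTOFF = 0.6    # score level mentioned in the abstract (assumed cutoff)


@dataclass
class Variant:
    name: str
    allele_frequency: Optional[float]  # None means absent from the cohort data
    clinvar: Optional[str]             # e.g. "Pathogenic", "Likely pathogenic"
    spliceai_delta: float              # hypothetical maximum delta score
    max_phenomatch: float              # hypothetical maximum PhenoMatch score


def is_rare(v: Variant) -> bool:
    # Keep variants that are absent or very rare rather than requiring absence,
    # because very rare registered variants can still be pathogenic.
    return v.allele_frequency is None or v.allele_frequency < AF_CUTOFF


def evidence_flags(v: Variant) -> List[str]:
    flags = []
    if v.clinvar in {"Pathogenic", "Likely pathogenic"}:
        flags.append("clinvar")
    if v.spliceai_delta >= SPLICEAI_CUTOFF:
        flags.append("splicing")
    if v.max_phenomatch >= PHENOMATCH_CUTOFF:
        flags.append("phenotype")
    return flags


def prioritize(variants: List[Variant]) -> List[Variant]:
    # Filter by rarity, then put variants with more supporting evidence first.
    rare = [v for v in variants if is_rare(v)]
    return sorted(rare, key=lambda v: len(evidence_flags(v)), reverse=True)
```

Calling `prioritize` on a list of annotated variants returns only the rare ones, ordered by evidence count. Any variant ranked this way would still need manual evaluation.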