Kobayashi Yuya, Yang Shan, Nykamp Keith, Garcia John, Lincoln Stephen E, Topper Scott E
Invitae Corporation, 1400 16th St., San Francisco, CA, 94103, USA.
Genome Med. 2017 Feb 6;9(1):13. doi: 10.1186/s13073-017-0403-7.
The frequency of a variant in the general population is a key criterion used in the clinical interpretation of sequence variants. With certain exceptions, such as founder mutations, the rarity of a variant is a prerequisite for pathogenicity. However, defining the threshold at which a variant should be considered "too common" is challenging, and diagnostic laboratories have therefore typically set conservative allele frequency thresholds.
Recent publications of large population sequencing data, such as the Exome Aggregation Consortium (ExAC) database, provide an opportunity to characterize with accuracy and precision the frequency distributions of very rare disease-causing alleles. Allele frequencies of pathogenic variants in ClinVar, as well as variants expected to be pathogenic through the nonsense-mediated decay (NMD) pathway, were analyzed to study the burden of pathogenic variants in 79 genes of clinical importance.
Of 1364 BRCA1 and BRCA2 variants that are well characterized as pathogenic or that are expected to lead to NMD, 1350 variants had an allele frequency of less than 0.0025%. The remaining 14 variants were previously published founder mutations. Importantly, we observed no difference between the allele frequency distributions of pathogenic variants expected to lead to NMD and those of pathogenic variants that are not. Therefore, we expanded the analysis to examine the distributions of NMD-expected variants in 77 additional genes. These 77 genes were selected to represent a broad set of clinical areas, modes of inheritance, and penetrance. Among these variants, most (97.3%) had an allele frequency of less than 0.01%. Furthermore, pathogenic variants with allele frequencies greater than 0.01% were well characterized in publications and included many founder mutations.
The observations made in this study suggest that, with certain caveats, a very low allele frequency threshold can be adopted to more accurately interpret sequence variants.
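As a minimal sketch of how such a threshold might be applied, the Python snippet below computes an allele frequency from an allele count and the total number of alleles sampled, then compares it against the two cutoffs mentioned above (0.0025% and 0.01%). The function names and the example counts are hypothetical and chosen for illustration only; they are not taken from ExAC or from the study's own pipeline. Only the 1350-of-1364 figure comes from the abstract.

```python
# Illustrative sketch only: names and example counts are assumptions,
# not the authors' method or any database's API.

# Thresholds from the abstract, converted from percent to fractions.
BRCA_THRESHOLD = 0.0025 / 100      # 0.0025%
GENERAL_THRESHOLD = 0.01 / 100     # 0.01%


def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Return the allele frequency as a fraction (count / alleles sampled)."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    if allele_count < 0 or allele_count > allele_number:
        raise ValueError("allele_count must be between 0 and allele_number")
    return allele_count / allele_number


def is_below_threshold(allele_count: int, allele_number: int,
                       threshold: float) -> bool:
    """True if the variant's frequency is strictly below the threshold."""
    return allele_frequency(allele_count, allele_number) < threshold


def percent_below(n_below: int, n_total: int) -> float:
    """Percentage of variants that fall below a threshold."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_below / n_total


if __name__ == "__main__":
    # Hypothetical variant seen once among 120,000 sampled alleles.
    print(is_below_threshold(1, 120_000, BRCA_THRESHOLD))
    # Hypothetical variant seen 30 times among 120,000 sampled alleles.
    print(is_below_threshold(30, 120_000, GENERAL_THRESHOLD))
    # Share of the 1364 BRCA1/BRCA2 variants reported below 0.0025%.
    print(round(percent_below(1350, 1364), 2))
```

A variant that exceeds the cutoff is not automatically benign: as the abstract notes, founder mutations are a known exception, so a filter like this would flag variants for review rather than classify them outright.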