Barbitoff Yury A, Khmelkova Darya N, Pomerantseva Ekaterina A, Slepchenkov Aleksandr V, Zubashenko Nikita A, Mironova Irina V, Kaimonov Vladimir S, Polev Dmitrii E, Tsay Victoria V, Glotov Andrey S, Aseev Mikhail V, Shcherbak Sergey G, Glotov Oleg S, Isaev Arthur A, Predeus Alexander V
CerbaLab Ltd., St. Petersburg 199106, Russia.
Bioinformatics Institute, St. Petersburg 197342, Russia.
Natl Sci Rev. 2024 Sep 14;11(10):nwae326. doi: 10.1093/nsr/nwae326. eCollection 2024 Oct.
Population allele frequency is crucial for accurate interpretation of known and novel variants in medical genetics. Recently, several large allele frequency databases, such as the Genome Aggregation Database (gnomAD), have been created to serve as a global reference for such studies. However, frequencies of many rare alleles vary dramatically between populations, and population-specific allele frequency is often more informative than the global one. Many countries and regions, including Russia, remain poorly studied from a genetic perspective. Here, we report the first successful attempt to integrate genetic information between major medical genetic laboratories in Russia. We construct RUSeq, an open, large-scale reference set of genetic variants, by analyzing 7452 exome samples collected in two major Russian cities, Moscow and St. Petersburg. An ∼10-fold increase in sample size compared to previous studies allowed us to characterize extensive genetic diversity within the admixed Russian population, with contributions from several major ancestral groups. We highlight 51 known pathogenic variants that are overrepresented in Russia compared to other European countries. We also identify several dozen high-impact variants that are present in healthy donors despite being annotated as pathogenic in ClinVar and falling within genes associated with autosomal dominant disorders. The resulting database of genetic variant frequencies in Russia is available to the medical genetics community through a variant browser at http://ruseq.ru.
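As an illustration of what "overrepresented" can mean in practice, the sketch below computes an allele frequency from allele counts and compares two cohorts with a one-sided Fisher's exact test. It is a minimal sketch under stated assumptions, not the method used to build RUSeq: the function names are hypothetical, the allele counts are invented for illustration, and only the allele number of 14904 (2 alleles for each of the 7452 samples, assuming every sample is called at the site) is derived from the abstract.

```python
from math import comb


def allele_frequency(allele_count: int, allele_number: int) -> float:
    """AF = alternate allele count / total called alleles (2 per diploid sample)."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def fisher_one_sided(ac_a: int, an_a: int, ac_b: int, an_b: int) -> float:
    """One-sided Fisher's exact test for enrichment in cohort A.

    Returns the hypergeometric probability of seeing at least ac_a
    alternate alleles in cohort A, given the pooled allele counts.
    """
    total_alt = ac_a + ac_b
    total = an_a + an_b
    denom = comb(total, an_a)
    upper = min(total_alt, an_a)
    tail = sum(
        comb(total_alt, k) * comb(total - total_alt, an_a - k)
        for k in range(ac_a, upper + 1)
    )
    return tail / denom


# Hypothetical counts for a single variant (NOT taken from RUSeq or gnomAD).
AN_LOCAL = 2 * 7452          # 14904 alleles if all 7452 exomes are called
AC_LOCAL = 12                # invented alternate allele count, local cohort
AC_REF, AN_REF = 10, 100000  # invented counts for a reference cohort

af_local = allele_frequency(AC_LOCAL, AN_LOCAL)
af_ref = allele_frequency(AC_REF, AN_REF)
fold = af_local / af_ref
p_value = fisher_one_sided(AC_LOCAL, AN_LOCAL, AC_REF, AN_REF)

print(f"local AF={af_local:.6f} reference AF={af_ref:.6f} "
      f"fold={fold:.2f} p={p_value:.3g}")
```

The exact test is used here rather than a normal approximation because rare-variant counts are small. A real analysis would also need to account for per-site coverage, population structure, and multiple testing.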