Kopernik Arina, Sayganova Mariia, Zobkova Gaukhar, Doroschuk Natalia, Smirnova Anna, Molodtsova-Zolotukhina Daria, Sagaydak Olesya, Ryzhkova Oxana, Kutsev Sergey, Groznova Olga, Melikyan Lyusya, Bondarchuk Elizaveta, Woroncow Mary, Albert Eugene, Bogdanov Viktor, Volchkov Pavel
Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies, Moscow, Russia, 125315.
Evogen LLC, Moscow, Russia.
Sci Rep. 2025 Jan 29;15(1):3621. doi: 10.1038/s41598-025-87814-x.
Next-generation sequencing (NGS) technologies have made it possible to analyze millions of variants simultaneously. Although data quality has improved, variants generally still need to be confirmed before reporting. In recent years, however, the prevailing view has been that quality thresholds can be defined for "high-quality" variants that do not require orthogonal validation. Nevertheless, no study to date has reported the concordance between variants from whole genome sequencing (WGS) and their gold-standard Sanger validation. In this study, we analyzed the concordance for 1756 WGS variants in order to establish appropriate thresholds for filtering high-quality variants. The resulting thresholds allowed us to drastically reduce the number of variants requiring validation, to 4.8% and 1.2% of the initial set for the caller-agnostic (DP, AF) and caller-dependent (QUAL) thresholds, respectively.
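The two kinds of threshold described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the cut-off values, so `MIN_DP`, `MIN_AF` and `MIN_QUAL` are hypothetical placeholders, and the `Variant` class and function names are invented for illustration. The sketch assumes a variant is sent for Sanger validation whenever it falls below the chosen threshold(s).

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Variant:
    """Minimal stand-in for one called variant (hypothetical structure)."""
    variant_id: str
    dp: int       # DP: read depth at the variant site
    af: float     # AF: allele frequency of the alternate allele in the sample
    qual: float   # QUAL: caller-reported quality score


# Hypothetical cut-offs, NOT taken from the paper.
MIN_DP = 20
MIN_AF = 0.3
MIN_QUAL = 100.0


def needs_validation_agnostic(v: Variant,
                              min_dp: int = MIN_DP,
                              min_af: float = MIN_AF) -> bool:
    """Caller-agnostic rule: validate if depth OR allele frequency is too low."""
    return v.dp < min_dp or v.af < min_af


def needs_validation_qual(v: Variant, min_qual: float = MIN_QUAL) -> bool:
    """Caller-dependent rule: validate if the caller's QUAL score is too low."""
    return v.qual < min_qual


def fraction_needing_validation(variants: Sequence[Variant],
                                rule: Callable[[Variant], bool]) -> float:
    """Share of the variant set that a given rule would send to validation."""
    if not variants:
        return 0.0
    return sum(1 for v in variants if rule(v)) / len(variants)


if __name__ == "__main__":
    # Toy records for demonstration only; these are not data from the study.
    toy = [
        Variant("toy1", dp=35, af=0.50, qual=500.0),
        Variant("toy2", dp=12, af=0.45, qual=220.0),
        Variant("toy3", dp=40, af=0.18, qual=60.0),
        Variant("toy4", dp=28, af=1.00, qual=900.0),
    ]
    print(fraction_needing_validation(toy, needs_validation_agnostic))
    print(fraction_needing_validation(toy, needs_validation_qual))
```

Keeping the two rules as separate functions mirrors the distinction drawn in the abstract: DP and AF can be computed from any caller's output, whereas QUAL is scaled differently by each variant caller, so a QUAL cut-off would need to be re-derived for each caller.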