Impact of pre- and post-variant filtration strategies on imputation.

Authors

Charon Céline, Allodji Rodrigue, Meyer Vincent, Deleuze Jean-François

Affiliations

CEA Paris-Saclay, Institut François Jacob, Centre National de Recherche en Génomique Humaine, 2 rue Gaston Crémieux, Evry, 91057, France.

Radiation Epidemiology Group CESP, Inserm Unit 1018, Gustave Roussy Université Paris Saclay, 114 rue Edouard Vaillant, Villejuif, 94805, France.

Publication information

Sci Rep. 2021 Mar 18;11(1):6214. doi: 10.1038/s41598-021-85333-z.

Abstract

Quality control (QC) methods for genome-wide association studies and fine mapping are commonly used for imputation; however, they result in the loss of many single nucleotide polymorphisms (SNPs). To investigate the consequences of filtration on imputation, we studied the direct effects on the number of markers, their allele frequencies, imputation quality scores and post-filtration events. We pre-phased 1031 genotyped individuals from diverse ethnicities and compared the imputed variants to 1089 NCBI-recorded individuals for additional validation. Without QC-based variant pre-filtration, we observed no impairment in the imputation of SNPs that failed QC, whereas with pre-filtration there was an overall loss of information. Significant differences between frequencies with and without pre-filtration were found only in the ranges of very rare (5E-04 to 1E-03) and rare (1E-03 to 5E-03) variants (p < 1E-04). Increasing the post-filtration imputation quality score threshold from 0.3 to 0.8 reduced the number of single nucleotide variants (SNVs) with frequency < 0.001 by 2.5-fold, with or without QC pre-filtration, and halved the number of very rare variants (5E-04). Thus, to maintain confidence and a sufficient number of SNVs, we propose a two-step filtering procedure that allows less stringent filtering both prior to imputation and post-imputation, in order to increase the number of very rare and rare variants compared to conservative filtration methods.
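The post-imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It only counts how many variants in each frequency range survive a given imputation quality score cut-off. The `Variant` record, the `count_by_bin` helper, the "other" bin and the toy values are all hypothetical; only the very rare and rare ranges and the 0.3 and 0.8 thresholds come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    variant_id: str
    maf: float   # minor allele frequency of the imputed variant
    info: float  # imputation quality score (assumed to lie in [0, 1])


# Frequency bins as half-open intervals [low, high).
# "very_rare" and "rare" use the ranges quoted in the abstract;
# "other" is an illustrative catch-all for everything more common.
BINS = [
    ("very_rare", 5e-4, 1e-3),
    ("rare", 1e-3, 5e-3),
    ("other", 5e-3, 1.0),
]


def maf_bin(maf: float) -> str:
    """Return the name of the frequency bin that contains `maf`."""
    for name, low, high in BINS:
        if low <= maf < high:
            return name
    return "below_range"


def count_by_bin(variants, min_info: float) -> dict:
    """Count variants per frequency bin whose quality score is >= min_info."""
    counts = {}
    for v in variants:
        if v.info >= min_info:
            name = maf_bin(v.maf)
            counts[name] = counts.get(name, 0) + 1
    return counts


# Toy data, invented purely for illustration (not taken from the study).
toy = [
    Variant("v1", 7e-4, 0.35),
    Variant("v2", 8e-4, 0.85),
    Variant("v3", 2e-3, 0.50),
    Variant("v4", 3e-3, 0.90),
    Variant("v5", 0.2, 0.99),
    Variant("v6", 6e-4, 0.10),
]

if __name__ == "__main__":
    for threshold in (0.3, 0.8):
        print(threshold, count_by_bin(toy, threshold))
```

On this toy input, raising the cut-off from 0.3 to 0.8 drops one very rare and one rare variant while the common variant is retained. This mirrors the direction of the effect reported in the abstract, but not its magnitude.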

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/575e/7973508/7bb3fde92277/41598_2021_85333_Fig1_HTML.jpg
