Paternoster Lavinia, Budu-Aggrey Ashley, Brown Sara J
1. MRC Integrative Epidemiology Unit, Bristol Medical School, Population Health Sciences, The University of Bristol, Bristol, BS8 2BN, UK.
2. Centre for Genomics and Experimental Medicine, Institute for Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK.
Wellcome Open Res. 2024 May 20;7:36. doi: 10.12688/wellcomeopenres.17657.2. eCollection 2022.
Null mutations within the filaggrin (FLG) gene are established genetic risk factors for atopic dermatitis. Studies of FLG have typically used sequencing or bespoke genotyping. Large-scale population cohorts with genome-wide imputed data offer powerful opportunities for genetic analysis, but bespoke FLG genotyping is often not feasible in such studies. We therefore aimed to determine the quality of selected FLG null genotype data extracted from genome-wide imputed sources, focussing on UK population data.
We compared the allele frequencies of three FLG null mutations that can be detected by imputation (p.Arg501Ter, p.Arg2447Ter and p.Ser3247Ter; commonly referred to as R501X, R2447X and S3247X, respectively) in directly genotyped and genome-wide imputed data in the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. To investigate the usefulness of imputed FLG data, logistic regression was used to test the association of atopic dermatitis with imputed and genotyped FLG null mutations in ALSPAC and UK Biobank.
The three FLG null mutations appear to be well imputed in datasets that use the Haplotype Reference Consortium (HRC) panel for imputation (0.3% discordance compared with directly genotyped data). However, a greater proportion of null alleles than wild-type alleles failed imputation. Although the calling of FLG mutations in imputed data is imperfect, the imputed mutations are still strongly associated with atopic dermatitis (p-values between 7 × 10⁻⁷ and 5 × 10⁻⁵ in UK Biobank).
HRC-imputed data appear to be adequate for UK population-based genetic analysis of selected FLG null mutations (p.Arg501Ter, p.Arg2447Ter and p.Ser3247Ter).
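The comparison described in the abstract (allele frequencies and discordance between directly genotyped and imputed calls) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: genotypes are coded as hard calls giving the number of null alleles (0, 1 or 2), failed calls are coded as `None`, and the toy data are invented for illustration rather than drawn from ALSPAC or UK Biobank. The function names are hypothetical, and the authors' actual pipeline is not described at this level of detail.

```python
from typing import List, Optional, Sequence

Genotype = Optional[int]  # assumed coding: 0, 1 or 2 null alleles; None = failed call


def allele_frequency(calls: Sequence[Genotype]) -> float:
    """Null-allele frequency among individuals with a called genotype."""
    called = [c for c in calls if c is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each individual carries two alleles, hence the factor of 2.
    return sum(called) / (2 * len(called))


def discordance_rate(genotyped: Sequence[Genotype],
                     imputed: Sequence[Genotype]) -> float:
    """Fraction of individuals, called in both datasets, whose calls differ."""
    if len(genotyped) != len(imputed):
        raise ValueError("inputs must describe the same individuals in the same order")
    pairs = [(g, i) for g, i in zip(genotyped, imputed)
             if g is not None and i is not None]
    if not pairs:
        raise ValueError("no individuals called in both datasets")
    mismatches = sum(1 for g, i in pairs if g != i)
    return mismatches / len(pairs)


# Invented toy data for ten individuals at one variant (not real cohort data).
genotyped: List[Genotype] = [0, 0, 1, 0, 2, 0, 1, 0, 0, 0]
imputed: List[Genotype] = [0, 0, 1, 0, 2, 0, 0, 0, None, 0]

freq_genotyped = allele_frequency(genotyped)
freq_imputed = allele_frequency(imputed)
discordance = discordance_rate(genotyped, imputed)
```

In this toy example one heterozygote is imputed as wild-type and one individual fails imputation, so the imputed null-allele frequency falls below the genotyped one. That is the kind of pattern the abstract reports when it notes that null alleles fail imputation more often than wild-type alleles.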