Finnish Red Cross Blood Service, Research and Development, Helsinki, Finland.
Finnish Red Cross Blood Service, Blood Service Biobank, Vantaa, Finland.
PLoS Comput Biol. 2024 Sep 16;20(9):e1011718. doi: 10.1371/journal.pcbi.1011718. eCollection 2024 Sep.
In addition to the classical HLA genes, the major histocompatibility complex (MHC) harbors many other polymorphic genes whose roles in disease associations and transplantation matching are less well established. To facilitate studies of the non-classical HLA and non-HLA genes in large patient and biobank cohorts, we trained imputation models for MICA, MICB, HLA-E, HLA-F and HLA-G alleles on genome SNP array data. Using both population-specific and multi-population 1000 Genomes references, we show that the alleles of these genes can be imputed accurately for screening and research purposes. The best imputation model for MICA, MICB, HLA-E, HLA-F and HLA-G achieved a mean accuracy of 99.3% (min, max: 98.6, 99.9). Furthermore, validation of the 1000 Genomes exome short-read sequencing-based allele calls against clinical-grade reference data showed an average accuracy of 99.8%, attesting to the quality of the 1000 Genomes data as an imputation reference. We also fitted the models to the SNP content of the Infinium Global Screening Array (GSA, Illumina, Inc.) and the Axiom Precision Medicine Research Array (PMRA, Thermo Fisher Scientific Inc.), with mean accuracies of 99.1% (min, max: 97.2, 100) and 98.9% (min, max: 97.4, 100), respectively.
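The "mean (min, max)" accuracy figures reported above can be illustrated with a minimal sketch. It assumes accuracy is the fraction of imputed allele calls that match the reference calls, with each sample's two alleles compared as an unordered pair; the abstract does not state the exact definition used in the study, so this is an assumption. The function names `allele_accuracy` and `summarize`, the allele labels, and the toy data are all hypothetical and are not taken from the study.

```python
from statistics import mean


def allele_accuracy(imputed, reference):
    """Fraction of allele calls that match the reference.

    Each element is one sample's pair of alleles. Pairs are compared
    without regard to order (an assumed definition of accuracy).
    """
    matched = 0
    total = 0
    for imp_pair, ref_pair in zip(imputed, reference):
        remaining = list(ref_pair)
        for allele in imp_pair:
            if allele in remaining:
                remaining.remove(allele)  # each reference allele matches once
                matched += 1
        total += len(ref_pair)
    return matched / total


def summarize(per_gene_accuracy):
    """Return (mean, min, max) over the per-gene accuracies."""
    values = list(per_gene_accuracy.values())
    return mean(values), min(values), max(values)


# Toy data with placeholder allele labels (hypothetical, for illustration only).
toy_imputed = {
    "MICA": [("a1", "a2"), ("a1", "a1")],
    "HLA-E": [("e1", "e2"), ("e1", "e1")],
}
toy_reference = {
    "MICA": [("a2", "a1"), ("a1", "a3")],
    "HLA-E": [("e1", "e2"), ("e1", "e1")],
}

per_gene = {
    gene: allele_accuracy(toy_imputed[gene], toy_reference[gene])
    for gene in toy_imputed
}
summary = summarize(per_gene)
```

With the toy data, three of the four MICA calls and all four HLA-E calls match, so the summary is a mean of 0.875 with a minimum of 0.75 and a maximum of 1.0.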