Accurate multi-population imputation of MICA, MICB, HLA-E, HLA-F and HLA-G alleles from genome SNP data.

Affiliations

Finnish Red Cross Blood Service, Research and Development, Helsinki, Finland.

Finnish Red Cross Blood Service, Blood Service Biobank, Vantaa, Finland.

Publication information

PLoS Comput Biol. 2024 Sep 16;20(9):e1011718. doi: 10.1371/journal.pcbi.1011718. eCollection 2024 Sep.

Abstract

In addition to the classical HLA genes, the major histocompatibility complex (MHC) harbors many other polymorphic genes whose roles in disease associations and transplantation matching are less well established. To facilitate studies of the non-classical and non-HLA genes in large patient and biobank cohorts, we trained imputation models for MICA, MICB, HLA-E, HLA-F and HLA-G alleles on genome SNP array data. We show, using both population-specific and multi-population 1000 Genomes references, that the alleles of these genes can be accurately imputed for screening and research purposes. The best imputation model for MICA, MICB, HLA-E, -F and -G achieved a mean accuracy of 99.3% (min, max: 98.6, 99.9). Furthermore, validation of the 1000 Genomes exome short-read sequencing-based allele calling against clinical-grade reference data showed an average accuracy of 99.8%, attesting to the quality of the 1000 Genomes data as an imputation reference. We also fitted the models for Infinium Global Screening Array (GSA, Illumina, Inc.) and Axiom Precision Medicine Research Array (PMRA, Thermo Fisher Scientific Inc.) SNP content, with mean accuracies of 99.1% (min, max: 97.2, 100) and 98.9% (min, max: 97.4, 100), respectively.
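The accuracies above are reported as a mean with a minimum and maximum across genes. The abstract does not state how accuracy was defined, so the following is only a minimal sketch under one common assumption: accuracy is the fraction of reference allele calls recovered by imputation, compared without regard to phase. The function name `allele_accuracy` and the toy genotypes are hypothetical and are not taken from the study.

```python
from collections import Counter
from statistics import mean


def allele_accuracy(imputed, reference):
    """Fraction of reference alleles matched by imputed alleles.

    Both arguments map sample ID -> pair of allele names. Pairs are
    compared as multisets, so allele order (phase) is ignored.
    This definition is an assumption, not the study's stated metric.
    """
    matched = 0
    total = 0
    for sample, ref_pair in reference.items():
        imp_pair = imputed[sample]
        # Multiset intersection counts each correctly imputed allele once.
        overlap = Counter(imp_pair) & Counter(ref_pair)
        matched += sum(overlap.values())
        total += len(ref_pair)
    return matched / total


# Toy, made-up genotypes for two genes and two samples (illustration only).
reference = {
    "MICA": {"s1": ("MICA*008", "MICA*002"), "s2": ("MICA*008", "MICA*008")},
    "HLA-E": {"s1": ("E*01:01", "E*01:03"), "s2": ("E*01:01", "E*01:01")},
}
imputed = {
    "MICA": {"s1": ("MICA*002", "MICA*008"), "s2": ("MICA*008", "MICA*004")},
    "HLA-E": {"s1": ("E*01:01", "E*01:03"), "s2": ("E*01:01", "E*01:01")},
}

# Per-gene accuracy, then the mean (min, max) summary used in the abstract.
per_gene = {gene: allele_accuracy(imputed[gene], reference[gene]) for gene in reference}
summary = (mean(per_gene.values()), min(per_gene.values()), max(per_gene.values()))

for gene, acc in per_gene.items():
    print(f"{gene}: {acc:.1%}")
print("mean {:.1%} (min, max: {:.1%}, {:.1%})".format(*summary))
```

Comparing allele pairs as multisets keeps a homozygous call from being counted as correct twice when only one copy matches, which a simple set comparison would allow.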

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dac3/11426482/32243e618d9b/pcbi.1011718.g001.jpg
