HLA*IMP--an integrated framework for imputing classical HLA alleles from SNP genotypes.

Affiliation

Department of Statistics, University of Oxford, Oxford, UK.

Publication information

Bioinformatics. 2011 Apr 1;27(7):968-72. doi: 10.1093/bioinformatics/btr061. Epub 2011 Feb 7.

DOI:10.1093/bioinformatics/btr061

PMID:21300701

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3065693/

Abstract

MOTIVATION

Genetic variation at classical HLA alleles influences many phenotypes, including susceptibility to autoimmune disease, resistance to pathogens and the risk of adverse drug reactions. However, classical HLA typing methods are often prohibitively expensive for large-scale studies. We previously described a method for imputing classical alleles from linked SNP genotype data. Here, we present a modification of the original algorithm implemented in a freely available software suite that combines local data preparation and QC with probabilistic imputation through a remote server.

RESULTS

We introduce two modifications to the original algorithm. First, we present a novel SNP selection function that leads to pronounced increases (up by 40% in some scenarios) in call rate. Second, we develop a parallelized model building algorithm that allows us to process a reference set of over 2500 individuals. In a validation experiment, we show that our framework produces highly accurate HLA type imputations at class I and class II loci for independent datasets: at call rates of 95-99%, imputation accuracy is between 92% and 98% at the four-digit level and over 97% at the two-digit level. We demonstrate the utility of the method through analysis of a genome-wide association study for psoriasis where there is a known classical HLA risk allele (HLA-C*06:02). We show that the imputed allele shows stronger association with disease than any single SNP within the region. The imputation framework, HLA*IMP, provides a powerful tool for dissecting the architecture of genetic risk within the HLA.
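The relationship between call rate and accuracy reported above can be illustrated with a small sketch. It is not the HLA*IMP implementation: it assumes, for illustration only, that each imputed chromosome comes with a posterior probability per candidate allele, that a call is made only when the best posterior reaches a chosen threshold, and that two-digit posteriors are obtained by summing the four-digit posteriors within each group. The function names, the threshold value and the toy data are all hypothetical.

```python
from collections import defaultdict


def to_two_digit(allele):
    """Collapse a four-digit name such as 'C*06:02' to its two-digit group 'C*06'."""
    return allele.split(":")[0]


def collapse_posteriors(posteriors):
    """Sum four-digit posterior probabilities within each two-digit group."""
    grouped = defaultdict(float)
    for allele, prob in posteriors.items():
        grouped[to_two_digit(allele)] += prob
    return dict(grouped)


def call_rate_and_accuracy(samples, threshold, two_digit=False):
    """Return (call rate, accuracy among called alleles).

    samples: list of (posteriors, truth) pairs, where posteriors maps
    candidate allele names to probabilities and truth is the lab-typed allele.
    Assumption: a call is made only if the best posterior >= threshold.
    """
    called = correct = 0
    for posteriors, truth in samples:
        if two_digit:
            posteriors = collapse_posteriors(posteriors)
            truth = to_two_digit(truth)
        best = max(posteriors, key=posteriors.get)
        if posteriors[best] >= threshold:
            called += 1
            correct += best == truth
    call_rate = called / len(samples)
    accuracy = correct / called if called else float("nan")
    return call_rate, accuracy


# Toy data (invented for illustration, not taken from the paper).
samples = [
    ({"C*06:02": 0.97, "C*07:01": 0.03}, "C*06:02"),
    ({"C*07:01": 0.55, "C*07:02": 0.40, "C*06:02": 0.05}, "C*07:02"),
    ({"C*03:04": 0.92, "C*03:03": 0.08}, "C*03:04"),
    ({"C*12:03": 0.80, "C*06:02": 0.20}, "C*06:02"),
]

print(call_rate_and_accuracy(samples, threshold=0.7))
print(call_rate_and_accuracy(samples, threshold=0.7, two_digit=True))
```

In this toy example, the second sample is ambiguous between two four-digit alleles of the same two-digit group, so it is left uncalled at four-digit resolution but called correctly at two-digit resolution. This is one way to see why two-digit figures can exceed four-digit ones at a comparable threshold.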

AVAILABILITY

HLA*IMP, implemented in C++ and Perl, is available from http://oxfordhla.well.ox.ac.uk and is free for academic use.

Similar articles

A multi-ethnic reference panel to impute HLA classical and non-classical class I alleles in admixed samples: Testing imputation accuracy in an admixed sample from Brazil.

HLA. 2024 Jun;103(6):e15543. doi: 10.1111/tan.15543.

Imputation-Based HLA Typing with GWAS SNPs.

Methods Mol Biol. 2024;2809:127-143. doi: 10.1007/978-1-0716-3874-3_9.

Comparison of HLA allelic imputation programs.

PLoS One. 2017 Feb 16;12(2):e0172444. doi: 10.1371/journal.pone.0172444. eCollection 2017.

HIBAG--HLA genotype imputation with attribute bagging.

Pharmacogenomics J. 2014 Apr;14(2):192-200. doi: 10.1038/tpj.2013.18. Epub 2013 May 28.

Multi-population classical HLA type imputation.

PLoS Comput Biol. 2013;9(2):e1002877. doi: 10.1371/journal.pcbi.1002877. Epub 2013 Feb 14.

Significant variation between SNP-based HLA imputations in diverse populations: the last mile is the hardest.

Pharmacogenomics J. 2018 May 22;18(3):367-376. doi: 10.1038/tpj.2017.7. Epub 2017 Apr 25.

Imputing amino acid polymorphisms in human leukocyte antigens.

PLoS One. 2013 Jun 6;8(6):e64683. doi: 10.1371/journal.pone.0064683. Print 2013.

Construction and benchmarking of a multi-ethnic reference panel for the imputation of HLA class I and II alleles.

Hum Mol Genet. 2019 Jun 15;28(12):2078-2092. doi: 10.1093/hmg/ddy443.

Accurate HLA type inference using a weighted similarity graph.

BMC Bioinformatics. 2010 Dec 14;11 Suppl 11(Suppl 11):S10. doi: 10.1186/1471-2105-11-S11-S10.

Cited by

Advancements in Umbilical Cord Biobanking: A Comprehensive Review of Current Trends and Future Prospects.

Stem Cells Cloning. 2024 Dec 5;17:41-58. doi: 10.2147/SCCAA.S481072. eCollection 2024.

Accurate multi-population imputation of MICA, MICB, HLA-E, HLA-F and HLA-G alleles from genome SNP data.

PLoS Comput Biol. 2024 Sep 16;20(9):e1011718. doi: 10.1371/journal.pcbi.1011718. eCollection 2024 Sep.

Efficient HLA imputation from sequential SNPs data by transformer.

J Hum Genet. 2024 Oct;69(10):533-540. doi: 10.1038/s10038-024-01278-x. Epub 2024 Aug 2.

Rubella virus seropositivity after infection or vaccination as a risk factor for multiple sclerosis.

Eur J Neurol. 2024 Oct;31(10):e16387. doi: 10.1111/ene.16387. Epub 2024 Jul 18.

Genetics of immune response to Epstein-Barr virus: prospects for multiple sclerosis pathogenesis.

Brain. 2024 Oct 3;147(10):3573-3582. doi: 10.1093/brain/awae110.

Genotype imputation methods for whole and complex genomic regions utilizing deep learning technology.

J Hum Genet. 2024 Oct;69(10):481-486. doi: 10.1038/s10038-023-01213-6. Epub 2024 Jan 15.

Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease.

Nat Protoc. 2023 Sep;18(9):2625-2641. doi: 10.1038/s41596-023-00853-4. Epub 2023 Jul 26.

Blood donor biobank and HLA imputation as a resource for HLA homozygous cells for therapeutic and research use.

Stem Cell Res Ther. 2022 Oct 9;13(1):502. doi: 10.1186/s13287-022-03182-7.

Approaching Genetics Through the MHC Lens: Tools and Methods for HLA Research.

Front Genet. 2021 Dec 2;12:774916. doi: 10.3389/fgene.2021.774916. eCollection 2021.

Targeted analysis of genomic regions enriched in African ancestry reveals novel classical HLA alleles associated with asthma in Southwestern Europeans.

Sci Rep. 2021 Dec 8;11(1):23686. doi: 10.1038/s41598-021-02893-w.

References

A genome-wide association study identifies new psoriasis susceptibility loci and an interaction between HLA-C and ERAP1.

Nat Genet. 2010 Nov;42(11):985-90. doi: 10.1038/ng.694. Epub 2010 Oct 17.

Human leukocyte antigen class I and II alleles in non-Hodgkin lymphoma etiology.

Blood. 2010 Jun 10;115(23):4820-3. doi: 10.1182/blood-2010-01-266775. Epub 2010 Apr 12.

Bone marrow transplantation for primary immunodeficiency diseases.

Pediatr Clin North Am. 2010 Feb;57(1):207-37. doi: 10.1016/j.pcl.2009.12.004.

A flexible and accurate genotype imputation method for the next generation of genome-wide association studies.

PLoS Genet. 2009 Jun;5(6):e1000529. doi: 10.1371/journal.pgen.1000529. Epub 2009 Jun 19.

HLA and infectious diseases.

Clin Microbiol Rev. 2009 Apr;22(2):370-85, Table of Contents. doi: 10.1128/CMR.00048-08.

A mechanism for the HLA-A*01-associated risk for EBV+ Hodgkin lymphoma and infectious mononucleosis.

Blood. 2008 Sep 15;112(6):2589-90. doi: 10.1182/blood-2008-06-162883.

Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs.

Nat Genet. 2008 Oct;40(10):1253-60. doi: 10.1038/ng.237. Epub 2008 Sep 7.

Variation analysis and gene annotation of eight MHC haplotypes: the MHC Haplotype Project.

Immunogenetics. 2008 Jan;60(1):1-18. doi: 10.1007/s00251-007-0262-2. Epub 2008 Jan 10.

A statistical method for predicting classical HLA alleles from SNP data.

Am J Hum Genet. 2008 Jan;82(1):48-56. doi: 10.1016/j.ajhg.2007.09.001.

A second generation human haplotype map of over 3.1 million SNPs.

Nature. 2007 Oct 18;449(7164):851-61. doi: 10.1038/nature06258.
