Larjo Antti, Eveleigh Robert, Kilpeläinen Elina, Kwan Tony, Pastinen Tomi, Koskela Satu, Partanen Jukka
Finnish Red Cross Blood Service, Helsinki, Finland.
McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada.
Front Immunol. 2017 Dec 13;8:1815. doi: 10.3389/fimmu.2017.01815. eCollection 2017.
The human leukocyte antigen (HLA) genes encode proteins that play a central role in immune function by presenting peptide antigens to T cells. Because HLA genes are extremely polymorphic, HLA typing at the allele level is demanding and relies on DNA sequencing. Determining HLA alleles is warranted because they are major genetic risk factors in autoimmune diseases and are matched in transplantation. Here, we compared the accuracy of several published HLA-typing algorithms that are based on next-generation sequencing (NGS) data. As genome sequencing is becoming increasingly common in research, we wanted to test how well HLA alleles can be deduced from genome data produced in studies with objectives other than HLA typing and on platforms not specifically designed for HLA typing. Accuracy was assessed using two datasets: NGS data produced on an in-house sequencing platform and covering the full 4 Mbp HLA segment, from 94 stem cell transplantation patients, and exome sequences from 63 samples of the 1000 Genomes collection. In the patient dataset, none of the programs gave perfect results for all samples and genes when run with default settings. However, we found that ensemble prediction of the results, or modification of the settings, could improve accuracy. For the exome-only data, most of the algorithms did not perform very well. The results indicate that using these algorithms for accurate HLA allele determination is not straightforward when the NGS data are not specifically targeted at HLA typing, and that their accurate use requires HLA expertise.
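The abstract does not say how the ensemble prediction was built. As one possible illustration only, the sketch below takes a simple majority vote over the genotype that each typing program reports for one gene in one sample. The function names, the tool labels, the two-field resolution, and the `min_votes` threshold are all assumptions made for this example and are not taken from the study.

```python
from collections import Counter


def truncate(allele, fields=2):
    """Trim an allele name such as 'A*02:01:01:01' to its first `fields` fields.

    Assumes standard 'GENE*field:field:...' nomenclature.
    """
    gene, _, rest = allele.partition("*")
    return f"{gene}*{':'.join(rest.split(':')[:fields])}"


def ensemble_call(calls, fields=2, min_votes=2):
    """Majority vote over per-tool genotype calls (hypothetical scheme).

    calls: dict mapping a tool label to an (allele1, allele2) pair
           for a single gene in a single sample.
    Returns the winning genotype as a sorted tuple, or None when the
    top genotype has fewer than `min_votes` votes or is tied.
    """
    # Compare genotypes as unordered pairs at a common resolution.
    votes = Counter(
        tuple(sorted(truncate(a, fields) for a in pair))
        for pair in calls.values()
    )
    if not votes:
        return None
    ranked = votes.most_common(2)
    top, n = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == n
    if n < min_votes or tied:
        return None
    return top


# Made-up calls from three hypothetical tools for HLA-A in one sample.
example = {
    "tool_a": ("A*02:01:01:01", "A*24:02"),
    "tool_b": ("A*24:02:01", "A*02:01"),
    "tool_c": ("A*02:01", "A*03:01"),
}
consensus = ensemble_call(example)
```

Returning `None` when there is no clear majority leaves ambiguous cases for manual review instead of forcing a call, in line with the abstract's point that accurate use of these tools requires HLA expertise.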