Kiyotani Kazuma, Mai Tu H, Nakamura Yusuke
Section of Hematology/Oncology, Department of Medicine, The University of Chicago, Chicago, IL, USA.
Committee on Clinical Pharmacology and Pharmacogenomics, The University of Chicago, Chicago, IL, USA.
J Hum Genet. 2017 Mar;62(3):397-405. doi: 10.1038/jhg.2016.141. Epub 2016 Nov 24.
Accurate human leukocyte antigen (HLA) genotyping is critical in studies involving the immune system. Several algorithms have been developed to estimate HLA genotypes from whole-exome data. We compared the accuracy of seven algorithms, including Optitype, Polysolver and PHLAT, and investigated patterns and possible causes of miscalls using 12 clinical samples and 961 individuals from the 1000 Genomes Project. Optitype showed the highest accuracy, 97.2% for HLA class I alleles at second-field resolution, followed by Polysolver at 94.0% and PHLAT at 85.6%. In Optitype, 34 (21.1%) of 161 miscalls were across different serological types, and the common miscalls were HLA-A*26:01 to HLA-A*25:01, HLA-B*45:01 to HLA-B*44:15 and HLA-C*08:02 to HLA-C*05:01, with error rates of 4.1%, 10.0% and 4.1%, respectively. In Polysolver, 193 (55.9%) of 345 miscalls occurred across different serological alleles, and a specific genotyping error from HLA-A*25:01 to HLA-A*26:01 was observed in 93.3% of HLA-A*25:01 carriers, caused by HLA-A*25:01 sequence reads being dropped during the extraction of HLA reads. In PHLAT, 147 (59.8%) of 246 miscalls in HLA-A were due to erroneous assignment of multiple alleles to either HLA-A*01:22 or HLA-A*01:81. These results suggest that careful consideration is needed when using exome-based HLA class I genotyping data and applying these results in clinical settings.
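As an illustration of how calls can be compared at second-field resolution, the following is a minimal Python sketch, not the authors' pipeline. It assumes a per-allele, order-independent comparison of a called genotype against a reference genotype for one gene; the function names (`to_second_field`, `count_miscalls`, `percent`) and the example genotypes are hypothetical. The last lines simply recompute the miscall percentages quoted in the abstract from its own counts.

```python
from collections import Counter


def to_second_field(allele: str) -> str:
    """Trim an allele name such as 'HLA-A*26:01:01' to 'A*26:01'."""
    if allele.startswith("HLA-"):
        allele = allele[len("HLA-"):]
    gene, fields = allele.split("*")
    return gene + "*" + ":".join(fields.split(":")[:2])


def count_miscalls(truth, called):
    """Count reference alleles with no matching call (hypothetical helper).

    Both genotypes are lists of allele names for one gene; the comparison
    ignores allele order and any fields beyond the second.
    """
    truth_counts = Counter(to_second_field(a) for a in truth)
    called_counts = Counter(to_second_field(a) for a in called)
    matched = sum((truth_counts & called_counts).values())
    return len(truth) - matched


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Hypothetical example: an A*25:01 carrier miscalled as A*26:01.
truth = ["HLA-A*25:01", "HLA-A*02:01"]
called = ["HLA-A*02:01:01", "HLA-A*26:01"]
miscalls = count_miscalls(truth, called)

# Percentages recomputed from the counts given in the abstract.
optitype_cross_serotype = percent(34, 161)
polysolver_cross_serotype = percent(193, 345)
phlat_a0122_a0181 = percent(147, 246)
```

Trimming to two fields before comparing means that a call with extra fields, such as `HLA-A*02:01:01`, still counts as a match for `HLA-A*02:01`, which is what an evaluation at second-field resolution requires.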