Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan.
Center for Advanced Medical Innovation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka, 812-8582, Japan.
BMC Genomics. 2018 Nov 1;19(1):790. doi: 10.1186/s12864-018-5169-9.
Although human leukocyte antigen (HLA) genotyping from amplicon, whole exome sequence (WES), and RNA sequence data has become feasible in recent years, accurate genotyping from whole genome sequence (WGS) data remains challenging because of its low sequencing depth. Furthermore, no existing method can identify the sequences of unknown HLA types that are not registered in HLA databases.
We developed a Bayesian model, called ALPHLARD, that collects reads potentially generated from HLA genes and accurately determines a pair of HLA types for each of the HLA-A, -B, -C, -DPA1, -DPB1, -DQA1, -DQB1, and -DRB1 genes at 3rd field resolution. ALPHLARD can also detect rare germline variants not stored in HLA databases and call somatic mutations from paired normal and tumor sequence data. We illustrate its capability using 253 WES and 25 WGS datasets from Illumina platforms. Compared with HLA genotypes obtained by sequence-based typing (SBT) and amplicon sequencing, ALPHLARD achieved an accuracy of 98.8% for WES data and 98.5% for WGS data at 2nd field resolution. We also detected three somatic point mutations and one case of loss of heterozygosity in the HLA genes from the WGS data.
ALPHLARD showed good performance for HLA genotyping even from low-coverage data. It also has the potential to detect rare germline variants and somatic mutations in HLA genes, which would help fill current gaps in HLA reference databases and reveal the immunological significance of somatic mutations identified in HLA genes.
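The abstract reports genotyping at 3rd field resolution and evaluation at 2nd field resolution. As a rough illustration of what that distinction means, the Python sketch below truncates colon-separated HLA allele names to a given number of fields and counts how many alleles of a called pair agree with a reference pair. It is not part of ALPHLARD: the helper names `truncate_hla` and `allele_matches` are hypothetical, the allele strings are illustrative, and expression suffixes such as a trailing `N` are not handled.

```python
from collections import Counter


def truncate_hla(allele: str, fields: int) -> str:
    """Keep only the first `fields` colon-separated fields of an allele name.

    Example name layout (assumed): 'HLA-A*02:01:01:01'.
    """
    gene, star, rest = allele.partition("*")
    if not star:
        raise ValueError(f"not an HLA allele name: {allele!r}")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def allele_matches(called, truth, fields: int) -> int:
    """Count alleles shared by two unordered genotype pairs at a resolution."""
    called_counts = Counter(truncate_hla(a, fields) for a in called)
    truth_counts = Counter(truncate_hla(a, fields) for a in truth)
    # Multiset intersection, so a homozygous pair is counted correctly.
    return sum((called_counts & truth_counts).values())


# Illustrative genotypes for one gene (not data from the study).
called = ("A*02:01:01", "A*24:02:01")
truth = ("A*24:02:01", "A*02:01:02")

matches_2nd = allele_matches(called, truth, fields=2)
matches_3rd = allele_matches(called, truth, fields=3)
```

In this example both alleles agree at the 2nd field, while only one agrees at the 3rd field, which is why accuracy figures depend on the resolution at which calls are compared.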