Yu Yaxuan, Ceredig Rhodri, Seoighe Cathal
School of Mathematics, Statistics and Applied Mathematics, National University of Ireland Galway, Galway H91 HX31, Ireland.
Discipline of Psychology, National University of Ireland Galway, Galway H91 HX31, Ireland.
J Immunol. 2017 Mar 1;198(5):2202-2210. doi: 10.4049/jimmunol.1601710. Epub 2017 Jan 23.
High-throughput sequencing data from TCRs and Igs can provide valuable insights into the adaptive immune response, but bioinformatics pipelines for analysis of these data are constrained by the availability of accurate and comprehensive repositories of TCR and Ig alleles. We have created an analytical pipeline to recover immune receptor alleles from genome sequencing data. Applying this pipeline to data from the 1000 Genomes Project, we have created Lym1K, a collection of immune receptor alleles that combines known, well-supported alleles with novel alleles found in the 1000 Genomes Project data. We show that Lym1K leads to a significant improvement in the alignment of short read sequences from immune receptors, and that the addition of novel alleles discovered from genome sequence data is likely to be particularly significant for comprehensive analysis of populations that are not currently well represented in existing repositories of immune alleles.