Department of Immunotechnology, Lund University, Lund, Sweden.
Mol Immunol. 2018 Apr;96:61-68. doi: 10.1016/j.molimm.2018.02.013. Epub 2018 Feb 28.
Inference of antibody gene repertoires using transcriptome data has emerged as an alternative to the complex process of sequencing adaptive immune receptor germline gene loci. The diversity introduced during rearrangement of immunoglobulin heavy chain variable (IGHV), diversity, and joining genes has, however, been identified as potentially affecting inference specificity. In this study, we have addressed this issue by analysing the nucleotide composition of unmutated human immunoglobulin heavy chain-encoding transcripts, focusing on the 3′-most bases of 47 IGHV germline genes. Although transcripts derived from some of the germline genes predominantly incorporated the germline-encoded base even at position 320, the last base of most IGHV genes, transcripts originating from other genes presented other nucleotides to the same extent at this position. In transcripts derived from two of the germline genes, IGHV3-13*01 and IGHV4-30-2*01, the predominating nucleotide (G) was in fact not that of the gene (A). Hence, we suggest that inference of IGHV genes should be limited to bases preceding nucleotide 320, as inference beyond this would jeopardize the specificity of the inference process. The different degree of incorporation of the final base of the IGHV gene directly influences the distribution of amino acids in the ascending strand of the third complementarity determining region of the heavy chain, and thereby the nature of this specificity-determining part of the antibody population. In addition, we present data that indicate the existence of a common, so far unrecognized allelic variant of IGHV3-7 that carries an A318G difference in relation to IGHV3-7*02.
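The per-position comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the input format (pairs of assigned allele and V-region sequence, already aligned so that string index `position - 1` corresponds to the numbered IGHV position), and the toy records are all assumptions made for illustration. The toy data use a 5-base sequence and made-up allele names rather than real germline genes.

```python
from collections import Counter, defaultdict


def base_usage_at_position(records, position=320):
    """Tally the base seen at a 1-based position, grouped by assigned allele.

    `records` is an iterable of (allele, sequence) pairs. Sequences are
    assumed (hypothetically) to be aligned so that index position - 1
    holds the base at the numbered position. Shorter reads are skipped.
    """
    counts = defaultdict(Counter)
    for allele, seq in records:
        if len(seq) >= position:
            counts[allele][seq[position - 1].upper()] += 1
    return counts


def predominant_base(counter):
    """Return the most frequent base and its fraction of all observations."""
    base, n = counter.most_common(1)[0]
    return base, n / sum(counter.values())


# Toy records with made-up allele names; position 5 stands in for the
# last base of the gene.
toy = [
    ("IGHV_A", "ACGTA"),
    ("IGHV_A", "ACGTG"),
    ("IGHV_A", "ACGTG"),
    ("IGHV_B", "ACGTA"),
    ("IGHV_B", "ACG"),  # too short to cover position 5, so it is skipped
]

usage = base_usage_at_position(toy, position=5)
for allele, counter in sorted(usage.items()):
    base, fraction = predominant_base(counter)
    print(allele, base, round(fraction, 2))
```

Comparing the predominant base reported for each allele with the base in its germline reference would flag cases like the two described above, where the most common transcript base differs from the germline base.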