Collins Andrew M, Ikutani Masashi, Puiu Daniela, Buck Gregory A, Nadkarni Aradhita, Gaeta Bruno
School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia.
J Immunol. 2004 Jan 1;172(1):340-8. doi: 10.4049/jimmunol.172.1.340.
The accurate partitioning of Ig H chain V(H)DJ(H) junctions and L chain V(L)J(L) junctions is problematic. We have developed a statistical approach for the partitioning of such sequences, by analyzing the distribution of point mutations between a determined V gene segment and putative Ig regions. The establishment of objective criteria for the partitioning of sequences between V(H), D, and J(H) gene segments has allowed us to analyze the intervening putative nontemplated (N) nucleotides more carefully. An analysis of 225 IgM H chain sequences, each with five or fewer V mutations, led to the alignment of 199 sequences. Only 5.0% of sequences lacked N nucleotides at the V(H)D junction (N1), and 10.6% lacked them at the DJ(H) junction (N2). Long N regions (>9 nt) were seen in 20.6% of N1 regions and 17.1% of N2 regions. When these long N regions were assessed by a statistical analysis based upon known features of N addition, together with mutation analysis, two of them aligned with D gene segments, and a third aligned with an inverted D gene segment. Nine additional sequences included possible alignments with a second D segment. Four of the remaining 40 long N1 regions included 5' sequences having six or more matches to V gene end motifs, which may be the result of V gene replacement. Such sequences were not seen in long N2 regions. The long N regions frequently seen in the expressed repertoire of human Ig gene rearrangements can therefore only partly be explained by V gene replacement and D-D fusion.
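The percentages above can be read back as approximate sequence counts. The following is a minimal sketch, not part of the published analysis: it assumes that every percentage is taken over the 199 aligned sequences (the abstract does not state the denominator explicitly), and the names `ALIGNED`, `reported_percent`, and `implied_count` are illustrative only.

```python
# Approximate sequence counts implied by the percentages in the abstract.
# ASSUMPTION: each percentage uses the 199 aligned sequences as denominator.
ALIGNED = 199

# Percentages as reported in the abstract.
reported_percent = {
    "no N1 nucleotides": 5.0,
    "no N2 nucleotides": 10.6,
    "long N1 (>9 nt)": 20.6,
    "long N2 (>9 nt)": 17.1,
}


def implied_count(percent, total=ALIGNED):
    """Return the nearest whole number of sequences for a reported percentage."""
    return round(percent * total / 100)


implied = {label: implied_count(p) for label, p in reported_percent.items()}

# Print each implied count together with the percentage it maps back to.
for label, n in implied.items():
    print(f"{label}: ~{n} of {ALIGNED} ({100 * n / ALIGNED:.1f}%)")
```

Under this assumption the implied counts are about 10 and 21 sequences without N1 and N2 nucleotides, and about 41 and 34 long N1 and N2 regions; each count rounds back to the percentage reported in the abstract.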