Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY.
J Immunol. 2024 Sep 1;213(5):651-662. doi: 10.4049/jimmunol.2400131.
The expressed Ab repertoire is a critical determinant of immune-related phenotypes. Ab-encoding transcripts are distinct from other expressed genes because they are transcribed from somatically rearranged gene segments. Human Abs are composed of two identical H chain and two identical L chain polypeptides, encoded by genes in the IGH locus and in one of the two L chain loci, respectively. The combinatorial diversity that results from Ab gene rearrangement and the pairing of different H and L chains contributes to the immense diversity of the baseline Ab repertoire. During rearrangement, Ab gene selection is mediated by factors that influence chromatin architecture, promoter/enhancer activity, and V(D)J recombination. Interindividual variation in the composition of the Ab repertoire is associated with germline variation in IGH, implicating polymorphism in Ab gene regulation. Determining how IGH variants directly mediate gene regulation will require integrating these variants with other functional genomic datasets. In this study, we argue that standard approaches using short reads have limited utility for characterizing regulatory regions in IGH at haplotype resolution. Using simulated and chromatin immunoprecipitation sequencing reads, we define features of IGH that limit the use of short reads and a single reference genome, namely 1) the highly duplicated nature of the DNA sequence in IGH and 2) structural polymorphisms that are frequent in the population. We demonstrate that personalized diploid references improve the performance of short-read data for characterizing mappable portions of the locus, while also showing that long-read profiling tools will ultimately be needed to fully resolve the functional impacts of IGH germline variation on expressed Ab repertoires.
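The first limitation named above, sequence duplication, can be illustrated with a toy calculation. The sketch below is not the method used in the study. It is a minimal proxy that assumes error-free reads and exact-match placement, and it asks what fraction of read start positions yield a read whose sequence occurs only once in a reference. The function name `unique_kmer_fraction` and the toy sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter


def unique_kmer_fraction(seq: str, k: int) -> float:
    """Fraction of start positions whose length-k substring occurs exactly once.

    A crude stand-in for short-read mappability: a read of length k that
    matches more than one position cannot be placed unambiguously.
    """
    n_positions = len(seq) - k + 1
    if k <= 0 or n_positions <= 0:
        raise ValueError("k must be between 1 and len(seq)")
    # Count every k-mer in the sequence (forward strand only, for simplicity).
    counts = Counter(seq[i:i + k] for i in range(n_positions))
    # A position is "uniquely mappable" if its k-mer was seen exactly once.
    unique = sum(1 for i in range(n_positions) if counts[seq[i:i + k]] == 1)
    return unique / n_positions


if __name__ == "__main__":
    # Hypothetical toy reference: one 8-base unit duplicated between flanks.
    unit = "ACGTTGCA"
    toy_reference = "GGA" + unit + "TTC" + unit + "CAG"
    for read_length in (4, 8, 12):
        frac = unique_kmer_fraction(toy_reference, read_length)
        print(f"read length {read_length}: {frac:.2f} uniquely placeable")
```

In this simplified picture, reads shorter than a duplicated unit that fall entirely inside it cannot be placed uniquely, whereas reads long enough to reach into unique flanking sequence can. This is consistent with the abstract's point that long reads are ultimately needed for the duplicated portions of the locus. The sketch ignores reverse-complement matches, sequencing errors, and diploid haplotypes.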