Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA.
J Immunol. 2013 Apr 15;190(8):3878-88. doi: 10.4049/jimmunol.1202547. Epub 2013 Mar 20.
Aberrant targeting of the enzyme activation-induced cytidine deaminase (AID) results in the accumulation of somatic mutations in ≈ 25% of expressed genes in germinal center B cells. Observations in Ung(-/-) Msh2(-/-) mice suggest that many other genes efficiently repair AID-induced lesions, so that up to 45% of genes may actually be targeted by AID. It is important to understand the mechanisms that recruit AID to certain genes, because this mistargeting represents an important risk for genome instability. We hypothesize that several mechanisms combine to target AID to each locus. To resolve which mechanisms affect AID targeting, we analyzed 7.3 Mb of sequence data, along with the regulatory context, from 83 genes in Ung(-/-) Msh2(-/-) mice to identify common properties of AID targets. This analysis identifies three transcription factor binding sites (E-box motifs, along with YY1 and C/EBP-β binding sites) that may work together to recruit AID. Based on previous knowledge and these newly discovered features, a classification tree model was built to predict genome-wide AID targeting. Using this predictive model, we were able to identify a set of 101 high-interest genes that are likely targets of AID.
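As an illustration of one sequence feature named above, the following minimal sketch scans a DNA sequence for E-box motifs using the canonical consensus CANNTG. It is not the authors' pipeline: the abstract does not say how motifs were detected, and the function names `find_eboxes` and `ebox_density`, the per-kilobase normalization, and the toy sequence are all hypothetical choices made for this example. YY1 and C/EBP-β sites are not covered here.

```python
import re

# Canonical E-box consensus CANNTG (N = any base).
# The lookahead lets overlapping sites be reported separately.
# The CANNTG pattern is its own reverse complement, so scanning
# one strand finds sites on both.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq):
    """Return (0-based start, matched site) for every E-box in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


def ebox_density(seq):
    """E-box sites per kilobase (an assumed, illustrative feature)."""
    if not seq:
        return 0.0
    return 1000.0 * len(find_eboxes(seq)) / len(seq)


if __name__ == "__main__":
    toy = "ttCAGCTGaaCACGTGcc"  # made-up sequence, not real data
    print(find_eboxes(toy))
    print(ebox_density(toy))
```

A per-gene value such as this density could serve as one input column to a classification tree, alongside whatever other features a model uses; the features and thresholds of the published model are not given in the abstract and are not reproduced here.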