Shi Wenqiang, Fornes Oriol, Mathelier Anthony, Wasserman Wyeth W
Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, Child & Family Research Institute, University of British Columbia, 950 28th Ave W, Vancouver, BC V5Z 4H4, Canada.
Bioinformatics Graduate Program, University of British Columbia, 2329 W Mall, Vancouver, BC V6T 1Z4, Canada.
Nucleic Acids Res. 2016 Dec 1;44(21):10106-10116. doi: 10.1093/nar/gkw691. Epub 2016 Aug 4.
Diseases and phenotypes caused by disrupted transcription factor (TF) binding are being identified, but progress is hampered by our limited capacity to predict such functional alterations. Improving predictions may depend on expanding the set of bona fide TF binding alterations. Allele-specific binding (ASB) events, where TFs preferentially bind to one of the two alleles at heterozygous sites, reveal the impact of sequence variation on TF binding. Here, we present what is, to our knowledge, the largest ASB compilation: 10 765 ASB events retrieved from 45 ENCODE ChIP-Seq data sets. Our analysis showed that ASB events were frequently associated with motif alterations of the ChIP'ed TF and potential partner TFs, allelic differences in DNase I hypersensitivity, and allelic differences in histone modifications. For TF dimers bound symmetrically to DNA, ASB data revealed that central positions of the TF binding motifs were disproportionately important for binding. Lastly, the impact of variation on TF binding was predicted by a classification model incorporating all the investigated features of ASB events. Classification models using only DNase I hypersensitivity and sequence data exhibited predictive accuracy approaching that of models with substantially more features. Taken together, the combination of ASB data and the classification model represents an important step toward elucidating regulatory variants across the human genome.
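To make the notion of an ASB event concrete, the sketch below tests whether ChIP-Seq reads at a heterozygous site favour one allele. The abstract does not say how ASB events were called, so everything here is an assumption for illustration: the exact two-sided binomial test, the function names `binomial_two_sided_p` and `is_asb_candidate`, the expected reference-allele fraction of 0.5, and the thresholds `alpha` and `min_reads`.

```python
from math import exp, lgamma, log


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Assumption: under no allele-specific binding, each read comes from the
    reference allele with probability p_ref (0 < p_ref < 1).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0

    def pmf(k: int) -> float:
        # Work in log space so large read depths do not overflow.
        log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_coeff + k * log(p_ref) + (n - k) * log(1.0 - p_ref))

    observed = pmf(ref_reads)
    # Sum the probability of every outcome at most as likely as the observed one.
    total = sum(p for p in (pmf(k) for k in range(n + 1)) if p <= observed * (1 + 1e-7))
    return min(1.0, total)


def is_asb_candidate(ref_reads: int, alt_reads: int,
                     alpha: float = 0.01, min_reads: int = 10) -> bool:
    """Flag a heterozygous site as a candidate ASB event.

    alpha and min_reads are hypothetical thresholds, not values from the paper.
    """
    if ref_reads + alt_reads < min_reads:
        return False  # too few reads to judge imbalance
    return binomial_two_sided_p(ref_reads, alt_reads) < alpha


if __name__ == "__main__":
    for ref, alt in [(30, 5), (6, 4), (5, 0)]:
        print(ref, alt, is_asb_candidate(ref, alt))
```

In a real analysis the raw p-values would also need multiple-testing correction across sites, and the expected reference fraction may deviate from 0.5 because of mapping bias or copy-number differences; neither is handled in this sketch.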