Integrating genetic variation with deep learning provides context for variants impacting transcription factor binding during embryogenesis.

Author information

Sigalova Olga M, Forneris Mattia, Stojanovska Frosina, Zhao Bingqing, Viales Rebecca R, Rabinowitz Adam, Hammal Fayrouz, Ballester Benoît, Zaugg Judith B, Furlong Eileen E M

Affiliations

European Molecular Biology Laboratory (EMBL), Genome Biology Unit, D-69117 Heidelberg, Germany.

European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, D-69117 Heidelberg, Germany.

Publication information

Genome Res. 2025 May 2;35(5):1138-1153. doi: 10.1101/gr.279652.124.

Abstract

Understanding how genetic variation impacts transcription factor (TF) binding remains a major challenge, limiting our ability to model disease-associated variants. Here, we used a highly controlled system of F1 crosses with extensive genetic diversity to profile allele-specific binding of four TFs at several time points during embryogenesis. Using a combined haplotype test, we found that 9%-18% of TF-bound regions are impacted by genetic variation, even for essential regulators. By extending WASP (a tool for allele-specific read mapping) to handle indels, we increased detection of allelically imbalanced peaks by 30%-50%. This fine-grained "mutagenesis" can reconstruct functionalized binding motifs for all factors. To prioritize causal variants, we trained a convolutional neural network (Basenji) to accurately predict binding from DNA sequence. The model can also predict measured allelic imbalance for strong-effect variants, providing a mechanistic interpretation of how a variant impacts binding. This reveals unexpected relationships between TFs, including potential cooperative pairs, and mechanisms of tissue-specific recruitment of the ubiquitous factor CTCF.
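
To make the notion of an allelically imbalanced peak concrete, here is a minimal sketch of an exact two-sided binomial test on the read counts at one heterozygous site. It is a deliberate simplification and not the combined haplotype test used in the study, which models more than raw allele counts. The function names and the 0.5 null ratio are assumptions chosen for illustration only.

```python
from math import comb


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of allele-specific reads that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no allele-specific reads at this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Hypothetical helper: sums the probability of every outcome that is
    no more likely than the observed one under the null ratio p_null.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence against the null
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so outcomes tied with the observed one are included.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    ref, alt = 42, 18
    print(f"ratio={allelic_ratio(ref, alt):.2f}, p={binomial_two_sided_p(ref, alt):.4g}")
```

In practice, a plain binomial test tends to be overconfident for sequencing data because read counts are overdispersed, which is one reason dedicated allele-specific methods exist.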

Abstract figure
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13f8/12047541/a8695fd60feb/1138f01.jpg
