Identification of transcription factor co-binding patterns with non-negative matrix factorization.

Affiliations

Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway.

European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany.

Publication information

Nucleic Acids Res. 2024 Oct 14;52(18):e85. doi: 10.1093/nar/gkae743.

Abstract

Transcription factor (TF) binding to DNA is critical to transcription regulation. Although the binding properties of numerous individual TFs are well documented, a more detailed understanding of how TFs interact cooperatively with DNA is needed. We present COBIND, a novel method based on non-negative matrix factorization (NMF) to identify TF co-binding patterns automatically. COBIND applies NMF to one-hot encoded regions flanking known TF binding sites (TFBSs) to pinpoint enriched DNA patterns at fixed distances. We applied COBIND to 5699 TFBS datasets from UniBind for 401 TFs in seven species. The method uncovered already established co-binding patterns as well as new co-binding configurations that have not yet been reported in the literature and that were inferred through motif similarity and protein-protein interaction knowledge. Our extensive analyses across species revealed that 67% of the TFs shared a co-binding motif with other TFs from the same structural family. The co-binding patterns captured by COBIND are likely functionally relevant, as they show higher evolutionary conservation than isolated TFBSs. Open chromatin data from matching human cell lines further supported the co-binding predictions. Finally, we used single-molecule footprinting data from mouse embryonic stem cells to confirm that the COBIND-predicted co-binding events associated with some TFs likely occurred on the same DNA molecules.
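
The core idea in the abstract, applying NMF to one-hot encoded flanking regions, can be illustrated with a small sketch. This is not the COBIND implementation: the toy sequences, the planted `GATA` pattern, its offset, the number of components and every function name below are assumptions made for illustration, and the factorization uses the generic Lee-Seung multiplicative updates written in plain Python.

```python
import random

BASES = "ACGT"

def one_hot(seq):
    """Flatten a DNA sequence into a 4*len(seq) vector of 0/1 values (A, C, G, T per position)."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def matmul(a, b):
    """Plain-Python product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))

def nmf(v, k, n_iter=100, seed=0, eps=1e-9):
    """Factorize V (n x m) into non-negative W (n x k) and H (k x m)
    using multiplicative updates; a generic NMF, assumed here for illustration."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h

def component_consensus(h_row):
    """Read the highest-weight base at each position of one NMF component."""
    return "".join(
        BASES[max(range(4), key=lambda b: h_row[p + b])]
        for p in range(0, len(h_row), 4)
    )

def make_toy_flanks(n=40, length=12, motif="GATA", offset=5, seed=1):
    """Hypothetical flanking regions: every other sequence carries the motif at a fixed offset."""
    rng = random.Random(seed)
    flanks = []
    for i in range(n):
        seq = [rng.choice(BASES) for _ in range(length)]
        if i % 2 == 0:
            seq[offset:offset + len(motif)] = motif
        flanks.append("".join(seq))
    return flanks

flanks = make_toy_flanks()
V = [one_hot(s) for s in flanks]   # rows: flanking regions, columns: position x base
W, H = nmf(V, k=2)                 # H rows are candidate position-specific patterns
for row in H:
    print(component_consensus(row))
```

Each row of `H` assigns a weight to every base at every flank position, so a pattern enriched at a fixed distance from the anchoring TFBS should show up as a run of high-weight positions in one component, while the matching column of `W` indicates which regions carry it. On real data one would more likely use an optimized NMF implementation and choose the number of components systematically; both choices here are placeholders.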


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c65d/11472169/6e35b51b7fab/gkae743figgra1.jpg
