
Interpretable protein-DNA interactions captured by structure-sequence optimization.

Authors

Zhang Yafan, Silvernail Irene, Lin Zhuyang, Lin Xingcheng

Affiliations

Bioinformatics Research Center, North Carolina State University, Raleigh, United States.

Department of Physics, North Carolina State University, Raleigh, United States.

Publication

eLife. 2025 Jul 17;14:RP105565. doi: 10.7554/eLife.105565.

DOI: 10.7554/eLife.105565
PMID: 40673435
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12270484/

Abstract

Sequence-specific DNA recognition underlies essential processes in gene regulation, yet methods for simultaneous predictions of genomic DNA recognition sites and their binding affinity remain lacking. Here, we present the Interpretable protein-DNA Energy Associative (IDEA) model, a residue-level, interpretable biophysical model capable of predicting binding sites and affinities of DNA-binding proteins. By fusing structures and sequences of known protein-DNA complexes into an optimized energy model, IDEA enables direct interpretation of physicochemical interactions among individual amino acids and nucleotides. We demonstrate that this energy model can accurately predict DNA recognition sites and their binding strengths across various protein families. Additionally, the IDEA model is integrated into a coarse-grained simulation framework that quantitatively captures the absolute protein-DNA binding free energies. Overall, IDEA provides an integrated computational platform that alleviates experimental costs and biases in assessing DNA recognition and can be utilized for mechanistic studies of various DNA-recognition processes.
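
To make the idea of a residue-level energy model concrete, here is a minimal Python sketch. It assumes the binding energy is a sum of pairwise amino acid-nucleotide terms taken over a fixed list of protein-DNA contacts. The 20x4 matrix `gamma`, the contact list, the toy sequences, and the function names are all hypothetical, and the numeric values are placeholders rather than parameters from IDEA. The abstract does not specify the model's functional form or training procedure, so this should not be read as the authors' implementation.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
NUCLEOTIDES = "ACGT"


def binding_energy(gamma, contacts, protein_seq, dna_seq):
    """Sum pairwise energies over (protein index, DNA index) contacts."""
    total = 0.0
    for i, j in contacts:
        aa = AMINO_ACIDS.index(protein_seq[i])
        nt = NUCLEOTIDES.index(dna_seq[j])
        total += float(gamma[aa, nt])
    return total


def rank_sites(gamma, contacts, protein_seq, genome, width):
    """Score every window of the given width; lowest energy comes first."""
    scores = []
    for start in range(len(genome) - width + 1):
        site = genome[start:start + width]
        energy = binding_energy(gamma, contacts, protein_seq, site)
        scores.append((energy, start, site))
    return sorted(scores)


# Placeholder energy matrix: zero everywhere except three made-up entries.
gamma = np.zeros((len(AMINO_ACIDS), len(NUCLEOTIDES)))
gamma[AMINO_ACIDS.index("R"), NUCLEOTIDES.index("G")] = -1.0
gamma[AMINO_ACIDS.index("K"), NUCLEOTIDES.index("G")] = -0.8
gamma[AMINO_ACIDS.index("N"), NUCLEOTIDES.index("A")] = -0.5

protein_seq = "RKN"                     # toy three-residue binding interface
contacts = [(0, 0), (1, 1), (2, 2)]     # residue i touches base j of the site
ranked = rank_sites(gamma, contacts, protein_seq, "TTGGATT", width=3)

best_energy, best_start, best_site = ranked[0]
print(best_site, best_start, round(best_energy, 3))
```

Because each entry of `gamma` corresponds to one amino acid-nucleotide pair, the matrix can be inspected directly. That is the sense in which a model of this shape is interpretable. Sliding the same scoring function along a longer sequence ranks candidate sites by energy.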



