Predicting DNA-binding specificities of eukaryotic transcription factors.

Affiliation

Center for Bioinformatics Tübingen (ZBIT), University of Tübingen, Tübingen, Germany.

Publication information

PLoS One. 2010 Nov 30;5(11):e13876. doi: 10.1371/journal.pone.0013876.

DOI: 10.1371/journal.pone.0013876
PMID: 21152420
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2994704/
Abstract

Today, annotated amino acid sequences of more and more transcription factors (TFs) are readily available. Quantitative information about their DNA-binding specificities, however, is hard to obtain. Position frequency matrices (PFMs), the most widely used models to represent binding specificities, have been experimentally characterized for only a small fraction of all TFs. Even for some of the most intensively studied eukaryotic organisms (i.e., human, rat and mouse), only roughly one-sixth of all proteins with an annotated DNA-binding domain have been characterized experimentally. Here, we present a new method based on support vector regression for predicting quantitative DNA-binding specificities of TFs in different eukaryotic species. This approach estimates a quantitative measure for the PFM similarity of two proteins, based on various features derived from their protein sequences. The method is trained and tested on a dataset containing 1,239 TFs with known DNA-binding specificity, and used to predict specific DNA target motifs for 645 TFs with high accuracy.
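To make the terms in the abstract concrete, the following minimal Python sketch builds a PFM from a set of aligned binding sites and scores how similar two PFMs are. The abstract does not say which similarity measure the authors use as the regression target, so the score below (one minus the mean normalized Euclidean distance between columns) is purely an illustrative assumption, and the function names `build_pfm` and `pfm_similarity` are hypothetical.

```python
from math import sqrt

BASES = "ACGT"


def build_pfm(sites):
    """Return a PFM as a list of columns, each holding the relative
    frequencies of A, C, G and T at that position.

    `sites` is a list of equal-length, already aligned DNA strings.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")
    pfm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pfm.append([column.count(base) / len(sites) for base in BASES])
    return pfm


def pfm_similarity(pfm_a, pfm_b):
    """Illustrative similarity in [0, 1] between two equal-length PFMs.

    Assumption: this is NOT necessarily the measure used in the paper.
    Each column pair contributes its Euclidean distance divided by
    sqrt(2), the largest distance two frequency columns can have.
    """
    if len(pfm_a) != len(pfm_b):
        raise ValueError("PFMs must have the same number of columns")
    total = 0.0
    for col_a, col_b in zip(pfm_a, pfm_b):
        dist = sqrt(sum((x - y) ** 2 for x, y in zip(col_a, col_b)))
        total += dist / sqrt(2)
    return 1.0 - total / len(pfm_a)


# Toy example with made-up binding sites for two hypothetical TFs.
pfm_tf1 = build_pfm(["ACGT", "ACGT", "ACGA", "ACGT"])
pfm_tf2 = build_pfm(["ACGT", "ACGT", "ACGT", "ACGT"])
print(round(pfm_similarity(pfm_tf1, pfm_tf2), 4))
```

In a setup like the one the abstract describes, a score of this kind would be the quantity a regression model learns to estimate from protein-sequence features for a pair of TFs; the feature extraction and the support vector regression itself are not shown here.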


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465b/2994704/0eca08b04bf9/pone.0013876.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465b/2994704/4ae143afe8c3/pone.0013876.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465b/2994704/f6bef85a9333/pone.0013876.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465b/2994704/cc193dca0e12/pone.0013876.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465b/2994704/50893d6fabc2/pone.0013876.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465b/2994704/7c6845d40fd3/pone.0013876.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/465b/2994704/81d7a162263a/pone.0013876.g007.jpg
