
A flexible integrative approach based on random forest improves prediction of transcription factor binding sites.

Affiliation

Department of Biomedical Molecular Biology, Ghent University, B-9052 Ghent, Belgium.

Publication information

Nucleic Acids Res. 2012 Aug;40(14):e106. doi: 10.1093/nar/gks283. Epub 2012 Apr 5.

DOI:10.1093/nar/gks283
PMID:22492513
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3413102/
Abstract

Transcription factor binding sites (TFBSs) are DNA sequences of 6-15 base pairs. Interaction of these TFBSs with transcription factors (TFs) is largely responsible for most spatiotemporal gene expression patterns. Here, we evaluate to what extent sequence-based prediction of TFBSs can be improved by taking into account the positional dependencies of nucleotides (NPDs) and the nucleotide sequence-dependent structure of DNA. We make use of the random forest algorithm to flexibly exploit both types of information. Results in this study show that both the structural method and the NPD method can be valuable for the prediction of TFBSs. Moreover, their predictive values seem to be complementary, even to the widely used position weight matrix (PWM) method. This led us to combine all three methods. Results obtained for five eukaryotic TFs with different DNA-binding domains show that our method improves classification accuracy for all five eukaryotic TFs compared with other approaches. Additionally, we contrast the results of seven smaller prokaryotic sets with high-quality data and show that with the use of high-quality data we can significantly improve prediction performance. Models developed in this study can be of great use for gaining insight into the mechanisms of TF binding.
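
The abstract describes combining three kinds of sequence-derived information (a PWM score, nucleotide positional dependencies, and sequence-dependent DNA structure) as input to a random forest. The following is a minimal sketch of how such a combined feature vector might be assembled, not the authors' implementation. All function names are hypothetical, the example sites are invented, adjacent-dinucleotide one-hot encoding is only one simple stand-in for positional dependencies, and `TOY_STEP_PROPERTY` holds placeholder numbers rather than measured structural parameters.

```python
import math
from collections import Counter

BASES = "ACGT"
DINUCLEOTIDES = [a + b for a in BASES for b in BASES]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from aligned, equal-length sites."""
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        pwm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def pwm_score(pwm, seq):
    """Sum of per-position log-odds scores (positions treated independently)."""
    return sum(column[base] for column, base in zip(pwm, seq))


def adjacent_pair_features(seq):
    """One-hot encode each adjacent dinucleotide (assumed stand-in for NPDs)."""
    features = []
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        features.extend(1.0 if step == d else 0.0 for d in DINUCLEOTIDES)
    return features


# Placeholder values only: NOT real structural parameters of DNA.
TOY_STEP_PROPERTY = {d: float(i % 5) for i, d in enumerate(DINUCLEOTIDES)}


def structural_profile(seq, table=TOY_STEP_PROPERTY):
    """Look up one property value per dinucleotide step along the sequence."""
    return [table[seq[i:i + 2]] for i in range(len(seq) - 1)]


def feature_vector(pwm, seq):
    """Concatenate the PWM score, pair features, and structural profile."""
    return [pwm_score(pwm, seq)] + adjacent_pair_features(seq) + structural_profile(seq)


# Invented example sites, used only to exercise the functions.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pwm = build_pwm(sites)
vec = feature_vector(pwm, "TGACTCA")
```

Vectors built this way for known sites and for background sequences could then be passed to any random forest implementation (for example scikit-learn's `RandomForestClassifier`), which is free to pick up interactions between the three feature groups.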


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74a0/3413102/bbd079ff8b76/gks283f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74a0/3413102/5a210e906d45/gks283f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74a0/3413102/64e03c339795/gks283f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74a0/3413102/2febb05388f3/gks283f3.jpg

Similar articles

1
A flexible integrative approach based on random forest improves prediction of transcription factor binding sites.
Nucleic Acids Res. 2012 Aug;40(14):e106. doi: 10.1093/nar/gks283. Epub 2012 Apr 5.
2
A novel method for improved accuracy of transcription factor binding site prediction.
Nucleic Acids Res. 2018 Jul 6;46(12):e72. doi: 10.1093/nar/gky237.
3
An efficient algorithm for improving structure-based prediction of transcription factor binding sites.
BMC Bioinformatics. 2017 Jul 17;18(1):342. doi: 10.1186/s12859-017-1755-0.
4
LASAGNA: a novel algorithm for transcription factor binding site alignment.
BMC Bioinformatics. 2013 Mar 24;14:108. doi: 10.1186/1471-2105-14-108.
5
An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs.
BMC Bioinformatics. 2010 Nov 8;11:551. doi: 10.1186/1471-2105-11-551.
6
Metamotifs--a generative model for building families of nucleotide position weight matrices.
BMC Bioinformatics. 2010 Jun 25;11:348. doi: 10.1186/1471-2105-11-348.
7
A general pairwise interaction model provides an accurate description of in vivo transcription factor binding sites.
PLoS One. 2014 Jun 13;9(6):e99015. doi: 10.1371/journal.pone.0099015. eCollection 2014.
8
Most of the tight positional conservation of transcription factor binding sites near the transcription start site reflects their co-localization within regulatory modules.
BMC Bioinformatics. 2016 Nov 21;17(1):479. doi: 10.1186/s12859-016-1354-5.
9
A DNA shape-based regulatory score improves position-weight matrix-based recognition of transcription factor binding sites.
Bioinformatics. 2015 Nov 1;31(21):3445-50. doi: 10.1093/bioinformatics/btv391. Epub 2015 Jun 30.
10
A structural-based strategy for recognition of transcription factor binding sites.
PLoS One. 2013;8(1):e52460. doi: 10.1371/journal.pone.0052460. Epub 2013 Jan 8.

Cited by

1
BERT-TFBS: a novel BERT-based model for predicting transcription factor binding sites by transfer learning.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae195.
2
Novel Grade Classification Tool with Lipidomics for Indica Rice Eating Quality Evaluation.
Foods. 2023 Feb 23;12(5):944. doi: 10.3390/foods12050944.
3
Construction of a Diagnostic Model for Lymph Node Metastasis of the Papillary Thyroid Carcinoma Using Preoperative Ultrasound Features and Imaging Omics.
J Healthc Eng. 2022 Feb 8;2022:1872412. doi: 10.1155/2022/1872412. eCollection 2022.
4
iT4SE-EP: Accurate Identification of Bacterial Type IV Secreted Effectors by Exploring Evolutionary Features from Two PSI-BLAST Profiles.
Molecules. 2021 Apr 24;26(9):2487. doi: 10.3390/molecules26092487.
5
DeepGRN: prediction of transcription factor binding site across cell-types using attention-based deep neural networks.
BMC Bioinformatics. 2021 Feb 1;22(1):38. doi: 10.1186/s12859-020-03952-1.
6
Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities.
Inf Fusion. 2019 Oct;50:71-91. doi: 10.1016/j.inffus.2018.09.012. Epub 2018 Sep 21.
7
Mapping specificity landscapes of RNA-protein interactions by high throughput sequencing.
Methods. 2017 Apr 15;118-119:111-118. doi: 10.1016/j.ymeth.2017.03.002. Epub 2017 Mar 2.
8
Quantitative modeling of gene expression using DNA shape features of binding sites.
Nucleic Acids Res. 2016 Jul 27;44(13):e120. doi: 10.1093/nar/gkw446. Epub 2016 Jun 1.
9
Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast.
PLoS Comput Biol. 2015 Aug 20;11(8):e1004418. doi: 10.1371/journal.pcbi.1004418. eCollection 2015 Aug.
10
Specificity and nonspecificity in RNA-protein interactions.
Nat Rev Mol Cell Biol. 2015 Sep;16(9):533-44. doi: 10.1038/nrm4032. Epub 2015 Aug 19.

References

1
Efficient double fragmentation ChIP-seq provides nucleotide resolution protein-DNA binding profiles.
PLoS One. 2010 Nov 30;5(11):e15092. doi: 10.1371/journal.pone.0015092.
2
Using sequence-specific chemical and structural properties of DNA to predict transcription factor binding sites.
PLoS Comput Biol. 2010 Nov 18;6(11):e1001007. doi: 10.1371/journal.pcbi.1001007.
3
Use of structural DNA properties for the prediction of transcription-factor binding sites in Escherichia coli.
Nucleic Acids Res. 2011 Jan;39(2):e6. doi: 10.1093/nar/gkq1071. Epub 2010 Nov 4.
4
Theoretical and empirical quality assessment of transcription factor-binding motifs.
Nucleic Acids Res. 2011 Feb;39(3):808-24. doi: 10.1093/nar/gkq710. Epub 2010 Oct 4.
5
Structure of the LexA-DNA complex and implications for SOS box measurement.
Nature. 2010 Aug 12;466(7308):883-6. doi: 10.1038/nature09200.
6
Genome-wide histone acetylation data improve prediction of mammalian transcription factor binding sites.
Bioinformatics. 2010 Sep 1;26(17):2071-5. doi: 10.1093/bioinformatics/btq405. Epub 2010 Jul 27.
7
Origins of specificity in protein-DNA recognition.
Annu Rev Biochem. 2010;79:233-69. doi: 10.1146/annurev-biochem-060408-091030.
8
Localized motif discovery in gene regulatory sequences.
Bioinformatics. 2010 May 1;26(9):1152-9. doi: 10.1093/bioinformatics/btq106. Epub 2010 Mar 11.
9
Integrating multiple evidence sources to predict transcription factor binding in the human genome.
Genome Res. 2010 Apr;20(4):526-36. doi: 10.1101/gr.096305.109. Epub 2010 Mar 10.
10
On the detection and refinement of transcription factor binding sites using ChIP-Seq data.
Nucleic Acids Res. 2010 Apr;38(7):2154-67. doi: 10.1093/nar/gkp1180. Epub 2010 Jan 6.