Transcription factor binding sites prediction based on modified nucleosomes.

Author information

Talebzadeh Mohammad, Zare-Mirakabad Fatemeh

Affiliation

Department of Mathematics and Computer Science, AmirKabir University of Technology, Tehran, Iran.

Publication information

PLoS One. 2014 Feb 21;9(2):e89226. doi: 10.1371/journal.pone.0089226. eCollection 2014.

DOI:10.1371/journal.pone.0089226

PMID:24586611

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3931712/

Abstract

Position weight matrices (PWMs) are commonly used in computational methods for transcription factor binding site (TFBS) prediction. Although these matrices predict actual binding sites more accurately than simple consensus sequences, they usually produce a large number of false-positive (FP) predictions and so are impoverished sources of information. Several studies have employed additional sources of information, such as sequence conservation or proximity to transcription start sites, to distinguish true binding regions from random ones. Recently, the spatial distribution of modified nucleosomes has been shown to be associated with different promoter architectures. These aligned patterns can facilitate DNA accessibility for transcription factors. We hypothesize that using data from these aligned and periodic patterns can improve the performance of binding region prediction. In this study, we propose two features, "modified nucleosomes neighboring" and "modified nucleosomes occupancy", to decrease FPs in binding site discovery. Based on these features, we designed a logistic regression classifier that estimates the probability that a region is a TFBS. Our model learned each feature from Sp1 binding sites on chromosome 1 and was tested on the other chromosomes in human CD4+ T cells. We investigated 21 histone modifications and found that only 8 of the 21 marks are strongly correlated with transcription factor binding regions. To show that these features are not specific to Sp1, we combined the logistic regression classifier with the PWM and created a new model to search for TFBSs on the genome. We tested the model using the transcription factors MAZ, PU.1 and ELF1 and compared the results to those obtained using only the PWM. The results show that our model predicts transcription factor binding regions more successfully. The relative simplicity of the model and its capability of integrating other features make it a superior method for TFBS prediction.
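The general idea of pairing a PWM scan with a logistic regression over chromatin features can be illustrated with a minimal sketch. This is not the authors' implementation: the toy count matrix, the two feature values per region (standing in for "modified nucleosomes neighboring" and "modified nucleosomes occupancy"), the cutoffs, and the rule that a candidate must pass both the PWM and the classifier are all assumptions made for illustration, since the abstract does not specify how the two components are combined.

```python
import numpy as np

BASES = "ACGT"

def pwm_from_counts(counts, background=0.25, pseudocount=1.0):
    """Turn a 4 x L base-count matrix into a log-odds PWM."""
    counts = np.asarray(counts, dtype=float) + pseudocount
    freqs = counts / counts.sum(axis=0, keepdims=True)
    return np.log2(freqs / background)

def pwm_score(pwm, site):
    """Sum the log-odds entry selected by each base of the site."""
    return float(sum(pwm[BASES.index(b), i] for i, b in enumerate(site)))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=5000):
    """Fit logistic regression by plain batch gradient descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w

def predict_proba(w, X):
    """Estimated probability that each region is a binding site."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    Xb = np.column_stack([np.ones(len(X)), X])
    return sigmoid(Xb @ w)

def combined_call(pwm, w, site, features, pwm_cutoff=0.0, prob_cutoff=0.5):
    """Assumed combination rule: accept only if PWM and classifier both agree."""
    p = predict_proba(w, features)[0]
    return bool(pwm_score(pwm, site) >= pwm_cutoff and p >= prob_cutoff)

# Synthetic 4-position motif counts (rows A, C, G, T); not a real Sp1 matrix.
pwm = pwm_from_counts([[8, 0, 0, 1],
                       [0, 9, 0, 1],
                       [1, 0, 10, 0],
                       [1, 1, 0, 8]])

# Synthetic training regions: [neighboring score, occupancy score], label 1 = bound.
X_train = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.15],
           [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
y_train = [1, 1, 1, 0, 0, 0]
w = fit_logistic(X_train, y_train)

# A motif match in a favorable chromatin context is kept; the same match
# in an unfavorable context is filtered out as a likely false positive.
kept = combined_call(pwm, w, "ACGT", [0.85, 0.10])
dropped = combined_call(pwm, w, "ACGT", [0.15, 0.85])
```

In this sketch the classifier acts purely as a filter on PWM hits, which is one simple way to reduce false positives; a weighted combination of the PWM score and the estimated probability would be an equally plausible design.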

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0762/3931712/bbc15be047ab/pone.0089226.g001.jpg
