Optimized position weight matrices in prediction of novel putative binding sites for transcription factors in the Drosophila melanogaster genome.

Affiliation

Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada.

Publication information

PLoS One. 2013 Aug 6;8(8):e68712. doi: 10.1371/journal.pone.0068712. Print 2013.

Abstract

Position weight matrices (PWMs) have become a tool of choice for identifying transcription factor binding sites in DNA sequences. DNA-binding proteins often show degeneracy in their binding requirements, so the overall binding specificity of many proteins is unknown and remains an active area of research. Although existing PWMs are more reliable predictors than consensus string matching, they generally produce a high number of false positive hits. Our previous study introduced a promising approach to PWM refinement in which known motifs are used to computationally mine putative binding sites directly from aligned promoter regions, using the composition of similar sites. In the present study, we extended this technique, originally tested on single examples of transcription factors (TFs), and showed its capability to optimize PWM performance in predicting new binding sites in the fruit fly genome. We propose refined PWMs in mono- and dinucleotide versions, computed in the same way for a large variety of Drosophila melanogaster transcription factors. Along with the addition of many auxiliary sites, the optimization includes variation of the PWM motif length, the location of the binding sites on the promoters, and the PWM score threshold. To assess the predictive performance of the refined PWMs, we compared them to conventional TRANSFAC and JASPAR sources. The results were verified through tests and a literature review. Overall, the refined PWMs, which contain putative sites derived from real promoter content and processed using optimized parameters, had better general accuracy than conventional PWMs.
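For readers unfamiliar with PWM scanning, the following is a minimal sketch of the general idea: a mononucleotide log-odds matrix is built from aligned sites and slid along a sequence, and windows scoring at or above a threshold are reported. It is not the refinement procedure of the study. The aligned sites, the pseudocount, the uniform background frequency, and the threshold are all hypothetical values chosen for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Build a log-odds PWM (one dict per motif position) from aligned sites.

    The pseudocount and uniform background are illustrative assumptions.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [s[i] for s in sites]
        pwm.append({
            b: math.log2((column.count(b) + pseudocount) / total / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (offset, window, score) for each window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(b not in BASES for b in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(col[b] for col, b in zip(pwm, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Hypothetical aligned sites, not taken from the paper or any database.
toy_sites = ["TATAAA", "TATATA", "TATAAG", "TTTAAA"]
toy_pwm = build_pwm(toy_sites)
hits = scan("GGCTATAAAGGC", toy_pwm, threshold=5.0)
for offset, window, score in hits:
    print(offset, window, round(score, 2))
```

Lowering the threshold in this sketch admits more windows, which illustrates why the score threshold and the motif length are among the parameters the abstract lists as targets of optimization.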

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/3735551/3f9cc54f01c8/pone.0068712.g001.jpg
