

A DNA shape-based regulatory score improves position-weight matrix-based recognition of transcription factor binding sites.

Authors

Yang Jichen, Ramsey Stephen A

Affiliations

Department of Biomedical Sciences and School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA.

Publication

Bioinformatics. 2015 Nov 1;31(21):3445-50. doi: 10.1093/bioinformatics/btv391. Epub 2015 Jun 30.

Abstract

MOTIVATION

The position-weight matrix (PWM) is a useful representation of a transcription factor binding site (TFBS) sequence pattern because the PWM can be estimated from a small number of representative TFBS sequences. However, because the PWM probability model assumes independence between individual nucleotide positions, the PWMs for some TFs poorly discriminate binding sites from non-binding-sites that have similar sequence content. Since the local three-dimensional DNA structure ('shape') is a determinant of TF binding specificity and since DNA shape has a significant sequence-dependence, we combined DNA shape-derived features into a TF-generalized regulatory score and tested whether the score could improve PWM-based discrimination of TFBS from non-binding-sites.
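As an illustration of the independence assumption described above, the following minimal sketch estimates a log-odds PWM from a handful of aligned sites and scores a candidate sequence as a sum of per-position terms. It is not the authors' implementation: the function names, the pseudocount of 1, the uniform background, and the example sites are all assumptions made for this sketch.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=None):
    """Estimate a log-odds PWM from aligned, equal-length binding-site sequences.

    The pseudocount and the uniform background are illustrative assumptions.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(width):
        # Count each base at this position, starting from the pseudocount.
        counts = {b: pseudocount for b in BASES}
        for s in sites:
            counts[s[pos]] += 1
        total = sum(counts.values())
        # Store log2(probability / background) for each base.
        pwm.append({b: math.log2((counts[b] / total) / background[b]) for b in BASES})
    return pwm


def pwm_score(pwm, seq):
    """Score a sequence as the sum of per-position log-odds.

    The plain sum is where positional independence enters: each position
    contributes the same amount regardless of its neighbors.
    """
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match PWM width")
    return sum(column[base] for column, base in zip(pwm, seq))


# Hypothetical aligned sites, invented for this sketch only.
example_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTAA"]
example_pwm = build_pwm(example_sites)
```

Because the score is additive over positions, two sequences with the same per-position base content receive the same score even if their dinucleotide context, and therefore their predicted local DNA shape, differs.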

RESULTS

We compared the accuracy of a traditional PWM model with that of a model combining the PWM with a DNA shape feature-based regulatory potential score, in detecting binding sites for 75 vertebrate transcription factors. The PWM+shape model was more accurate than the PWM-only model for 45% of the TFs tested, with no significant loss of accuracy for the remaining TFs.
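The abstract does not state how the two scores are combined or which accuracy measure was used. As a hedged sketch only, the comparison could be set up as below, assuming a weighted sum of the two scores and the area under the ROC curve (AUROC) as the measure. Both choices, and the names `combined_score` and `auroc`, are assumptions for illustration and not the published method.

```python
def combined_score(pwm_value, shape_value, weight=1.0):
    """Combine a PWM score with a shape-based score.

    A weighted sum is an assumption of this sketch; the source abstract
    does not specify the combination rule.
    """
    return pwm_value + weight * shape_value


def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen binding site (label 1)
    scores higher than a randomly chosen non-binding site (label 0),
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under these assumptions, one would compute `auroc` once on the PWM scores alone and once on the `combined_score` values for the same labeled sites, then compare the two numbers for each transcription factor.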

AVAILABILITY AND IMPLEMENTATION

The shape-based model is available as an open-source R package, archived on GitHub at https://github.com/ramseylab/regshape/.

CONTACT

stephen.ramsey@oregonstate.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


