De novo prediction of DNA-binding specificities for Cys2His2 zinc finger proteins.

Affiliation

Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ 08544, USA and Department of Computer Science, Princeton University, Princeton NJ 08544, USA.

Publication information

Nucleic Acids Res. 2014 Jan;42(1):97-108. doi: 10.1093/nar/gkt890. Epub 2013 Oct 3.

Abstract

Proteins with sequence-specific DNA binding function are important for a wide range of biological activities. De novo prediction of their DNA-binding specificities from sequence alone would be a great aid in inferring cellular networks. Here we introduce a method for predicting DNA-binding specificities for Cys2His2 zinc fingers (C2H2-ZFs), the largest family of DNA-binding proteins in metazoans. We develop a general approach, based on empirical calculations of pairwise amino acid-nucleotide interaction energies, for predicting position weight matrices (PWMs) representing DNA-binding specificities for C2H2-ZF proteins. We predict DNA-binding specificities on a per-finger basis and merge predictions for C2H2-ZF domains that are arrayed within sequences. We test our approach on a diverse set of natural C2H2-ZF proteins with known binding specificities and demonstrate that for >85% of the proteins, their predicted PWMs are accurate in 50% of their nucleotide positions. For proteins with several zinc finger isoforms, we show via case studies that this level of accuracy enables us to match isoforms with their known DNA-binding specificities. A web server for predicting a PWM given a protein containing C2H2-ZF domains is available online at http://zf.princeton.edu and can be used to aid in protein engineering applications and in genome-wide searches for transcription factor targets.
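
The per-finger prediction and merging steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplified canonical binding model, in which helix positions -1, 3, and 6 of each finger contact the 3', middle, and 5' bases of a triplet, and in which the N-terminal finger binds the 3'-most triplet. The energy table `TOY_ENERGY`, the Boltzmann conversion with `kT = 1.0`, and the function names are hypothetical; the energy values are hand-picked for illustration and are not taken from the paper.

```python
import math

BASES = "ACGT"

# Hypothetical pairwise amino acid-nucleotide energies (arbitrary units,
# lower = more favorable). Hand-picked for illustration only; unlisted
# pairs default to 0.0. These are NOT values from the paper.
TOY_ENERGY = {
    ("R", "G"): -2.0,
    ("E", "C"): -1.0,
    ("H", "G"): -1.0,
    ("T", "T"): -0.5,
}


def column_from_energies(energies, kT=1.0):
    """Turn four per-base energies into one PWM column via Boltzmann weights."""
    weights = [math.exp(-e / kT) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def finger_pwm(contacts, energy=TOY_ENERGY):
    """Predict three PWM columns (5'->3') for a single finger.

    `contacts` holds the residues at helix positions (-1, 3, 6).
    Assumed canonical model: position 6 reads the 5' base, position 3
    the middle base, and position -1 the 3' base.
    """
    minus1, pos3, pos6 = contacts
    return [
        column_from_energies([energy.get((aa, b), 0.0) for b in BASES])
        for aa in (pos6, pos3, minus1)
    ]


def merge_fingers(fingers):
    """Concatenate per-finger PWMs for an array listed N- to C-terminus.

    The array is assumed to bind antiparallel to the DNA, so the last
    finger contributes the 5'-most triplet.
    """
    pwm = []
    for contacts in reversed(fingers):
        pwm.extend(finger_pwm(contacts))
    return pwm


def consensus(pwm):
    """Most probable base in each column."""
    return "".join(BASES[col.index(max(col))] for col in pwm)


# Contact residues (-1, 3, 6) resembling the three fingers of Zif268.
fingers = [("R", "E", "R"), ("R", "H", "T"), ("R", "E", "R")]
pwm = merge_fingers(fingers)
print(consensus(pwm))  # GCGTGGGCG with the toy energies above
```

This sketch scores each base from a single contacting residue and simply concatenates triplets. It ignores position 2, cross-strand contacts, and any dependence between neighboring fingers, all of which a realistic predictor would need to address.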
