
PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context.

Affiliations

School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China.

Department of Computing, the Hong Kong Polytechnic University, Hong Kong.

Publication information

Sci Rep. 2016 Jun 10;6:27653. doi: 10.1038/srep27653.

DOI:10.1038/srep27653
PMID:27282833
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4901350/
Abstract

Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most existing computational approaches use only the sequence context of the target residue for prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove redundancies in the feature space. Finally, a predictor (PDNAsite) was developed by integrating a support vector machine (SVM) classifier with ensemble learning. Results on the PDNA-62 and PDNA-224 datasets demonstrate that features extracted from the spatial context provide more information than those from the sequence context, and that combining the two yields a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that interactions between neighboring binding sites are important for protein-DNA recognition and binding ability. A comparison between the proposed PDNAsite method and existing methods indicates that PDNAsite outperforms most of them and is a useful tool for DNA-binding site identification. A web server for the predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely available to the biological research community.
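
The idea of pairing a sequence context with a spatial context for each target residue can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the one-hot encoding, the window size, the use of k nearest residues as the spatial context, and the availability of per-residue 3D coordinates. The LSA and SVM ensemble steps named in the abstract are not shown.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def sequence_context(seq, i, half_window=2, pad="X"):
    """Residues within half_window positions of index i along the chain.

    The chain ends are padded with a placeholder so every window has
    the same length (hypothetical choice, not from the paper).
    """
    padded = pad * half_window + seq + pad * half_window
    return padded[i:i + 2 * half_window + 1]


def spatial_context(coords, i, k=2):
    """Indices of the k residues closest to residue i in 3D space.

    coords is a list of (x, y, z) tuples, one per residue (assumed input).
    """
    others = [j for j in range(len(coords)) if j != i]
    others.sort(key=lambda j: math.dist(coords[i], coords[j]))
    return others[:k]


def one_hot(residue):
    """20-dimensional indicator vector; the pad symbol maps to all zeros."""
    return [1.0 if residue == a else 0.0 for a in ALPHABET]


def feature_vector(seq, coords, i, half_window=2, k=2):
    """Concatenate encodings of the sequence and spatial neighbours of residue i."""
    feats = []
    for r in sequence_context(seq, i, half_window):
        feats.extend(one_hot(r))
    for j in spatial_context(coords, i, k):
        feats.extend(one_hot(seq[j]))
    return feats


# Toy hairpin: residue 5 is far from residue 0 in sequence but close in space.
seq = "MKRLSA"
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (7.6, 3.0, 0.0), (3.8, 3.0, 0.0), (0.0, 3.0, 0.0)]

print(sequence_context(seq, 0))            # window along the chain
print(spatial_context(coords, 0))          # nearest residues in space
print(len(feature_vector(seq, coords, 0))) # (5 + 2) * 20 features
```

In the toy example the spatial context of the first residue includes the last residue, which a sequence window alone would never reach. A feature matrix built this way could then be passed to a dimensionality-reduction step and a classifier.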

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/273bf1960f66/srep27653-f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/ee05e924924b/srep27653-f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/82c448fbb628/srep27653-f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/71074f939146/srep27653-f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/ff87c40e7f78/srep27653-f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/ce46cd268e5b/srep27653-f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/178236eac85a/srep27653-f7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/d4c2406a2900/srep27653-f8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f04/4901350/b870aaea68a5/srep27653-f9.jpg

Similar articles

1
PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context.
Sci Rep. 2016 Jun 10;6:27653. doi: 10.1038/srep27653.
2
EL_PSSM-RT: DNA-binding residue prediction by integrating ensemble learning with PSSM Relation Transformation.
BMC Bioinformatics. 2017 Aug 29;18(1):379. doi: 10.1186/s12859-017-1792-8.
3
Identifying DNA-binding proteins by combining support vector machine and PSSM distance transformation.
BMC Syst Biol. 2015;9 Suppl 1(Suppl 1):S10. doi: 10.1186/1752-0509-9-S1-S10. Epub 2015 Feb 6.
4
Sequence-based prediction of DNA-binding residues in proteins with conservation and correlation information.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Nov-Dec;9(6):1766-75. doi: 10.1109/TCBB.2012.106.
5
BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences.
Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W243-8. doi: 10.1093/nar/gkl298.
6
Using evolutionary and structural information to predict DNA-binding sites on DNA-binding proteins.
Proteins. 2006 Jul 1;64(1):19-27. doi: 10.1002/prot.20977.
7
PseDNA-Pro: DNA-Binding Protein Identification by Combining Chou's PseAAC and Physicochemical Distance Transformation.
Mol Inform. 2015 Jan;34(1):8-17. doi: 10.1002/minf.201400025. Epub 2014 Sep 26.
8
DNA binding protein identification by combining pseudo amino acid composition and profile-based protein representation.
Sci Rep. 2015 Oct 20;5:15479. doi: 10.1038/srep15479.
9
DP-Bind: a web server for sequence-based prediction of DNA-binding residues in DNA-binding proteins.
Bioinformatics. 2007 Mar 1;23(5):634-6. doi: 10.1093/bioinformatics/btl672. Epub 2007 Jan 19.
10
Prediction of DNA-binding residues from protein sequence information using random forests.
BMC Genomics. 2009 Jul 7;10 Suppl 1(Suppl 1):S1. doi: 10.1186/1471-2164-10-S1-S1.

Cited by

1
Advances in Language-Model-Informed Protein-Nucleic Acid Binding Site Prediction.
Methods Mol Biol. 2025;2941:139-151. doi: 10.1007/978-1-0716-4623-6_9.
2
A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae162.
3
Multiple protein-DNA interfaces unravelled by evolutionary information, physico-chemical and geometrical properties.
PLoS Comput Biol. 2020 Feb 3;16(2):e1007624. doi: 10.1371/journal.pcbi.1007624. eCollection 2020 Feb.
4
Cross-Cell-Type Prediction of TF-Binding Site by Integrating Convolutional Neural Network and Adversarial Network.
Int J Mol Sci. 2019 Jul 12;20(14):3425. doi: 10.3390/ijms20143425.
5
CNNH_PSS: protein 8-class secondary structure prediction by convolutional neural network with highway.
BMC Bioinformatics. 2018 May 8;19(Suppl 4):60. doi: 10.1186/s12859-018-2067-8.
6
3DCONS-DB: A Database of Position-Specific Scoring Matrices in Protein Structures.
Molecules. 2017 Dec 15;22(12):2230. doi: 10.3390/molecules22122230.

References

1
Identifying DNA-binding proteins by combining support vector machine and PSSM distance transformation.
BMC Syst Biol. 2015;9 Suppl 1(Suppl 1):S10. doi: 10.1186/1752-0509-9-S1-S10. Epub 2015 Feb 6.
2
Identification of DNA-binding proteins by incorporating evolutionary information into pseudo amino acid composition via the top-n-gram approach.
J Biomol Struct Dyn. 2015;33(8):1720-30. doi: 10.1080/07391102.2014.968624. Epub 2014 Oct 28.
3
enDNA-Prot: identification of DNA-binding proteins by applying ensemble learning.
Biomed Res Int. 2014;2014:294279. doi: 10.1155/2014/294279. Epub 2014 May 26.
4
DNABind: a hybrid algorithm for structure-based prediction of DNA-binding residues by combining machine learning- and template-based approaches.
Proteins. 2013 Nov;81(11):1885-99. doi: 10.1002/prot.24330. Epub 2013 Aug 16.
5
PreDNA: accurate prediction of DNA-binding sites in proteins by integrating sequence and geometric structure information.
Bioinformatics. 2013 Mar 15;29(6):678-85. doi: 10.1093/bioinformatics/btt029. Epub 2013 Jan 17.
6
Sequence-based prediction of DNA-binding residues in proteins with conservation and correlation information.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Nov-Dec;9(6):1766-75. doi: 10.1109/TCBB.2012.106.
7
DR_bind: a web server for predicting DNA-binding residues from the protein structure based on electrostatics, evolution and geometry.
Nucleic Acids Res. 2012 Jul;40(Web Server issue):W249-56. doi: 10.1093/nar/gks481. Epub 2012 May 31.
8
Prediction of lysine ubiquitylation with ensemble classifier and feature selection.
Int J Mol Sci. 2011;12(12):8347-61. doi: 10.3390/ijms12128347. Epub 2011 Nov 28.
9
Computational prediction of heme-binding residues by exploiting residue interaction network.
PLoS One. 2011;6(10):e25560. doi: 10.1371/journal.pone.0025560. Epub 2011 Oct 3.
10
BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features.
BMC Syst Biol. 2010 May 28;4 Suppl 1(Suppl 1):S3. doi: 10.1186/1752-0509-4-S1-S3.