Identification of DNA-binding proteins by incorporating evolutionary information into pseudo amino acid composition via the top-n-gram approach.

Affiliation

School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus Shenzhen University Town, Xili, Shenzhen 518055, Guangdong, China.

Publication information

J Biomol Struct Dyn. 2015;33(8):1720-30. doi: 10.1080/07391102.2014.968624. Epub 2014 Oct 28.

DOI: 10.1080/07391102.2014.968624
PMID: 25252709
Abstract

DNA-binding proteins are crucial for various cellular processes and hence have become an important target for both basic research and drug development. With the avalanche of protein sequences generated in the postgenomic age, it is highly desired to establish an automated method for rapidly and accurately identifying DNA-binding proteins based on their sequence information alone. Owing to the fact that all biological species have developed beginning from a very limited number of ancestral species, it is important to take into account the evolutionary information in developing such a high-throughput tool. In view of this, a new predictor was proposed by incorporating the evolutionary information into the general form of pseudo amino acid composition via the top-n-gram approach. It was observed by comparing the new predictor with the existing methods via both jackknife test and independent data-set test that the new predictor outperformed its counterparts. It is anticipated that the new predictor may become a useful vehicle for identifying DNA-binding proteins. It has not escaped our notice that the novel approach to extract evolutionary information into the formulation of statistical samples can be used to identify many other protein attributes as well.
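The abstract names the top-n-gram approach without spelling it out. The sketch below is a minimal illustration, not the authors' implementation. It assumes the evolutionary information is available as a per-position amino acid frequency profile (one row of 20 frequencies per residue), and that a top-n-gram is the n most frequent amino acids at a position, written in order of decreasing frequency. The names `top_n_grams`, `top1_composition`, `row`, and `toy_profile` are hypothetical, the profile values are made up for illustration, and only the 20 composition components are computed; the sequence-order correlation terms of pseudo amino acid composition are omitted.

```python
from collections import Counter

# Assumed column order of the frequency profile (the 20 standard amino acids).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def top_n_grams(profile, n=1):
    """Return one top-n-gram per profile position.

    Each gram joins the n most frequent amino acids at that position,
    most frequent first. Ties keep the AMINO_ACIDS order (an assumption).
    """
    grams = []
    for freqs in profile:
        ranked = sorted(range(len(AMINO_ACIDS)), key=lambda i: -freqs[i])
        grams.append("".join(AMINO_ACIDS[i] for i in ranked[:n]))
    return grams


def top1_composition(profile):
    """Amino acid composition of the sequence formed by the top-1-grams."""
    seq = "".join(top_n_grams(profile, n=1))
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def row(**freqs):
    """Hypothetical helper: build one profile row from named frequencies."""
    return [freqs.get(a, 0.0) for a in AMINO_ACIDS]


# Toy three-residue profile with invented frequencies.
toy_profile = [
    row(K=0.6, R=0.3, A=0.1),
    row(R=0.5, K=0.4, G=0.1),
    row(K=0.7, G=0.3),
]

print(top_n_grams(toy_profile, n=1))  # ['K', 'R', 'K']
print(top_n_grams(toy_profile, n=2))  # ['KR', 'RK', 'KG']
print(top1_composition(toy_profile))
```

Under these assumptions, the top-1-gram sequence replaces the original residues with the evolutionarily most frequent ones, so any sequence-based descriptor computed on it carries profile information.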

Similar articles

1
Identification of DNA-binding proteins by incorporating evolutionary information into pseudo amino acid composition via the top-n-gram approach.
J Biomol Struct Dyn. 2015;33(8):1720-30. doi: 10.1080/07391102.2014.968624. Epub 2014 Oct 28.
2
Using the concept of Chou's pseudo amino acid composition to predict protein subcellular localization: an approach by incorporating evolutionary information and von Neumann entropies.
Amino Acids. 2008 May;34(4):565-72. doi: 10.1007/s00726-007-0010-9. Epub 2007 Dec 11.
3
iDNA-Prot|dis: identifying DNA-binding proteins by incorporating amino acid distance-pairs and reduced alphabet profile into the general pseudo amino acid composition.
PLoS One. 2014 Sep 3;9(9):e106691. doi: 10.1371/journal.pone.0106691. eCollection 2014.
4
Discriminating bioluminescent proteins by incorporating average chemical shift and evolutionary information into the general form of Chou's pseudo amino acid composition.
J Theor Biol. 2013 Oct 7;334:45-51. doi: 10.1016/j.jtbi.2013.06.003. Epub 2013 Jun 13.
5
Predicting DNA-binding proteins: approached from Chou's pseudo amino acid composition and other specific sequence features.
Amino Acids. 2008 Jan;34(1):103-9. doi: 10.1007/s00726-007-0568-2. Epub 2007 Jul 12.
6
iUbiq-Lys: prediction of lysine ubiquitination sites in proteins by extracting sequence evolution information via a gray system model.
J Biomol Struct Dyn. 2015;33(8):1731-42. doi: 10.1080/07391102.2014.968875. Epub 2014 Nov 6.
7
Identification of thermophilic proteins by incorporating evolutionary and acid dissociation information into Chou's general pseudo amino acid composition.
J Theor Biol. 2016 Oct 21;407:138-142. doi: 10.1016/j.jtbi.2016.07.010. Epub 2016 Jul 7.
8
Identify DNA-binding proteins with optimal Chou's amino acid composition.
Protein Pept Lett. 2012 Apr;19(4):398-405. doi: 10.2174/092986612799789404.
9
iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition.
Anal Biochem. 2013 Nov 1;442(1):118-25. doi: 10.1016/j.ab.2013.05.024. Epub 2013 Jun 10.
10
iPhos-PseEvo: Identifying Human Phosphorylated Proteins by Incorporating Evolutionary Information into General PseAAC via Grey System Theory.
Mol Inform. 2017 May;36(5-6). doi: 10.1002/minf.201600010. Epub 2016 May 12.

Cited by

1
Deep-WET: a deep learning-based approach for predicting DNA-binding proteins using word embedding techniques with weighted features.
Sci Rep. 2024 Feb 5;14(1):2961. doi: 10.1038/s41598-024-52653-9.
2
NRPreTo: A Machine Learning-Based Nuclear Receptor and Subfamily Prediction Tool.
ACS Omega. 2023 May 30;8(23):20379-20388. doi: 10.1021/acsomega.3c00286. eCollection 2023 Jun 13.
3
PredCRG: A computational method for recognition of plant circadian genes by employing support vector machine with Laplace kernel.
Plant Methods. 2021 Apr 26;17(1):46. doi: 10.1186/s13007-021-00744-3.
4
Use Chou's 5-Step Rule to Predict DNA-Binding Proteins with Evolutionary Information.
Biomed Res Int. 2020 Jul 27;2020:6984045. doi: 10.1155/2020/6984045. eCollection 2020.
5
The Helitron family classification using SVM based on Fourier transform features applied on an unbalanced dataset.
Med Biol Eng Comput. 2019 Oct;57(10):2289-2304. doi: 10.1007/s11517-019-02027-5. Epub 2019 Aug 17.
6
DP-BINDER: machine learning model for prediction of DNA-binding proteins by fusing evolutionary and physicochemical information.
J Comput Aided Mol Des. 2019 Jul;33(7):645-658. doi: 10.1007/s10822-019-00207-x. Epub 2019 May 23.
7
A Model Stacking Framework for Identifying DNA Binding Proteins by Orchestrating Multi-View Features and Classifiers.
Genes (Basel). 2018 Aug 1;9(8):394. doi: 10.3390/genes9080394.
8
CNNH_PSS: protein 8-class secondary structure prediction by convolutional neural network with highway.
BMC Bioinformatics. 2018 May 8;19(Suppl 4):60. doi: 10.1186/s12859-018-2067-8.
9
HMMBinder: DNA-Binding Protein Prediction Using HMM Profile Based Features.
Biomed Res Int. 2017;2017:4590609. doi: 10.1155/2017/4590609. Epub 2017 Nov 14.
10
iDNAProt-ES: Identification of DNA-binding Proteins Using Evolutionary and Structural Features.
Sci Rep. 2017 Nov 2;7(1):14938. doi: 10.1038/s41598-017-14945-1.