Predicting protein solubility by the general form of Chou's pseudo amino acid composition: approached from chaos game representation and fractal dimension.

Authors

Niu Xiao-Hui, Hu Xue-Hai, Shi Feng, Xia Jing-Bo

Affiliation

College of Science, Huazhong Agricultural University, Wuhan, P.R. China.

Publication information

Protein Pept Lett. 2012 Sep;19(9):940-8. doi: 10.2174/092986612802084492.

DOI:10.2174/092986612802084492
PMID:22486614
Abstract

Obtaining soluble proteins in sufficient concentrations is a major obstacle in many experimental studies. Predicting whether targets in large-scale proteomics projects are likely to be soluble is a significant problem that has not yet been adequately resolved. Chaos game representation (CGR) can expose patterns hidden in protein sequences and can visually reveal previously unknown structure. Fractal dimensions are useful tools for measuring the size of complex, highly irregular geometric objects. In this paper, we convert each protein sequence into a high-dimensional vector using the CGR algorithm and fractal dimension, and then predict protein solubility from these fractal features together with Chou's pseudo amino acid composition features, using a support vector machine (SVM). We extract and study six groups of features computed directly from the primary sequence, and evaluate each group by 10-fold cross-validation. In the comparison, the 445-dimensional feature group gives the best results, with an average accuracy of 0.8741 and an average Matthews correlation coefficient (MCC) of 0.7358. The resulting predictor is also compared with existing methods and shows significant improvement.
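
The abstract does not specify how the CGR is constructed or which fractal dimension is used, so the following is only a minimal sketch under stated assumptions: the 20 standard amino acids are placed on the vertices of a regular 20-gon, the contraction ratio is 0.5, and the fractal dimension is estimated by box counting. The names `cgr_points`, `box_counting_dimension`, and `mcc` are hypothetical helpers for illustration and are not taken from the paper. The last function shows the standard definition of the MCC metric reported in the abstract.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: one vertex of a regular 20-gon (on the unit circle) per amino acid.
VERTICES = {
    aa: (math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20))
    for i, aa in enumerate(AMINO_ACIDS)
}


def cgr_points(sequence, ratio=0.5):
    """Chaos game: start at the origin and, for each residue, move a
    fraction `ratio` of the way toward that residue's vertex."""
    x, y = 0.0, 0.0
    points = []
    for aa in sequence.upper():
        if aa not in VERTICES:
            continue  # skip non-standard residue codes
        vx, vy = VERTICES[aa]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points


def box_counting_dimension(points, grid_sizes=(2, 4, 8, 16, 32)):
    """Estimate the box-counting dimension as the least-squares slope of
    log N(k) against log k, where N(k) is the number of occupied cells
    in a k-by-k grid covering the square [-1, 1] x [-1, 1]."""
    if not points:
        raise ValueError("need at least one CGR point")
    log_k, log_n = [], []
    for k in grid_sizes:
        occupied = {
            (min(int((px + 1) / 2 * k), k - 1),
             min(int((py + 1) / 2 * k), k - 1))
            for px, py in points
        }
        log_k.append(math.log(k))
        log_n.append(math.log(len(occupied)))
    mean_k = sum(log_k) / len(log_k)
    mean_n = sum(log_n) / len(log_n)
    num = sum((a - mean_k) * (b - mean_n) for a, b in zip(log_k, log_n))
    den = sum((a - mean_k) ** 2 for a in log_k)
    return num / den


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0
```

In a pipeline like the one the abstract describes, values such as the box-counting estimate would be concatenated with pseudo amino acid composition features before being passed to an SVM; the exact composition of the 445-dimensional vector is not given here and is not reproduced by this sketch.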

Similar articles

1
Predicting protein solubility by the general form of Chou's pseudo amino acid composition: approached from chaos game representation and fractal dimension.
Protein Pept Lett. 2012 Sep;19(9):940-8. doi: 10.2174/092986612802084492.
2
A new hybrid fractal algorithm for predicting thermophilic nucleotide sequences.
J Theor Biol. 2012 Jan 21;293:74-81. doi: 10.1016/j.jtbi.2011.09.028. Epub 2011 Oct 10.
3
Predicting DNA binding proteins using support vector machine with hybrid fractal features.
J Theor Biol. 2014 Feb 21;343:186-92. doi: 10.1016/j.jtbi.2013.10.009. Epub 2013 Nov 1.
4
Predicting thermophilic proteins with pseudo amino acid composition: approached from chaos game representation and principal component analysis.
Protein Pept Lett. 2011 Dec;18(12):1244-50. doi: 10.2174/092986611797642661.
5
A novel fractal approach for predicting G-protein-coupled receptors and their subfamilies with support vector machines.
Biomed Mater Eng. 2015;26 Suppl 1:S1829-36. doi: 10.3233/BME-151485.
6
Using the concept of Chou's pseudo amino acid composition to predict protein solubility: an approach with entropies in information theory.
J Theor Biol. 2013 Sep 7;332:211-7. doi: 10.1016/j.jtbi.2013.03.010. Epub 2013 Mar 21.
7
Prediction of Protein Subcellular Localization Based on Fusion of Multi-view Features.
Molecules. 2019 Mar 6;24(5):919. doi: 10.3390/molecules24050919.
8
Predicting protein solubility with a hybrid approach by pseudo amino acid composition.
Protein Pept Lett. 2010 Dec;17(12):1466-72. doi: 10.2174/0929866511009011466.
9
Accurate prediction of nuclear receptors with conjoint triad feature.
BMC Bioinformatics. 2015 Dec 3;16:402. doi: 10.1186/s12859-015-0828-1.
10
Structural characterization of chaos game fractals using small-angle scattering analysis.
PLoS One. 2017 Jul 13;12(7):e0181385. doi: 10.1371/journal.pone.0181385. eCollection 2017.

Cited by

1
FEPS: A Tool for Feature Extraction from Protein Sequence.
Methods Mol Biol. 2022;2499:65-104. doi: 10.1007/978-1-0716-2317-6_3.
2
Some illuminating remarks on molecular genetics and genomics as well as drug development.
Mol Genet Genomics. 2020 Mar;295(2):261-274. doi: 10.1007/s00438-019-01634-z. Epub 2020 Jan 1.
3
DCGR: feature extractions from protein sequences based on CGR via remodeling multiple information.
BMC Bioinformatics. 2019 Jun 20;20(1):351. doi: 10.1186/s12859-019-2943-x.
4
Understanding the undelaying mechanism of HA-subtyping in the level of physic-chemical characteristics of protein.
PLoS One. 2014 May 8;9(5):e96984. doi: 10.1371/journal.pone.0096984. eCollection 2014.
5
PseAAC-General: fast building various modes of general form of Chou's pseudo-amino acid composition for large-scale protein datasets.
Int J Mol Sci. 2014 Feb 26;15(3):3495-506. doi: 10.3390/ijms15033495.
6
A multilabel model based on Chou's pseudo-amino acid composition for identifying membrane proteins with both single and multiple functional types.
J Membr Biol. 2013 Apr;246(4):327-34. doi: 10.1007/s00232-013-9536-9. Epub 2013 Apr 2.
7
iSNO-PseAAC: predict cysteine S-nitrosylation sites in proteins by incorporating position specific amino acid propensity into pseudo amino acid composition.
PLoS One. 2013;8(2):e55844. doi: 10.1371/journal.pone.0055844. Epub 2013 Feb 7.