Some remarks on protein attribute prediction and pseudo amino acid composition.

Affiliation

Gordon Life Science Institute, 13784 Torrey Del Mar Drive, San Diego, CA 92130, USA.

Publication information

J Theor Biol. 2011 Mar 21;273(1):236-47. doi: 10.1016/j.jtbi.2010.12.024. Epub 2010 Dec 17.

DOI:10.1016/j.jtbi.2010.12.024
PMID:21168420
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7125570/

Abstract

With the accomplishment of human genome sequencing, the number of sequence-known proteins has increased explosively. In contrast, the pace is much slower in determining their biological attributes. As a consequence, the gap between sequence-known proteins and attribute-known proteins has become increasingly large. The unbalanced situation, which has critically limited our ability to timely utilize the newly discovered proteins for basic research and drug development, has called for developing computational methods or high-throughput automated tools for fast and reliably identifying various attributes of uncharacterized proteins based on their sequence information alone. Actually, during the last two decades or so, many methods in this regard have been established in hope to bridge such a gap. In the course of developing these methods, the following things were often needed to consider: (1) benchmark dataset construction, (2) protein sample formulation, (3) operating algorithm (or engine), (4) anticipated accuracy, and (5) web-server establishment. In this review, we are to discuss each of the five procedures, with a special focus on the introduction of pseudo amino acid composition (PseAAC), its different modes and applications as well as its recent development, particularly in how to use the general formulation of PseAAC to reflect the core and essential features that are deeply hidden in complicated protein sequences.
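
As a rough illustration of the "protein sample formulation" step, the sketch below builds one commonly described form of a PseAAC vector: the 20 amino acid frequencies followed by λ sequence-order correlation factors, all divided by a common denominator so the components sum to 1. Several things here are assumptions rather than details taken from the abstract: the function name `pseaac`, the defaults `lam=3` and `w=0.05`, and the use of a single property scale (Kyte-Doolittle hydropathy) where published formulations typically combine several physicochemical properties.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: Kyte-Doolittle hydropathy is used as the only property scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale property values to zero mean and unit standard deviation."""
    mean = sum(scale.values()) / len(scale)
    std = (sum((v - mean) ** 2 for v in scale.values()) / len(scale)) ** 0.5
    return {aa: (v - mean) / std for aa, v in scale.items()}


H = _standardize(HYDROPATHY)


def pseaac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional PseAAC-style vector for a sequence.

    lam is the number of sequence-order correlation tiers and w is the
    weight given to them; both defaults are illustrative assumptions.
    """
    seq = seq.upper()
    if any(aa not in H for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if lam >= len(seq):
        raise ValueError("lam must be smaller than the sequence length")

    # Conventional amino acid composition: 20 normalized frequencies.
    counts = Counter(seq)
    freqs = [counts[aa] / len(seq) for aa in AMINO_ACIDS]

    # Tier-k correlation factor: mean squared property difference between
    # residues that are k positions apart along the chain.
    thetas = []
    for k in range(1, lam + 1):
        pairs = len(seq) - k
        total = sum((H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(pairs))
        thetas.append(total / pairs)

    # Shared denominator, so that all components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    vector = pseaac("MKTAYIAKQRQISFVKSHFSRQ", lam=3)
    print(len(vector), round(sum(vector), 6))
```

The first 20 components carry the plain composition and the last `lam` components carry partial sequence-order information, which is what the "pseudo" part refers to. A real predictor would choose `lam`, `w`, and the property scales by validation on its benchmark dataset.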

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec6f/7125570/069db89523a4/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec6f/7125570/b930f884046e/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec6f/7125570/6fd4a4643841/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec6f/7125570/0ee486e19a24/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec6f/7125570/049a53353b0a/gr5.jpg

Similar articles

1. Some remarks on protein attribute prediction and pseudo amino acid composition.
   J Theor Biol. 2011 Mar 21;273(1):236-47. doi: 10.1016/j.jtbi.2010.12.024. Epub 2010 Dec 17.
2. iSNO-PseAAC: predict cysteine S-nitrosylation sites in proteins by incorporating position specific amino acid propensity into pseudo amino acid composition.
   PLoS One. 2013;8(2):e55844. doi: 10.1371/journal.pone.0055844. Epub 2013 Feb 7.
3. PseAAC-Builder: a cross-platform stand-alone program for generating various special Chou's pseudo-amino acid compositions.
   Anal Biochem. 2012 Jun 15;425(2):117-9. doi: 10.1016/j.ab.2012.03.015. Epub 2012 Mar 27.
4. iHyd-PseAAC: predicting hydroxyproline and hydroxylysine in proteins by incorporating dipeptide position-specific propensity into pseudo amino acid composition.
   Int J Mol Sci. 2014 May 5;15(5):7594-610. doi: 10.3390/ijms15057594.
5. iDNA-Prot|dis: identifying DNA-binding proteins by incorporating amino acid distance-pairs and reduced alphabet profile into the general pseudo amino acid composition.
   PLoS One. 2014 Sep 3;9(9):e106691. doi: 10.1371/journal.pone.0106691. eCollection 2014.
6. Identification of protein-protein binding sites by incorporating the physicochemical properties and stationary wavelet transforms into pseudo amino acid composition.
   J Biomol Struct Dyn. 2016 Sep;34(9):1946-61. doi: 10.1080/07391102.2015.1095116. Epub 2015 Oct 29.
7. GPCR-2L: predicting G protein-coupled receptors and their types by hybridizing two different modes of pseudo amino acid compositions.
   Mol Biosyst. 2011 Mar;7(3):911-9. doi: 10.1039/c0mb00170h. Epub 2010 Dec 23.
8. iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition.
   Anal Biochem. 2013 Nov 1;442(1):118-25. doi: 10.1016/j.ab.2013.05.024. Epub 2013 Jun 10.
9. PseAAC-General: fast building various modes of general form of Chou's pseudo-amino acid composition for large-scale protein datasets.
   Int J Mol Sci. 2014 Feb 26;15(3):3495-506. doi: 10.3390/ijms15033495.
10. Using optimized evidence-theoretic K-nearest neighbor classifier and pseudo-amino acid composition to predict membrane protein types.
    Biochem Biophys Res Commun. 2005 Aug 19;334(1):288-92. doi: 10.1016/j.bbrc.2005.06.087.

Cited by

1. EnsembleNPPred: A Robust Approach to Neuropeptide Prediction and Recognition Using Ensemble Machine Learning and Deep Learning Methods.
   Life (Basel). 2025 Jun 25;15(7):1010. doi: 10.3390/life15071010.
2. Enhancing the Feature Representation of Protein Sequence Descriptors in Protein-Protein Interaction Prediction.
   Interdiscip Sci. 2025 Jun 2. doi: 10.1007/s12539-025-00723-5.
3. Prediction and validation of nanowire proteins in G20 using machine learning and feature engineering.
   Comput Struct Biotechnol J. 2025 Apr 19;27:1706-1718. doi: 10.1016/j.csbj.2025.04.022. eCollection 2025.
4. GraphATC: advancing multilevel and multi-label anatomical therapeutic chemical classification via atom-level graph learning.
   Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf194.
5. Predicting amyloid proteins using attention-based long short-term memory.
   PeerJ Comput Sci. 2025 Feb 7;11:e2660. doi: 10.7717/peerj-cs.2660. eCollection 2025.
6. PSSM-Sumo: deep learning based intelligent model for prediction of sumoylation sites using discriminative features.
   BMC Bioinformatics. 2024 Aug 30;25(1):284. doi: 10.1186/s12859-024-05917-0.
7. Reconstruction of Protein-Protein Interaction Network Based on DGO-SVM Method.
   Curr Issues Mol Biol. 2024 Jul 12;46(7):7353-7372. doi: 10.3390/cimb46070436.
8. deepAMPNet: a novel antimicrobial peptide predictor employing AlphaFold2 predicted structures and a bi-directional long short-term memory protein language model.
   PeerJ. 2024 Jul 19;12:e17729. doi: 10.7717/peerj.17729. eCollection 2024.
9. iProL: identifying DNA promoters from sequence information based on Longformer pre-trained model.
   BMC Bioinformatics. 2024 Jun 25;25(1):224. doi: 10.1186/s12859-024-05849-9.
10. Leveraging a meta-learning approach to advance the accuracy of Na blocking peptides prediction.
    Sci Rep. 2024 Feb 23;14(1):4463. doi: 10.1038/s41598-024-55160-z.

References

1. GPCR-2L: predicting G protein-coupled receptors and their types by hybridizing two different modes of pseudo amino acid compositions.
   Mol Biosyst. 2011 Mar;7(3):911-9. doi: 10.1039/c0mb00170h. Epub 2010 Dec 23.
2. Predicting protein solubility with a hybrid approach by pseudo amino acid composition.
   Protein Pept Lett. 2010 Dec;17(12):1466-72. doi: 10.2174/0929866511009011466.
3. The structural basis for intramembrane assembly of an activating immunoreceptor complex.
   Nat Immunol. 2010 Nov;11(11):1023-9. doi: 10.1038/ni.1943. Epub 2010 Oct 3.
4. Solution NMR structure of the V27A drug resistant mutant of influenza A M2 channel.
   Biochem Biophys Res Commun. 2010 Oct 8;401(1):58-63. doi: 10.1016/j.bbrc.2010.09.008. Epub 2010 Sep 15.
5. SecretP: identifying bacterial secreted proteins by fusing new features into Chou's pseudo-amino acid composition.
   J Theor Biol. 2010 Nov 7;267(1):1-6. doi: 10.1016/j.jtbi.2010.08.001. Epub 2010 Aug 5.
6. Prediction of subcellular location of apoptosis proteins using pseudo amino acid composition: an approach from auto covariance transformation.
   Protein Pept Lett. 2010 Oct;17(10):1263-9. doi: 10.2174/092986610792231528.
7. Prediction of enzyme subfamily class via pseudo amino acid composition by incorporating the conjoint triad feature.
   Protein Pept Lett. 2010 Nov;17(11):1441-9. doi: 10.2174/0929866511009011441.
8. Prediction of apoptosis protein locations with genetic algorithms and support vector machines through a new mode of pseudo amino acid composition.
   Protein Pept Lett. 2010 Dec;17(12):1473-9. doi: 10.2174/0929866511009011473.
9. Supersecondary structure prediction using Chou's pseudo amino acid composition.
   J Comput Chem. 2011 Jan 30;32(2):271-8. doi: 10.1002/jcc.21616.
10. [Prediction of G-protein-coupled receptor classes with pseudo amino acid composition].
    Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2010 Jun;27(3):500-4.