
Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile.

Affiliation

School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China.

Publication information

Biochimie. 2010 Oct;92(10):1330-4. doi: 10.1016/j.biochi.2010.06.013. Epub 2010 Jun 23.

DOI: 10.1016/j.biochi.2010.06.013
PMID: 20600567

Abstract

Knowledge of structural class plays an important role in understanding protein folding patterns. In this study, a simple and powerful computational method, which combines support vector machine with PSI-BLAST profile, is proposed to predict protein structural class for low-similarity sequences. The evolution information encoding in the PSI-BLAST profiles is converted into a series of fixed-length feature vectors by extracting amino acid composition and dipeptide composition from the profiles. The resulting vectors are then fed to a support vector machine classifier for the prediction of protein structural class. To evaluate the performance of the proposed method, jackknife cross-validation tests are performed on two widely used benchmark datasets, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins) with sequence similarity lower than 40% and 25%, respectively. The overall accuracies attain 70.7% and 72.9% for 1189 and 25PDB datasets, respectively. Comparison of our results with other methods shows that our method is very promising to predict protein structural class particularly for low-similarity datasets and may at least play an important complementary role to existing methods.
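
The feature construction and jackknife evaluation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the exact feature formulas or any score scaling, so the block assumes the profile is already a numeric matrix (one row per sequence position, one column per amino acid), takes column means as the composition features, and takes the mean product of scores at adjacent positions as the dipeptide features. All function names are hypothetical, and a 1-nearest-neighbour rule stands in for the support vector machine so that the example needs only the standard library.

```python
from typing import Callable, List

Matrix = List[List[float]]  # rows = sequence positions, columns = amino acids


def aac_from_profile(pssm: Matrix) -> List[float]:
    """Composition features: the mean of each profile column (assumed definition)."""
    length, width = len(pssm), len(pssm[0])
    return [sum(row[j] for row in pssm) / length for j in range(width)]


def dpc_from_profile(pssm: Matrix) -> List[float]:
    """Dipeptide features: mean product of column a at position i and
    column b at position i + 1, for every ordered pair (a, b) (assumed definition)."""
    length, width = len(pssm), len(pssm[0])
    if length < 2:
        raise ValueError("need at least two positions for dipeptide features")
    feats = []
    for a in range(width):
        for b in range(width):
            total = sum(pssm[i][a] * pssm[i + 1][b] for i in range(length - 1))
            feats.append(total / (length - 1))
    return feats


def profile_features(pssm: Matrix) -> List[float]:
    """Fixed-length vector: width + width**2 values (420 for a 20-column profile)."""
    return aac_from_profile(pssm) + dpc_from_profile(pssm)


def nearest_neighbour(train_x: Matrix, train_y: List[str], query: List[float]) -> str:
    """Stand-in classifier (NOT an SVM): label of the closest training vector."""
    def sq_dist(u: List[float], v: List[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(u, v))
    best = min(range(len(train_x)), key=lambda k: sq_dist(train_x[k], query))
    return train_y[best]


def jackknife_accuracy(
    features: Matrix,
    labels: List[str],
    fit_predict: Callable[[Matrix, List[str], List[float]], str],
) -> float:
    """Jackknife (leave-one-out) test: hold out each sample once, train on the rest."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if fit_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)
```

Because the feature length depends only on the number of profile columns, proteins of any length map to vectors of the same size, which is what allows a standard classifier to consume them. In practice the `fit_predict` argument would wrap an SVM from a library of your choice; raw profile scores are also often rescaled before feature extraction, which this sketch leaves to the caller.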

Similar articles

1. Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile.
   Biochimie. 2010 Oct;92(10):1330-4. doi: 10.1016/j.biochi.2010.06.013. Epub 2010 Jun 23.
2. High-accuracy prediction of protein structural class for low-similarity sequences based on predicted secondary structure.
   Biochimie. 2011 Apr;93(4):710-4. doi: 10.1016/j.biochi.2011.01.001. Epub 2011 Jan 13.
3. Using principal component analysis and support vector machine to predict protein structural class for low-similarity sequences via PSSM.
   J Biomol Struct Dyn. 2012;29(6):634-42. doi: 10.1080/07391102.2011.672627.
4. Prediction of protein structural class using novel evolutionary collocation-based sequence representation.
   J Comput Chem. 2008 Jul 30;29(10):1596-604. doi: 10.1002/jcc.20918.
5. Improving the prediction accuracy of protein structural class: approached with alternating word frequency and normalized Lempel-Ziv complexity.
   J Theor Biol. 2014 Jan 21;341:71-7. doi: 10.1016/j.jtbi.2013.10.002. Epub 2013 Oct 17.
6. Accurate prediction of protein structural class using auto covariance transformation of PSI-BLAST profiles.
   Amino Acids. 2012 Jun;42(6):2243-9. doi: 10.1007/s00726-011-0964-5. Epub 2011 Jun 23.
7. A machine learning based method for the prediction of secretory proteins using amino acid composition, their order and similarity-search.
   In Silico Biol. 2008;8(2):129-40.
8. A protein structural classes prediction method based on PSI-BLAST profile.
   J Theor Biol. 2014 Jul 21;353:19-23. doi: 10.1016/j.jtbi.2014.02.034. Epub 2014 Mar 4.
9. Using pseudo amino acid composition and binary-tree support vector machines to predict protein structural classes.
   Amino Acids. 2007 Nov;33(4):623-9. doi: 10.1007/s00726-007-0496-1. Epub 2007 Feb 19.
10. Prediction of protein structure class by coupling improved genetic algorithm and support vector machine.
    Amino Acids. 2008 Oct;35(3):581-90. doi: 10.1007/s00726-008-0084-z. Epub 2008 Apr 22.

Cited by

1. NFEmbed: modeling nitrogenase activity via classification and regression with pretrained protein embeddings.
   Bioinform Adv. 2025 Aug 23;5(1):vbaf204. doi: 10.1093/bioadv/vbaf204. eCollection 2025.
2. Enhancing the Feature Representation of Protein Sequence Descriptors in Protein-Protein Interaction Prediction.
   Interdiscip Sci. 2025 Jun 2. doi: 10.1007/s12539-025-00723-5.
3. PLM-ATG: Identification of Autophagy Proteins by Integrating Protein Language Model Embeddings with PSSM-Based Features.
   Molecules. 2025 Apr 10;30(8):1704. doi: 10.3390/molecules30081704.
4. A deep learning method to predict bacterial ADP-ribosyltransferase toxins.
   Bioinformatics. 2024 Jul 1;40(7). doi: 10.1093/bioinformatics/btae378.
5. Comprehensive Research on Druggable Proteins: From PSSM to Pre-Trained Language Models.
   Int J Mol Sci. 2024 Apr 19;25(8):4507. doi: 10.3390/ijms25084507.
6. TeM-DTBA: time-efficient drug target binding affinity prediction using multiple modalities with Lasso feature selection.
   J Comput Aided Mol Des. 2023 Dec;37(12):573-584. doi: 10.1007/s10822-023-00533-1. Epub 2023 Sep 30.
7. AcrNET: predicting anti-CRISPR with deep learning.
   Bioinformatics. 2023 May 4;39(5). doi: 10.1093/bioinformatics/btad259.
8. Stack-VTP: prediction of vesicle transport proteins based on stacked ensemble classifier and evolutionary information.
   BMC Bioinformatics. 2023 Apr 7;24(1):137. doi: 10.1186/s12859-023-05257-5.
9. HyperVR: a hybrid deep ensemble learning approach for simultaneously predicting virulence factors and antibiotic resistance genes.
   NAR Genom Bioinform. 2023 Feb 11;5(1):lqad012. doi: 10.1093/nargab/lqad012. eCollection 2023 Mar.
10. PSSMCOOL: a comprehensive R package for generating evolutionary-based descriptors of protein sequences from PSSM profiles.
    Biol Methods Protoc. 2022 Mar 30;7(1):bpac008. doi: 10.1093/biomethods/bpac008. eCollection 2022.