Prediction of protein structural classes for low-homology sequences based on predicted secondary structure.

Affiliation

Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, 21 Nanyang Link, Singapore.

Publication information

BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S9. doi: 10.1186/1471-2105-11-S1-S9.

DOI:10.1186/1471-2105-11-S1-S9
PMID:20122246
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3009544/
Abstract

BACKGROUND

Predicting protein structural classes (alpha, beta, alpha + beta and alpha/beta) from amino acid sequences is of great importance because it aids the study of protein function, regulation and interactions. Many methods have been developed for high-homology protein sequences, and their prediction accuracies can reach up to 90%. However, for low-homology sequences, whose average pairwise sequence identity lies between 20% and 40%, these methods perform relatively poorly, with prediction accuracies often below 60%.

RESULTS

We propose a new method to predict protein structural classes on the basis of features extracted from the predicted secondary structures of proteins rather than directly from their amino acid sequences. The method first uses PSIPRED to predict the secondary structure of each protein sequence. Then, the chaos game representation is employed to represent the predicted secondary structure as two time series, from which we generate a comprehensive set of 24 features using recurrence quantification analysis, K-string based information entropy and segment-based analysis. The resulting feature vectors are finally fed into a simple yet powerful Fisher's discriminant algorithm for the prediction of protein structural classes. We tested the proposed method on three low-homology benchmark datasets and achieved overall prediction accuracies of 82.9%, 83.1% and 81.3%, respectively. Comparisons with ten existing methods showed that our method consistently performs better on all the tested datasets, with overall accuracy improvements ranging from 2.3% to 27.5%. A web server that implements the proposed method is freely available at http://www1.spms.ntu.edu.sg/~chenxin/RKS_PPSC/.
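To make the chaos game step concrete, the following is a minimal Python sketch of how a predicted secondary-structure string could be turned into two time series, together with one simple recurrence measure. It is an illustration under stated assumptions, not the authors' implementation: the three-state alphabet (H, E, C), the equilateral-triangle vertex layout, the starting point at the centroid, the function names `cgr_time_series` and `recurrence_rate`, and the use of an unembedded series with a fixed threshold `eps` are all choices made here, since the abstract does not specify them.

```python
from math import sqrt

# Assumed layout: helix (H), strand (E) and coil (C) sit at the vertices of
# an equilateral triangle. The abstract does not give the actual coordinates.
VERTICES = {
    "H": (0.0, 0.0),
    "E": (1.0, 0.0),
    "C": (0.5, sqrt(3) / 2),
}


def cgr_time_series(ss, start=(0.5, sqrt(3) / 6)):
    """Map a secondary-structure string to two time series (x and y).

    Each step moves the current point halfway toward the vertex of the
    current state, which is the standard chaos game iteration. The default
    start is the triangle's centroid (an assumption).
    """
    x, y = start
    xs, ys = [], []
    for state in ss:
        vx, vy = VERTICES[state]
        x, y = (x + vx) / 2, (y + vy) / 2
        xs.append(x)
        ys.append(y)
    return xs, ys


def recurrence_rate(series, eps):
    """Fraction of index pairs (i, j) whose values lie within eps.

    This is the recurrence rate of a recurrence plot built without
    embedding; the main diagonal is included in the count.
    """
    n = len(series)
    if n == 0:
        return 0.0
    hits = sum(1 for a in series for b in series if abs(a - b) <= eps)
    return hits / (n * n)


# Hypothetical PSIPRED-style three-state string, used only for illustration.
xs, ys = cgr_time_series("CCHHHHCCEEEC")
print(recurrence_rate(xs, eps=0.05), recurrence_rate(ys, eps=0.05))
```

Because each point depends on the whole preceding string, the two series carry sequence order information that plain state frequencies would lose, which is presumably why recurrence-based measures are informative here.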

CONCLUSION

The high prediction accuracy achieved by our proposed method is attributed to the design of a comprehensive feature set on the predicted secondary structure sequences, which is capable of characterizing the sequence order information, local interactions of the secondary structural elements, and spatial arrangements of alpha helices and beta strands. Thus, it is a valuable method for predicting protein structural classes, particularly for low-homology amino acid sequences.
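The kinds of features named above can be sketched in a few lines of Python. This is a hedged illustration only: the abstract does not define the 24 features, so the entropy over overlapping K-strings, the particular segment statistics, and the names `k_string_entropy` and `segment_features` are assumptions chosen to show the general idea.

```python
from collections import Counter
from itertools import groupby
from math import log2


def k_string_entropy(ss, k):
    """Shannon entropy (in bits) of the overlapping length-k substrings."""
    n = len(ss) - k + 1
    if n <= 0:
        return 0.0
    counts = Counter(ss[i:i + k] for i in range(n))
    return -sum((c / n) * log2(c / n) for c in counts.values())


def segment_features(ss):
    """Simple per-state segment statistics for helices (H) and strands (E).

    A segment is a maximal run of one state. The statistics chosen here
    (segment count, longest segment, fractional content) are illustrative
    and not necessarily those used in the paper.
    """
    runs = [(state, len(list(group))) for state, group in groupby(ss)]
    feats = {}
    for state in "HE":
        lengths = [n for s, n in runs if s == state]
        feats[f"{state}_segments"] = len(lengths)
        feats[f"{state}_max_len"] = max(lengths, default=0)
        feats[f"{state}_content"] = sum(lengths) / len(ss) if ss else 0.0
    return feats


# Hypothetical three-state string, used only for illustration.
example = "CCHHHHCCEEEC"
print(k_string_entropy(example, 2))
print(segment_features(example))
```

Segment counts and lengths help separate classes with similar overall helix and strand content, while the K-string entropy summarizes how regularly short structural patterns repeat along the chain.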
