Prediction of protein structural classes by recurrence quantification analysis based on chaos game representation.

Affiliation

Division of Mathematical Sciences, School of Physical & Mathematical Sciences, Nanyang Technological University, Singapore 637371.

Publication information

J Theor Biol. 2009 Apr 21;257(4):618-26. doi: 10.1016/j.jtbi.2008.12.027. Epub 2009 Jan 8.

DOI:10.1016/j.jtbi.2008.12.027

PMID:19183559

Abstract

In this paper, we aim to predict protein structural classes (alpha, beta, alpha+beta, or alpha/beta) for low-homology data sets. Two widely used data sets are considered, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence homology of 40% and 25%, respectively. We propose decomposing the chaos game representation of proteins into two kinds of time series. A novel and powerful nonlinear analysis technique, recurrence quantification analysis (RQA), is then applied to these time series. For a given protein sequence, RQA yields a total of 16 characteristic parameters, which serve as the feature representation of the sequence. Based on this feature representation, the structural class of each protein is predicted with Fisher's linear discriminant algorithm. The jackknife test is used to evaluate our method and compare it with other existing methods. With the step-by-step procedure, the overall accuracies are 65.8% and 64.2% for the 1189 and 25PDB data sets, respectively. With the widely used one-against-others procedure, we compare our method with five other existing methods; the overall accuracies of our method are 6.3% and 4.1% higher for the two data sets, respectively. Furthermore, our method uses only 16 parameters, fewer than the other methods use. This suggests that the current method may play a complementary role to existing methods and is promising for the prediction of protein structural classes.
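To make the RQA step more concrete, the following is a minimal sketch of how two common RQA measures, recurrence rate and determinism, could be computed from a single time series. It is not the authors' implementation: the abstract does not say which 16 parameters were used or how the chaos game representation is decomposed, so the function names, the threshold `eps`, the lack of phase-space embedding, and the minimum line length `min_line` are all illustrative assumptions.

```python
import numpy as np


def recurrence_matrix(series, eps):
    """Binary recurrence matrix: R[i, j] = 1 when |x_i - x_j| <= eps.

    Assumption: the scalar series is compared directly, with no
    time-delay embedding.
    """
    x = np.asarray(series, dtype=float)
    return (np.abs(x[:, None] - x[None, :]) <= eps).astype(int)


def diagonal_line_lengths(rec):
    """Lengths of all runs of recurrent points on the off-main diagonals."""
    n = rec.shape[0]
    lengths = []
    for k in range(1, n):  # upper triangle only; the matrix is symmetric
        run = 0
        for value in np.diagonal(rec, offset=k):
            if value:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def rqa_measures(series, eps, min_line=2):
    """Return (recurrence rate, determinism), excluding the main diagonal."""
    rec = recurrence_matrix(series, eps)
    n = rec.shape[0]
    upper_points = int(np.triu(rec, k=1).sum())
    # Fraction of off-diagonal point pairs that are recurrent.
    recurrence_rate = 2 * upper_points / (n * (n - 1))
    # Fraction of recurrent points lying on diagonal lines of length >= min_line.
    lengths = diagonal_line_lengths(rec)
    in_lines = sum(length for length in lengths if length >= min_line)
    determinism = in_lines / upper_points if upper_points else 0.0
    return recurrence_rate, determinism


if __name__ == "__main__":
    # A strictly periodic toy series: every recurrent point sits on a diagonal line.
    rr, det = rqa_measures([0, 1, 0, 1, 0, 1], eps=0.1)
    print(rr, det)
```

In a pipeline like the one the abstract describes, measures of this kind would be computed for each of the two time series and concatenated into the feature vector passed to the classifier.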

Similar articles

Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile.

Biochimie. 2010 Oct;92(10):1330-4. doi: 10.1016/j.biochi.2010.06.013. Epub 2010 Jun 23.

A high-accuracy protein structural class prediction algorithm using predicted secondary structural information.

J Theor Biol. 2010 Dec 7;267(3):272-5. doi: 10.1016/j.jtbi.2010.09.007. Epub 2010 Sep 8.

Classifier ensembles for protein structural class prediction with varying homology.

Biochem Biophys Res Commun. 2006 Sep 29;348(3):981-8. doi: 10.1016/j.bbrc.2006.07.141. Epub 2006 Jul 31.

High-accuracy prediction of protein structural class for low-similarity sequences based on predicted secondary structure.

Biochimie. 2011 Apr;93(4):710-4. doi: 10.1016/j.biochi.2011.01.001. Epub 2011 Jan 13.

Prediction of protein structural classes by Chou's pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis.

Amino Acids. 2009 Jul;37(2):415-25. doi: 10.1007/s00726-008-0170-2. Epub 2008 Aug 23.

Using pseudo-amino acid composition and support vector machine to predict protein structural class.

J Theor Biol. 2006 Dec 7;243(3):444-8. doi: 10.1016/j.jtbi.2006.06.025. Epub 2006 Jul 1.

Prediction of protein subcellular localization.

Proteins. 2006 Aug 15;64(3):643-51. doi: 10.1002/prot.21018.

Prediction of protein structural classes for low-homology sequences based on predicted secondary structure.

BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S9. doi: 10.1186/1471-2105-11-S1-S9.

Prediction of protein structural class using novel evolutionary collocation-based sequence representation.

J Comput Chem. 2008 Jul 30;29(10):1596-604. doi: 10.1002/jcc.20918.

Cited by

Research on Recurrence Plot Feature Quantization Method Based on Image Texture Analysis.

J Environ Public Health. 2022 Aug 8;2022:2495024. doi: 10.1155/2022/2495024. eCollection 2022.

ProtPlat: an efficient pre-training platform for protein classification based on FastText.

BMC Bioinformatics. 2022 Feb 11;23(1):66. doi: 10.1186/s12859-022-04604-2.

Prediction of Protein Subcellular Localization Based on Fusion of Multi-view Features.

Molecules. 2019 Mar 6;24(5):919. doi: 10.3390/molecules24050919.

Identifying anticancer peptides by using a generalized chaos game representation.

J Math Biol. 2019 Jan;78(1-2):441-463. doi: 10.1007/s00285-018-1279-x. Epub 2018 Oct 5.

Prediction of RNA-protein interactions using conjoint triad feature and chaos game representation.

Bioengineered. 2018;9(1):242-251. doi: 10.1080/21655979.2018.1470721.

Detecting transitions in protein dynamics using a recurrence quantification analysis based bootstrap method.

BMC Bioinformatics. 2017 Nov 28;18(1):525. doi: 10.1186/s12859-017-1943-y.

Additive methods for genomic signatures.

BMC Bioinformatics. 2016 Aug 22;17(1):313. doi: 10.1186/s12859-016-1157-8.

Accurate prediction of nuclear receptors with conjoint triad feature.

BMC Bioinformatics. 2015 Dec 3;16:402. doi: 10.1186/s12859-015-0828-1.

A high performance prediction of HPV genotypes by Chaos game representation and singular value decomposition.

BMC Bioinformatics. 2015 Mar 5;16:71. doi: 10.1186/s12859-015-0493-4.

PSSP-RFE: accurate prediction of protein structural class by recursive feature extraction from PSI-BLAST profile, physical-chemical property and functional annotations.

PLoS One. 2014 Mar 27;9(3):e92863. doi: 10.1371/journal.pone.0092863. eCollection 2014.
