Structural classification of proteins using texture descriptors extracted from the cellular automata image.

Author information

Kavianpour Hamidreza, Vasighi Mahdi

Affiliation

Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), 45137-66731, Zanjan, Iran.

Publication information

Amino Acids. 2017 Feb;49(2):261-271. doi: 10.1007/s00726-016-2354-5. Epub 2016 Oct 24.

DOI:10.1007/s00726-016-2354-5

PMID:27778167

Abstract

Knowledge of the cellular attributes of proteins plays an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Various methods use knowledge of a protein's structural class to better understand its function and folding patterns. Computational methods and intelligent systems can play an important role in the structural classification of proteins. Most protein sequences are stored in databanks as character strings, so a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced, based on reduced amino acid alphabets derived from the surrounding hydrophobicity index. Many important features hidden in these long binary sequences can be displayed clearly through their cellular automata images. The features extracted from these images are used to build a classification model with a support vector machine. Compared with previous studies on several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help reveal some inherent features deeply hidden in protein sequences and improve the quality of protein structural class prediction.
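The pipeline outlined in the abstract (binary encoding, cellular automata image, texture features) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-group alphabet in `HYDROPHOBIC` is a placeholder rather than the grouping derived from the surrounding hydrophobicity index, one bit per residue is a simplification, the default rule number 84 is an arbitrary choice of elementary cellular automaton rule, and the co-occurrence matrix is only a stand-in for the texture descriptors, which the abstract does not name.

```python
# Hypothetical two-group reduced alphabet; NOT the grouping used in the paper.
HYDROPHOBIC = frozenset("AVLIMFWC")


def encode_binary(sequence):
    """Map each residue to 1 if it is in the assumed hydrophobic group, else 0."""
    return [1 if aa in HYDROPHOBIC else 0 for aa in sequence.upper()]


def ca_image(initial_row, rule=84, steps=32):
    """Evolve an elementary cellular automaton with periodic boundaries.

    Returns a list of rows (time steps); row 0 is the initial binary sequence.
    """
    # Bit k of the rule number is the output for the neighbourhood
    # value k = 4*left + 2*centre + right.
    table = [(rule >> k) & 1 for k in range(8)]
    n = len(initial_row)
    image = [list(initial_row)]
    for _ in range(steps):
        prev = image[-1]
        # prev[i - 1] wraps to the last cell when i == 0.
        image.append([
            table[4 * prev[i - 1] + 2 * prev[i] + prev[(i + 1) % n]]
            for i in range(n)
        ])
    return image


def cooccurrence(image):
    """Normalised 2x2 co-occurrence matrix of horizontally adjacent pixels."""
    counts = [[0, 0], [0, 0]]
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
    total = sum(map(sum, counts))
    if total == 0:  # image is a single column, so there are no pixel pairs
        return [[0.0, 0.0], [0.0, 0.0]]
    return [[c / total for c in r] for r in counts]


if __name__ == "__main__":
    bits = encode_binary("MKTAYIAKQR")  # made-up example sequence
    image = ca_image(bits, steps=8)
    print(cooccurrence(image))
```

In a full workflow, the four entries of such a matrix (or richer texture statistics) would form the feature vector passed to a support vector machine and evaluated by tenfold cross-validation, as the abstract describes.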

Similar articles

Structural classification of proteins using texture descriptors extracted from the cellular automata image.

Amino Acids. 2017 Feb;49(2):261-271. doi: 10.1007/s00726-016-2354-5. Epub 2016 Oct 24.

Incorporating secondary features into the general form of Chou's PseAAC for predicting protein structural class.

Protein Pept Lett. 2012 Nov;19(11):1133-8. doi: 10.2174/092986612803217051.

Cellular automata and its applications in protein bioinformatics.

Curr Protein Pept Sci. 2011 Sep;12(6):508-19. doi: 10.2174/138920311796957720.

Protein classification using texture descriptors extracted from the protein backbone image.

J Theor Biol. 2010 Jun 7;264(3):1024-32. doi: 10.1016/j.jtbi.2010.03.020. Epub 2010 Mar 20.

Predicting protein structural classes with pseudo amino acid composition: an approach using geometric moments of cellular automaton image.

J Theor Biol. 2008 Oct 7;254(3):691-6. doi: 10.1016/j.jtbi.2008.06.016. Epub 2008 Jun 24.

Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile.

Biochimie. 2010 Oct;92(10):1330-4. doi: 10.1016/j.biochi.2010.06.013. Epub 2010 Jun 23.

Predicting protein structural class with pseudo-amino acid composition and support vector machine fusion network.

Anal Biochem. 2006 Oct 1;357(1):116-21. doi: 10.1016/j.ab.2006.07.022. Epub 2006 Aug 7.

ProFET: Feature engineering captures high-level protein functions.

Bioinformatics. 2015 Nov 1;31(21):3429-36. doi: 10.1093/bioinformatics/btv345. Epub 2015 Jun 30.

Predicting protein structural class by incorporating patterns of over-represented k-mers into the general form of Chou's PseAAC.

Protein Pept Lett. 2012 Apr;19(4):388-97. doi: 10.2174/092986612799789350.

Predicting the state of cysteines based on sequence information.

J Theor Biol. 2010 Dec 7;267(3):312-8. doi: 10.1016/j.jtbi.2010.09.002. Epub 2010 Sep 6.

Cited by

New distance measure for comparing protein using cellular automata image.

PLoS One. 2023 Oct 5;18(10):e0287880. doi: 10.1371/journal.pone.0287880. eCollection 2023.

Relating SARS-CoV-2 variants using cellular automata imaging.

Sci Rep. 2022 Jun 18;12(1):10297. doi: 10.1038/s41598-022-14404-6.

Computational Modeling of Proteins based on Cellular Automata: A Method of HP Folding Approximation.

Protein J. 2018 Jun;37(3):248-260. doi: 10.1007/s10930-018-9771-0.
