Prediction of unfolded segments in a protein sequence based on amino acid composition.

Author information

Coeytaux Karen, Poupon Anne

Affiliation

Yeast Structural Genomics, IBBMC, Université Paris-Sud, Orsay, France.

Publication information

Bioinformatics. 2005 May 1;21(9):1891-900. doi: 10.1093/bioinformatics/bti266. Epub 2005 Jan 18.

DOI:10.1093/bioinformatics/bti266

PMID:15657106

Abstract

MOTIVATION

Partially and wholly unstructured proteins have now been identified in all kingdoms of life, and more commonly in eukaryotic organisms. This intrinsic disorder is related to certain critical functions. Beyond their fundamental interest, unstructured regions in proteins may prevent crystallization. Predicting disordered regions is therefore important for understanding protein function, and may also help in designing genetic constructs.

RESULTS

In this paper we present a computational tool for detecting unstructured regions in proteins, based on two properties of unfolded fragments: (1) disordered regions have a biased composition, and (2) they usually contain either small hydrophobic clusters or none. To quantify these two properties, we first calculate the amino acid distributions in structured and unstructured regions. Using these distributions, we calculate, for a given sequence fragment, the probability that it belongs to a structured region and the probability that it belongs to an unstructured region. For each amino acid, the distance to the nearest hydrophobic cluster is also computed. Using these three values along a protein sequence allows us to predict unstructured regions with very simple rules. The method requires only the primary sequence and no multiple alignment, which makes it suitable for orphan proteins.
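The three per-residue values described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the window size, the definition of a hydrophobic cluster, or the decision rules, so everything below is an assumption. In particular, the function names, the `half_window` and `min_dist` parameters, the choice of hydrophobic residues, and the rule "a run of at least three hydrophobic residues is a cluster" are hypothetical stand-ins chosen only to make the idea concrete.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: a simple hydrophobic alphabet; the paper may use a different one.
HYDROPHOBIC = set("VILFMYW")


def composition(seqs, pseudocount=1.0):
    """Estimate amino acid frequencies from example sequences (with smoothing)."""
    counts = {a: pseudocount for a in ALPHABET}
    for seq in seqs:
        for ch in seq:
            if ch in counts:
                counts[ch] += 1
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


def window_log_prob(seq, freqs, half_window=5):
    """Mean log-probability of the window around each residue under `freqs`."""
    scores = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half_window), min(len(seq), i + half_window + 1)
        window = seq[lo:hi]
        total = sum(math.log(freqs[a]) for a in window if a in freqs)
        scores.append(total / len(window))
    return scores


def hydrophobic_clusters(seq, min_len=3):
    """Return (start, end) pairs, end exclusive, for runs of hydrophobic residues.

    Assumption: a cluster is a contiguous run of at least `min_len` residues.
    """
    clusters, start = [], None
    for i, ch in enumerate(seq + "-"):  # sentinel closes a trailing run
        if ch in HYDROPHOBIC:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                clusters.append((start, i))
            start = None
    return clusters


def distance_to_cluster(seq, min_len=3):
    """Distance from each residue to the nearest cluster (None if no cluster)."""
    clusters = hydrophobic_clusters(seq, min_len)
    if not clusters:
        return [None] * len(seq)
    dists = []
    for i in range(len(seq)):
        dists.append(min(
            0 if s <= i < e else (s - i if i < s else i - (e - 1))
            for s, e in clusters
        ))
    return dists


def predict_unstructured(seq, p_dis, p_ord, half_window=5, min_dist=4):
    """Hypothetical rule: unstructured if the window fits the disordered
    composition better AND the residue is far from any hydrophobic cluster."""
    log_dis = window_log_prob(seq, p_dis, half_window)
    log_ord = window_log_prob(seq, p_ord, half_window)
    dists = distance_to_cluster(seq)
    return [
        ld > lo and (d is None or d >= min_dist)
        for ld, lo, d in zip(log_dis, log_ord, dists)
    ]
```

In practice, `p_dis` and `p_ord` would be estimated with `composition` from curated sets of unstructured and structured regions; the sketch ships no such data, and any thresholds would need to be tuned against known examples.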

AVAILABILITY

http://genomics.eu.org/

Similar articles

Prediction of unfolded segments in a protein sequence based on amino acid composition.

Bioinformatics. 2005 May 1;21(9):1891-900. doi: 10.1093/bioinformatics/bti266. Epub 2005 Jan 18.

IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content.

Bioinformatics. 2005 Aug 15;21(16):3433-4. doi: 10.1093/bioinformatics/bti541. Epub 2005 Jun 14.

FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded.

Bioinformatics. 2005 Aug 15;21(16):3435-8. doi: 10.1093/bioinformatics/bti537. Epub 2005 Jun 14.

POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions.

Bioinformatics. 2007 Aug 15;23(16):2046-53. doi: 10.1093/bioinformatics/btm302. Epub 2007 Jun 1.

Using Bayesian multinomial classifier to predict whether a given protein sequence is intrinsically disordered.

J Theor Biol. 2008 Oct 21;254(4):799-803. doi: 10.1016/j.jtbi.2008.05.040. Epub 2008 Jun 14.

Correlation and prediction of gene expression level from amino acid and dipeptide composition of its protein.

BMC Bioinformatics. 2005 Mar 17;6:59. doi: 10.1186/1471-2105-6-59.

Identifying sequence regions undergoing conformational change via predicted continuum secondary structure.

Bioinformatics. 2006 Aug 1;22(15):1809-14. doi: 10.1093/bioinformatics/btl198. Epub 2006 May 23.

BiasViz: visualization of amino acid biased regions in protein alignments.

Bioinformatics. 2007 Nov 15;23(22):3093-4. doi: 10.1093/bioinformatics/btm489. Epub 2007 Oct 6.

Intrinsic disorder prediction from the analysis of multiple protein fold recognition models.

Bioinformatics. 2008 Aug 15;24(16):1798-804. doi: 10.1093/bioinformatics/btn326. Epub 2008 Jun 25.

Periodic distributions of hydrophobic amino acids allows the definition of fundamental building blocks to align distantly related proteins.

Proteins. 2007 May 15;67(3):695-708. doi: 10.1002/prot.21319.

Cited by

What Can Be Learned by Knowing Only the Amino Acid Composition of Proteins?

Int J Mol Sci. 2024 Dec 21;25(24):13680. doi: 10.3390/ijms252413680.

Using several pseudo amino acid composition types and different machine learning algorithms to classify and predict archaeal phospholipases.

Mol Biol Res Commun. 2023;12(3):117-126. doi: 10.22099/mbrc.2023.47756.1845.

Intrinsically Disordered Proteins: An Overview.

Int J Mol Sci. 2022 Nov 14;23(22):14050. doi: 10.3390/ijms232214050.

Hydropathy Patterning Complements Charge Patterning to Describe Conformational Preferences of Disordered Proteins.

J Phys Chem Lett. 2020 May 7;11(9):3408-3415. doi: 10.1021/acs.jpclett.0c00288. Epub 2020 Apr 17.

Temperature-Controlled Liquid-Liquid Phase Separation of Disordered Proteins.

ACS Cent Sci. 2019 May 22;5(5):821-830. doi: 10.1021/acscentsci.9b00102. Epub 2019 May 1.

Order in Disorder as Observed by the "Hydrophobic Cluster Analysis" of Protein Sequences.

Proteomics. 2018 Nov;18(21-22):e1800054. doi: 10.1002/pmic.201800054. Epub 2018 Oct 30.

DNA repair factor APLF acts as a H2A-H2B histone chaperone through binding its DNA interaction surface.

Nucleic Acids Res. 2018 Aug 21;46(14):7138-7152. doi: 10.1093/nar/gky507.

A novel N-terminal region of the membrane β-hexosyltransferase: its role in secretion of soluble protein by Pichia pastoris.

Microbiology (Reading). 2016 Jan;162(1):23-34. doi: 10.1099/mic.0.000211. Epub 2015 Nov 9.

Accurate Ab Initio and Template-Based Prediction of Short Intrinsically-Disordered Regions by Bidirectional Recurrent Neural Networks Trained on Large-Scale Datasets.

Int J Mol Sci. 2015 Aug 21;16(8):19868-85. doi: 10.3390/ijms160819868.

DisoMCS: Accurately Predicting Protein Intrinsically Disordered Regions Using a Multi-Class Conservative Score Approach.

PLoS One. 2015 Jun 19;10(6):e0128334. doi: 10.1371/journal.pone.0128334. eCollection 2015.
