Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China.
PLoS One. 2010 Jun 4;5(6):e10972. doi: 10.1371/journal.pone.0010972.
Metabolic stability is an important property of proteins that is related to their global flexibility, intramolecular fluctuations, various internal dynamic processes, and many biological functions. Determining a protein's metabolic stability would provide useful information for an in-depth understanding of the dynamic action mechanisms of proteins. Although several experimental methods have been developed to measure a protein's metabolic stability, they are time-consuming and expensive. This paper reports a computational method characterized by (1) integrating various properties of proteins, such as biochemical and physicochemical properties, subcellular locations, network properties, and protein complex properties; (2) using the mRMR (Maximum Relevance & Minimum Redundancy) principle and the IFS (Incremental Feature Selection) procedure to optimize the prediction engine; and (3) classifying proteins into four half-life classes: "short", "medium", "long", and "extra-long". Our analysis revealed that the following seven features play major roles in determining the stability of proteins: (1) KEGG enrichment scores of the protein and its neighbors in the network, (2) subcellular locations, (3) polarity, (4) amino acid composition, (5) hydrophobicity, (6) secondary structure propensity, and (7) the number of protein complexes in which the protein is involved. An intriguing correlation was observed between the predicted metabolic stability of some proteins and the actual half-life of the drugs designed to target them. These findings might provide useful insights for designing protein-stability-relevant drugs. The computational method can also be used as a large-scale tool for annotating the metabolic stability of the avalanche of protein sequences generated in the post-genomic age.
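As a rough illustration of how an mRMR ranking followed by an IFS procedure can fit together, the sketch below may help. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names are hypothetical, relevance and redundancy are measured with mutual information on discrete features, the mRMR score is the "relevance minus mean redundancy" form, and each feature subset is evaluated with a leave-one-out 1-nearest-neighbour classifier on a small synthetic data set rather than on real protein features.

```python
import itertools
import numpy as np


def mutual_info(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            py = np.mean(y == yv)
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi


def mrmr_rank(X, y):
    """Greedy mRMR ranking (assumed difference form): at each step take the
    feature with the highest relevance to y minus its mean redundancy with
    the features already selected."""
    n_features = X.shape[1]
    relevance = [mutual_info(X[:, j], y) for j in range(n_features)]
    selected, remaining = [], list(range(n_features))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
        return relevance[j] - redundancy

    while remaining:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    (an assumed stand-in for the prediction engine)."""
    X = np.asarray(X, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample may not be its own neighbour
    return float(np.mean(y[dist.argmin(axis=1)] == y))


def incremental_feature_selection(X, y, ranking):
    """IFS: evaluate the top-1, top-2, ... ranked features and keep the
    smallest prefix that reaches the best accuracy."""
    accuracies = [loo_1nn_accuracy(X[:, ranking[:k]], y)
                  for k in range(1, len(ranking) + 1)]
    best_k = int(np.argmax(accuracies)) + 1
    return ranking[:best_k], accuracies


# Synthetic toy data (hypothetical): four classes encoded by two binary
# features a and b, plus a duplicate of a and an irrelevant feature c.
grid = np.array(list(itertools.product([0, 1], repeat=3)) * 5)
a, b, c = grid[:, 0], grid[:, 1], grid[:, 2]
X = np.column_stack([a, a, b, c])  # columns: a, copy of a, b, irrelevant c
y = 2 * a + b                      # class labels 0..3

ranking = mrmr_rank(X, y)
selected, accuracies = incremental_feature_selection(X, y, ranking)
print("mRMR ranking:", ranking)
print("IFS-selected features:", selected)
```

In this toy setting the redundancy term pushes the duplicated column behind the complementary one, and IFS stops at the shortest ranked prefix that attains the best accuracy. A real application would replace the synthetic matrix with the integrated protein features and the four half-life classes described in the abstract.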