Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China.
PLoS One. 2011;6(8):e22940. doi: 10.1371/journal.pone.0022940. Epub 2011 Aug 8.
p53 is an important tumor suppressor protein. Mutated p53 has been found in many kinds of human cancers, and restoring active p53 has been reported to lead to tumor regression. In this work, we developed a new computational method to predict the transcriptional activity of one-, two-, three- and four-site p53 mutants. Following the general form of pseudo amino acid composition, we used eight types of features to represent each mutation and then selected the optimal prediction features using the maximum relevance, minimum redundancy, and incremental feature selection methods. The Matthews correlation coefficients (MCC) obtained with the nearest neighbor algorithm and jackknife cross-validation for one-, two-, three- and four-site p53 mutants were 0.678, 0.314, 0.705, and 0.907, respectively. Further analysis of the optimal feature sets revealed that 2D (two-dimensional) structure features made up the largest part of each optimal feature set and may play the most important role in predicting the activity status of all four types of p53 mutants. The optimal feature sets, especially their top-ranked features, also showed that 3D structure features, conservation, and the physicochemical and biochemical properties of amino acids near the mutation site play important roles in predicting p53 mutant activity status. Our study provides a new and promising approach for finding functionally important sites and the relevant features for in-depth study of the p53 protein and its mechanism of action.
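The evaluation scheme named above (nearest neighbor prediction, jackknife cross-validation, MCC) can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the function names, and the four toy feature vectors are all assumptions made for illustration, and the abstract does not state which distance measure was used.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def nearest_neighbor_label(query, samples, labels, skip):
    """Label of the sample closest to `query`, ignoring index `skip`.

    Euclidean distance is an assumption of this sketch.
    """
    best_label, best_dist = None, math.inf
    for i, (x, y) in enumerate(zip(samples, labels)):
        if i == skip:
            continue
        d = math.dist(query, x)
        if d < best_dist:
            best_label, best_dist = y, d
    return best_label


def jackknife_predict(samples, labels):
    """Leave-one-out: predict each sample from all the other samples."""
    return [
        nearest_neighbor_label(x, samples, labels, skip=i)
        for i, x in enumerate(samples)
    ]


# Hypothetical toy data: two-feature vectors, not taken from the paper.
toy_samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
toy_labels = [0, 0, 1, 1]
toy_predictions = jackknife_predict(toy_samples, toy_labels)
toy_score = mcc(toy_labels, toy_predictions)
```

Jackknife (leave-one-out) testing holds out each mutant in turn and predicts it from all the others, so every sample is used for testing exactly once. MCC ranges from -1 to 1 and remains informative when active and inactive mutants are unevenly represented.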