Sun Xing-Yu, Shi Shao-Ping, Qiu Jian-Ding, Suo Sheng-Bao, Huang Shu-Yun, Liang Ru-Ping
Department of Chemistry, Nanchang University, Nanchang 330031, P.R. China.
Mol Biosyst. 2012 Oct 30;8(12):3178-84. doi: 10.1039/c2mb25280e.
In vivo, some proteins exist as monomers and others as oligomers. Oligomers can be further classified into homo-oligomers (formed by identical subunits) and hetero-oligomers (formed by different subunits), and they provide the structural basis for various biological functions, including cooperative effects, allosteric mechanisms and ion-channel gating. Therefore, with the avalanche of protein sequences generated in the post-genomic era, it is important for both basic research and the pharmaceutical industry to obtain knowledge about the quaternary structural attributes of proteins of interest. To this end, a high-throughput method (DWT_DT), a two-layer approach that fuses the discrete wavelet transform (DWT) and a decision-tree algorithm (DT) with physicochemical features, has been developed to predict protein quaternary structures. The first layer assigns a query protein to one of the 10 main quaternary structural attributes. The second layer determines whether the protein in question is a homo-oligomer or a hetero-oligomer. The overall accuracy of the first-layer identification by the jackknife test was 89.60%. The overall accuracy of the second layer ranges from 88.23% to 100%. The results suggest that the newly developed protocol (DWT_DT) is very promising for predicting quaternary structures with complicated composition.
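The two-layer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the wavelet family, the decomposition depth, the physicochemical scales, or the summary statistics, so the Haar wavelet, the Kyte-Doolittle hydropathy scale, the three-level default, the max/min/mean/standard-deviation summaries, and the names `haar_step`, `dwt_features` and `predict_two_layer` are all assumptions chosen for illustration. The classifiers are passed in as plain callables standing in for trained decision trees.

```python
import math
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index: one possible physicochemical scale
# (an assumption; the abstract does not say which properties were used).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        # Pad odd-length signals by repeating the last value.
        signal = signal + [signal[-1]]
    root2 = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approx, detail


def summarize(coeffs):
    """Fixed-size summary of a coefficient list (illustrative choice)."""
    return [max(coeffs), min(coeffs), mean(coeffs), pstdev(coeffs)]


def dwt_features(sequence, levels=3):
    """Map a protein sequence to wavelet-based features.

    Residues missing from the scale are skipped. Sequences shorter than
    2**levels yield fewer features because decomposition stops early.
    """
    signal = [HYDROPATHY[aa] for aa in sequence.upper() if aa in HYDROPATHY]
    if not signal:
        raise ValueError("sequence contains no recognized residues")
    features = []
    for _ in range(levels):
        if len(signal) < 2:
            break
        signal, detail = haar_step(signal)
        features += summarize(detail)
    features += summarize(signal)  # final approximation coefficients
    return features


def predict_two_layer(sequence, layer1, layer2_by_attribute):
    """Layer 1 picks a quaternary attribute; layer 2 picks homo or hetero.

    `layer1` and the values of `layer2_by_attribute` are hypothetical
    stand-ins for trained decision-tree classifiers.
    """
    x = dwt_features(sequence)
    attribute = layer1(x)
    second = layer2_by_attribute.get(attribute)
    subtype = second(x) if second is not None else None
    return attribute, subtype
```

In this sketch an attribute with no second-layer classifier (for example a monomer) returns `None` as its subtype. The attribute labels themselves are left to the caller, since the abstract does not enumerate the 10 classes.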