Ferrè F, Clote P
Department of Biology, Boston College, Chestnut Hill, MA 02467, USA.
Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W230-2. doi: 10.1093/nar/gki412.
Correctly predicting the disulfide bond topology of a protein is crucial for understanding protein function and can greatly help tertiary structure prediction methods. The web server http://clavius.bc.edu/~clotelab/DiANNA/ takes a protein sequence as input and outputs a disulfide connectivity prediction. The procedure is as follows. First, PSIPRED is run to predict the protein's secondary structure, then PSIBLAST is run against the non-redundant SwissProt to obtain a multiple alignment of the input sequence. The predicted secondary structure and the profile arising from this alignment are used in the training phase of our neural network. Next, the cysteine oxidation state is predicted, then each pair of cysteines in the protein sequence is assigned a likelihood of forming a disulfide bond; this is done by means of a novel architecture (diresidue neural network). Finally, Rothberg's implementation of Gabow's maximum weighted matching algorithm is applied to the diresidue neural network scores to produce the final connectivity prediction. Our novel neural network-based approach achieves results that are comparable to, and in some cases better than, those of the current state-of-the-art methods.
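The final matching step can be illustrated with a small sketch. This is not the server's code: the server uses Rothberg's implementation of Gabow's algorithm, whereas the sketch below substitutes a plain exhaustive search that is only practical for a handful of cysteines. The function name `best_connectivity`, the residue positions, and the pair scores are all hypothetical and chosen only for illustration.

```python
from functools import lru_cache


def best_connectivity(cysteines, scores):
    """Return (total_score, bonds) maximizing the summed pair scores.

    cysteines: sequence positions of cysteines assumed to be oxidized.
    scores:    dict mapping (i, j) with i < j to a pair score
               (hypothetical stand-in for diresidue network output).

    Exhaustive recursion over all matchings; NOT Gabow's algorithm.
    """
    cys = tuple(sorted(cysteines))

    @lru_cache(maxsize=None)
    def solve(remaining):
        if len(remaining) < 2:
            return 0.0, ()
        first, rest = remaining[0], remaining[1:]
        # Option 1: leave the first cysteine unpaired.
        best = solve(rest)
        # Option 2: pair the first cysteine with each later one.
        for k, partner in enumerate(rest):
            weight = scores.get((first, partner), 0.0)
            sub_score, sub_bonds = solve(rest[:k] + rest[k + 1:])
            candidate = (weight + sub_score, ((first, partner),) + sub_bonds)
            if candidate[0] > best[0]:
                best = candidate
        return best

    total, bonds = solve(cys)
    return total, sorted(bonds)


# Made-up scores for four cysteines at positions 1-4.
example_scores = {(1, 2): 0.6, (2, 3): 0.9, (3, 4): 0.6, (1, 4): 0.1}
total, bonds = best_connectivity([1, 2, 3, 4], example_scores)
print(bonds)
```

With these made-up scores, greedily taking the single best pair (2, 3) first would force the weak pair (1, 4) and give a lower total than pairing (1, 2) with (3, 4). That is the reason for scoring whole connectivity patterns with a matching algorithm rather than picking pairs one at a time.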