Bagos Pantelis G, Liakopoulos Theodore D, Hamodrakas Stavros J
Department of Cell Biology and Biophysics, Faculty of Biology, University of Athens, Panepistimiopolis, Athens 15701, Greece.
BMC Bioinformatics. 2005 Jan 12;6:7. doi: 10.1186/1471-2105-6-7.
Prediction of the transmembrane strands and topology of beta-barrel outer membrane proteins is of interest in current bioinformatics research. Several methods based on different algorithmic techniques have been applied to this task, and a number of freely available predictors exist. The methods can be broadly divided into those based on Hidden Markov Models (HMMs), on Neural Networks (NNs) and on Support Vector Machines (SVMs). In this work, we compare the available methods for topology prediction of beta-barrel outer membrane proteins. We evaluate their performance on a non-redundant dataset of 20 beta-barrel outer membrane proteins of Gram-negative bacteria with structures known at atomic resolution. We also describe, for the first time, an effective way to combine any chosen set of the individual predictors into a single consensus prediction method.
We assess the statistical significance of the performance of each prediction scheme and conclude that the Hidden Markov Model based methods, HMM-B2TMR, ProfTMB and PRED-TMBB, are currently the best predictors, whether judged by per-residue accuracy, by the segments overlap measure (SOV) or by the total number of proteins with correctly predicted topologies in the test set. Furthermore, we show that the available predictors perform better when only the transmembrane beta-barrel domains are used for prediction rather than the full-length precursor sequences, although the HMM-based predictors are not significantly affected. The consensus prediction method performs significantly better than each individual available predictor, increasing accuracy by up to 4% in SOV and by up to 15% in correctly predicted topologies.
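As an illustration of the simplest of the measures named above, per-residue accuracy can be computed as the fraction of residues whose predicted state matches the observed one. The sketch below is not taken from the paper: the three-letter alphabet (`i` for periplasmic loop, `M` for transmembrane strand, `o` for extracellular loop) and the function name are assumptions made for the example.

```python
def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of positions where the predicted label equals the observed label.

    Both arguments are label strings of equal length over an assumed
    alphabet: 'i' (periplasmic loop), 'M' (strand), 'o' (extracellular loop).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)
```

SOV and the count of correctly predicted topologies are segment-level measures and need more machinery than this per-position comparison.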
The consensus prediction method described in this work optimizes the predicted topology with a dynamic programming algorithm and is implemented in a web-based application freely available to non-commercial users at http://bioinformatics.biol.uoa.gr/ConBBPRED.
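The general idea of a consensus optimized by dynamic programming can be sketched as follows. This is a minimal illustration under stated assumptions, not the ConBBPRED algorithm itself: it assumes each predictor outputs a label string over `i`, `M`, `o`, scores each position by unweighted votes, and uses a Viterbi-style recursion over a hypothetical four-state grammar that forces every strand to cross the membrane (loops alternate between the periplasmic and extracellular sides). All names are chosen for the example.

```python
from collections import Counter

# Internal states: periplasmic loop, strand running in->out,
# extracellular loop, strand running out->in (assumed grammar).
STATES = ("i", "U", "o", "D")
# Label that each internal state emits in the three-letter alphabet.
EMIT = {"i": "i", "U": "M", "o": "o", "D": "M"}
# Allowed predecessors of each state; a strand must end on the opposite side.
PREV = {"i": ("i", "D"), "U": ("U", "i"), "o": ("o", "U"), "D": ("D", "o")}


def consensus_topology(predictions):
    """Return the grammatical labelling that agrees with the most votes."""
    if not predictions or not predictions[0]:
        raise ValueError("need at least one non-empty prediction")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all predictions must have the same length")

    # votes[t][label] = number of predictors assigning `label` at position t
    votes = [Counter(p[t] for p in predictions) for t in range(n)]

    # score[t][s] = best total vote count of any valid path ending in s at t
    score = [{s: votes[0][EMIT[s]] for s in STATES}]
    back = [{s: None for s in STATES}]
    for t in range(1, n):
        row, ptr = {}, {}
        for s in STATES:
            best_prev = max(PREV[s], key=lambda p: score[t - 1][p])
            row[s] = score[t - 1][best_prev] + votes[t][EMIT[s]]
            ptr[s] = best_prev
        score.append(row)
        back.append(ptr)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[n - 1][s])
    path = [state]
    for t in range(n - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return "".join(EMIT[s] for s in reversed(path))
```

For example, `consensus_topology(["iiMMMoooMMMii", "iiMMMooiMMMii", "iiMMMoooMMMii"])` resolves the single disagreement in favour of the majority. The sketch enforces no minimum strand or loop length and weights all predictors equally; a practical method would likely need both refinements.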