Roy Choudhury Amrita, Novič Marjana
Laboratory of Chemometrics, National Institute of Chemistry, Ljubljana, Slovenia.
PLoS One. 2015 Dec 22;10(12):e0145564. doi: 10.1371/journal.pone.0145564. eCollection 2015.
Predicting transmembrane regions is an important aspect of understanding the structure and architecture of different β-barrel membrane proteins. Despite significant efforts, currently available β-transmembrane region predictors are still limited in prediction accuracy, especially in precision. Here, we describe PredβTM, a transmembrane region prediction algorithm for β-barrel proteins. Using amino acid pair frequency information from known β-transmembrane protein sequences, we trained a support vector machine classifier to predict β-transmembrane segments. Position-specific amino acid preference data are incorporated in the final prediction. The predictor does not incorporate evolutionary profile information explicitly, but is based on sequence patterns generated implicitly by encoding the protein segments using an amino acid adjacency matrix. On a benchmark set of 35 β-transmembrane proteins, PredβTM shows a sensitivity of 83.71% and a precision of 72.98%. The segment overlap score is 82.19%. In comparison with other state-of-the-art methods, PredβTM provides higher precision and segment overlap without compromising sensitivity. Further, we applied PredβTM to β-barrel membrane proteins without defined transmembrane regions and to the uncharacterized protein sequences in eight bacterial genomes in order to predict possible β-transmembrane proteins. PredβTM can be freely accessed on the web at http://transpred.ki.si/.
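As an illustration of the adjacency-matrix encoding mentioned in the abstract, the following minimal Python sketch counts adjacent residue pairs in a sequence segment and flattens the resulting 20 × 20 matrix into a 400-element pair-frequency vector of the kind that could be passed to a support vector machine. The abstract does not give the window length, the normalization, or the treatment of non-standard residues, so those choices, together with the names `adjacency_matrix` and `pair_frequency_vector` and the example segment, are assumptions made for illustration and are not the published PredβTM implementation.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def adjacency_matrix(segment: str) -> List[List[int]]:
    """Count ordered adjacent residue pairs (i, i+1) as a 20x20 matrix."""
    matrix = [[0] * len(AMINO_ACIDS) for _ in AMINO_ACIDS]
    for first, second in zip(segment, segment[1:]):
        # Assumption: pairs containing non-standard symbols (e.g. 'X') are skipped.
        if first in INDEX and second in INDEX:
            matrix[INDEX[first]][INDEX[second]] += 1
    return matrix


def pair_frequency_vector(segment: str) -> List[float]:
    """Flatten the matrix row by row into 400 pair frequencies."""
    matrix = adjacency_matrix(segment)
    total = sum(sum(row) for row in matrix)
    flat = [count for row in matrix for count in row]
    # Assumption: counts are normalized by the number of counted pairs.
    return [count / total if total else 0.0 for count in flat]


# Hypothetical six-residue segment, chosen only to show the encoding.
features = pair_frequency_vector("AVAGAV")
```

Each position in `features` corresponds to one ordered residue pair, so the pair A followed by V is stored at index `INDEX["A"] * 20 + INDEX["V"]`. Segments encoded this way all have the same length regardless of the window size, which is what allows them to be used directly as classifier input.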