Pierleoni Andrea, Martelli Pier Luigi, Casadio Rita
Biocomputing Group, Department of Biology, University of Bologna, Via Irnerio 42, 40126 Bologna, Italy.
BMC Bioinformatics. 2008 Sep 23;9:392. doi: 10.1186/1471-2105-9-392.
Several eukaryotic proteins associated with the extracellular leaflet of the plasma membrane carry a glycosylphosphatidylinositol (GPI) anchor, which is linked to the C-terminal residue after a proteolytic cleavage occurring at the so-called omega-site. Computational methods have been developed to identify, from the amino acid sequence alone, proteins that undergo this post-translational modification. However, more accurate methods are needed for reliable annotation of whole proteomes.
Here we present PredGPI, a prediction method that couples a Hidden Markov Model (HMM) with a Support Vector Machine (SVM) to efficiently predict both the presence of the GPI anchor and the position of the omega-site. PredGPI is trained on a non-redundant dataset of experimentally characterized GPI-anchored proteins whose annotation was carefully checked in the literature.
PredGPI outperforms all previously described methods and correctly reproduces the results of previously published high-throughput experiments. PredGPI has a lower false-positive rate than the other available methods and is therefore a free, rapid and accurate method for screening whole proteomes.
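The two-stage idea described above (one component proposes an omega-site, a second decides whether the protein is anchored) can be illustrated with a deliberately simplified sketch. This is not the PredGPI model: the HMM is replaced by a hand-written scan of the C-terminal region, the SVM by a fixed threshold, and every name, residue set and parameter value below (`SMALL_RESIDUES`, `HYDROPHOBIC`, `window`, `spacer`, `threshold`) is an assumption made purely for illustration.

```python
# Toy stand-in for a two-stage GPI-anchor predictor. NOT the PredGPI model:
# all residue sets and parameter values below are illustrative assumptions.

SMALL_RESIDUES = set("GASNDC")   # assumed residues allowed at the omega-site
HYDROPHOBIC = set("AILMFVWC")    # assumed crude hydrophobic alphabet


def hydrophobic_fraction(segment):
    """Fraction of residues in `segment` that are in HYDROPHOBIC."""
    if not segment:
        return 0.0
    return sum(res in HYDROPHOBIC for res in segment) / len(segment)


def locate_omega(seq, window=40, spacer=8):
    """Stage 1 (stand-in for the HMM): scan the last `window` residues for a
    small residue followed by a spacer and a hydrophobic C-terminal tail.
    Returns (0-based index or None, score in [0, 1])."""
    start = max(0, len(seq) - window)
    best_pos, best_score = None, 0.0
    for i in range(start, len(seq) - spacer):
        if seq[i] not in SMALL_RESIDUES:
            continue
        score = hydrophobic_fraction(seq[i + 1 + spacer:])
        if best_pos is None or score > best_score:
            best_pos, best_score = i, score
    return best_pos, best_score


def predict(seq, threshold=0.6):
    """Stage 2 (stand-in for the SVM): accept the candidate site only if its
    score reaches `threshold`."""
    pos, score = locate_omega(seq)
    anchored = pos is not None and score >= threshold
    return {
        "anchored": anchored,
        "omega_index": pos if anchored else None,
        "score": score,
    }


if __name__ == "__main__":
    # Synthetic sequences, not real proteins.
    print(predict("K" * 30 + "S" + "K" * 8 + "L" * 15))
    print(predict("K" * 60))
```

Keeping the site locator and the anchored/not-anchored decision in separate functions mirrors the division of labour in the abstract, so either stage could be swapped for a trained model without changing the other.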