Gao Jianzhao, Cui Wei, Sheng Yajun, Ruan Jishou, Kurgan Lukasz
School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People's Republic of China.
Department of Statistics, University of California Riverside, Riverside, California, United States of America.
PLoS One. 2016 Apr 4;11(4):e0152964. doi: 10.1371/journal.pone.0152964. eCollection 2016.
Ion channels are a class of membrane proteins that are the subject of extensive basic research and are potential drug targets. High-throughput identification of these channels is hampered by the scarcity of their solved structures and by the observation that sequence similarity offers limited predictive quality. Consequently, several machine learning predictors that identify ion channels from protein sequences without relying on high sequence similarity have been developed. However, only one of these methods offers a wide scope, predicting ion channels, their types, and the four major subtypes of voltage-gated channels. Moreover, this and other existing predictors use relatively simple predictive models, which limits their accuracy. We propose PSIONplus, a novel and accurate predictor of ion channels, their types, and the four subtypes of voltage-gated channels. Our method combines a support vector machine model with a sequence similarity search using BLAST. The originality of PSIONplus stems from a more sophisticated machine learning model that, for the first time in this area, uses evolutionary profiles together with predicted secondary structure, solvent accessibility, and intrinsic disorder. We empirically demonstrate that evolutionary profiles provide the strongest predictive input among the new and previously used input types. We also show that all of the new input types contribute to the prediction. Results on an independent test dataset show that PSIONplus obtains relatively good predictive performance and outperforms existing methods. It achieves accuracies of 85.4% and 68.3% for the prediction of ion channels and their types, respectively, and an average accuracy of 96.4% for discriminating the four voltage-gated channel subtypes. A standalone version of PSIONplus is freely available from https://sourceforge.net/projects/psion/.
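The abstract states that PSIONplus combines a support vector machine model with a BLAST similarity search, but it does not say how the two outputs are merged. The following is a minimal sketch of one plausible rule, not the authors' implementation: it assumes the label of a sufficiently significant similarity hit takes precedence and the SVM label is used otherwise. The names `BlastHit` and `combine_predictions` and the E-value cutoff of 1e-3 are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlastHit:
    """Best hit from a similarity search (hypothetical container, not a BLAST API)."""
    label: str      # annotation of the matched, already-labeled sequence
    e_value: float  # smaller values indicate a more significant match


def combine_predictions(svm_label: str,
                        best_hit: Optional[BlastHit],
                        e_value_cutoff: float = 1e-3) -> str:
    """Merge an SVM prediction with a similarity-search result.

    Assumed rule (not taken from the paper): trust the hit's label when the
    match is significant, and fall back to the SVM label otherwise.
    """
    if best_hit is not None and best_hit.e_value <= e_value_cutoff:
        return best_hit.label
    return svm_label


if __name__ == "__main__":
    # A strong hit overrides the SVM; a weak hit or no hit does not.
    print(combine_predictions("non-channel", BlastHit("ion channel", 1e-20)))
    print(combine_predictions("non-channel", BlastHit("ion channel", 0.5)))
    print(combine_predictions("ion channel", None))
```

Under this assumed rule, the similarity search handles query sequences that closely resemble annotated ones, while the machine learning model covers the low-similarity cases that motivate the method.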