Yang Jing, Shen Hong-Bin
Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China.
Bioinformatics. 2018 Jan 15;34(2):230-238. doi: 10.1093/bioinformatics/btx593.
Inter-residue contacts in proteins have been widely acknowledged to be valuable for protein 3D structure prediction. Accurate prediction of long-range transmembrane inter-helix residue contacts can significantly improve the quality of simulated membrane protein models.
In this paper, we present an updated MemBrain predictor for transmembrane protein residue contacts. The new model benefits from an efficient learning algorithm that can mine latent structural features present in the original feature space. The new MemBrain is a two-stage inter-helix contact predictor. The first stage takes sequence-based features as input and outputs a coarse contact probability for each residue pair. In the second stage, these probabilities are fed into a convolutional neural network together with predictions from three direct-coupling analysis approaches. Experimental results on the training dataset show that our method achieves an average accuracy of 81.6% for the top L/5 predictions under a strict sequence-based jackknife cross-validation. On the test dataset, MemBrain achieves a prediction accuracy of 79.4%. Moreover, for the top L/5 predicted long-range loop contacts, the accuracy reaches 56.4%. These results demonstrate that the new MemBrain is promising for transmembrane protein contact map prediction.
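The top L/5 accuracy reported above can be illustrated with a minimal sketch. It is not the MemBrain evaluation code: the function name `top_l_accuracy`, the parameters `divisor` and `min_sep`, and the choice of taking L as the side length of the score matrix are all assumptions made for illustration. The abstract does not state how L, the minimum sequence separation, or the contact definition are set.

```python
import numpy as np


def top_l_accuracy(prob, true_contacts, divisor=5, min_sep=6):
    """Fraction of the top L/divisor scored residue pairs that are true contacts.

    prob          : (L, L) array of predicted contact probabilities.
    true_contacts : (L, L) boolean array of observed contacts.
    divisor       : 5 gives the "top L/5" setting (hypothetical parameter name).
    min_sep       : minimum sequence separation |i - j|; the default of 6 is an
                    assumption, not a value taken from the abstract.
    """
    prob = np.asarray(prob, dtype=float)
    true_contacts = np.asarray(true_contacts, dtype=bool)
    if prob.shape != true_contacts.shape or prob.ndim != 2:
        raise ValueError("prob and true_contacts must be matching 2-D arrays")

    length = prob.shape[0]
    # Upper-triangle pairs (i, j) with j - i >= min_sep, so each pair counts once.
    rows, cols = np.triu_indices(length, k=min_sep)
    if rows.size == 0:
        raise ValueError("no residue pairs satisfy the separation constraint")

    scores = prob[rows, cols]
    n_top = min(max(1, length // divisor), scores.size)
    # Indices of the n_top highest-scoring pairs (stable sort for reproducible ties).
    best = np.argsort(-scores, kind="stable")[:n_top]
    return float(true_contacts[rows[best], cols[best]].mean())


if __name__ == "__main__":
    # Toy example with made-up numbers: 10 residues, so the top L/5 is 2 pairs.
    p = np.zeros((10, 10))
    t = np.zeros((10, 10), dtype=bool)
    p[0, 5], p[1, 7] = 0.9, 0.8
    t[0, 5] = True
    print(top_l_accuracy(p, t, divisor=5, min_sep=2))
```

In this toy case, one of the two highest-scoring pairs is a true contact, so the function returns 0.5. For an inter-helix evaluation, the candidate pairs would additionally be restricted to residues on different transmembrane helices, which this sketch does not do.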
http://www.csbio.sjtu.edu.cn/bioinf/MemBrain/.
Supplementary data are available at Bioinformatics online.