Predicting the errors of predicted local backbone angles and non-local solvent-accessibilities of proteins by deep neural networks.

Affiliations

School of Mathematical Sciences and LPMC, Nankai University, Tianjin, People's Republic of China.

Institute for Glycomics and School of Information and Communication Technology, Griffith University, Parklands Dr, Southport, QLD 4222, Australia.

Publication information

Bioinformatics. 2016 Dec 15;32(24):3768-3773. doi: 10.1093/bioinformatics/btw549. Epub 2016 Aug 22.

DOI:10.1093/bioinformatics/btw549

PMID:27551104

Abstract

MOTIVATION

Prediction of protein backbone structures and solvent-accessible surface area benefits from continuous real-valued output because it removes the arbitrariness of defining boundaries between different secondary-structure and solvent-accessibility states. However, the lack of confidence scores for the predicted values has limited their application. Here we investigated whether absolute errors can be reasonably predicted for predicted backbone torsion angles, Cα-atom-based angles and torsion angles, solvent accessibility, contact numbers and half-sphere exposures by employing deep neural networks.

RESULTS

We found that angle-based errors can be predicted most accurately, with a Spearman correlation coefficient (SPC) between predicted and actual errors of about 0.6. This is followed by solvent accessibility (SPC∼0.5). The errors on contact-based structural properties are the most difficult to predict (SPC between 0.2 and 0.3). We showed that predicted errors are significantly better error indicators than the average errors based on secondary-structure and amino-acid residue types. We further demonstrated the usefulness of predicted errors in model quality assessment. These error or confidence indicators are expected to be useful for prediction, assessment, and refinement of protein structures.
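As an illustration of the evaluation metric described above, the following is a minimal sketch of how an SPC between predicted and actual absolute errors could be computed for an angle. It is not the SPIDER2 code: the function names, the wrap-around treatment of angle differences, and the toy values are all assumptions made for this example.

```python
import math

def angular_abs_error(pred_deg, true_deg):
    """Absolute difference between two angles in degrees, wrapped to [0, 180].

    Assumption: angles are periodic with period 360 degrees.
    """
    d = abs(pred_deg - true_deg) % 360.0
    return min(d, 360.0 - d)

def average_ranks(values):
    """Return ranks 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# Hypothetical toy data for four residues (not taken from the paper).
pred_angles = [-60.0, -120.0, 170.0, 60.0]    # predicted angle values
true_angles = [-65.0, -90.0, -175.0, -60.0]   # native angle values
predicted_errors = [8.0, 25.0, 30.0, 70.0]    # error-predictor output

actual_errors = [angular_abs_error(p, t) for p, t in zip(pred_angles, true_angles)]
spc = spearman(predicted_errors, actual_errors)
print("actual errors:", actual_errors)
print("SPC:", spc)
```

A rank-based coefficient is a natural fit here because a confidence score only needs to order residues from reliable to unreliable; it does not have to reproduce the exact magnitude of each error.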

AVAILABILITY AND IMPLEMENTATION

The method is available at http://sparks-lab.org as a part of SPIDER2 package.

CONTACT

yuedong.yang@griffith.edu.au or yaoqi.zhou@griffith.edu.au

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
