Signal Processing Laboratory, Griffith University, Nathan, Australia.
Institute for Glycomics, Griffith University, Gold Coast, Australia.
Proteins. 2018 Jun;86(6):629-633. doi: 10.1002/prot.25489. Epub 2018 Mar 25.
Designing protein sequences that fold into a given structure is the well-known inverse protein-folding problem. An important capability for a protein design program is recovering wild-type sequences from their native backbone structures. The highest average sequence identity achieved by current protein-design programs on this task is around 30%, obtained by our previous system, SPIN. SPIN predicts sequences compatible with a provided structure using a neural network with fragment-based local and energy-based nonlocal profiles. Our new model, SPIN2, improves on SPIN by using a deep neural network and additional structural features. SPIN2 achieves over 34% sequence recovery in 10-fold cross-validation and independent tests, an improvement of about 4 percentage points over the previous version. The sequence profiles generated by SPIN2 are expected to be useful for improving existing fold recognition and protein design techniques. SPIN2 is available at http://sparks-lab.org.
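To make the sequence recovery figures above concrete, the following is a minimal sketch of how such a metric could be computed. It assumes recovery means position-wise identity between a designed sequence and the native sequence of the same length, averaged per protein; the abstract does not spell out the exact definition used for SPIN2. The function names `sequence_recovery` and `average_recovery` and the toy sequences are hypothetical and are not taken from the SPIN2 software.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one.

    Assumption: both sequences cover the same backbone, so they have equal
    length and can be compared position by position without alignment.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


def average_recovery(pairs):
    """Mean of per-protein recovery over (native, designed) pairs.

    Assumption: each protein contributes equally, regardless of its length.
    """
    scores = [sequence_recovery(native, designed) for native, designed in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy sequences for illustration only, not real proteins.
    print(sequence_recovery("ACDEFGHIKL", "ACDAAGHAAA"))
    print(average_recovery([("AAAA", "AAAT"), ("GG", "GC")]))
```

Under this definition, a reported recovery of 34% would mean that, on average, roughly one in three designed positions carries the wild-type residue. Averaging per residue across all proteins instead of per protein is an equally plausible convention and would weight long proteins more heavily.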