School of Mathematical Sciences, Nankai University, 300071 Tianjin, China.
Department of Biochemistry, University of Washington, Seattle, WA 98105.
Proc Natl Acad Sci U S A. 2020 Jan 21;117(3):1496-1503. doi: 10.1073/pnas.1914677117. Epub 2020 Jan 2.
The prediction of interresidue contacts and distances from coevolutionary data using deep learning has considerably advanced protein structure prediction. Here, we build on these advances by developing a deep residual network for predicting interresidue orientations, in addition to distances, and a Rosetta-constrained energy-minimization protocol for rapidly and accurately generating structure models guided by these restraints. In benchmark tests on sets derived from the 13th Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13) and from Continuous Automated Model Evaluation (CAMEO), the method outperforms all previously described structure-prediction methods. Although trained entirely on native proteins, the network consistently assigns higher probability to de novo-designed proteins, identifying the key fold-determining residues and providing an independent quantitative measure of the "ideality" of a protein structure. The method promises to be useful for a broad range of protein structure prediction and design problems.
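To make concrete what an interresidue distance and an interresidue orientation are as geometric quantities, the following minimal sketch computes both from atomic coordinates. The abstract does not specify which atoms or angles the network predicts, so the choices here are assumptions made for illustration only: a distance between Cβ atoms and a dihedral angle along the Cα-Cβ-Cβ-Cα path. The function names and the coordinates are hypothetical.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(p, q):
    """Euclidean distance between two points."""
    d = sub(p, q)
    return math.sqrt(dot(d, d))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in radians about the p1-p2 axis."""
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    b1_len = math.sqrt(dot(b1, b1))
    axis = tuple(x / b1_len for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the axis.
    v = sub(b0, tuple(dot(b0, axis) * x for x in axis))
    w = sub(b2, tuple(dot(b2, axis) * x for x in axis))
    return math.atan2(dot(cross(axis, v), w), dot(v, w))

# Hypothetical coordinates (in angstroms) for two residues i and j.
# Assumption: the orientation is described by the Ca_i-Cb_i-Cb_j-Ca_j dihedral.
ca_i, cb_i = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)
cb_j, ca_j = (0.0, 0.0, 6.0), (0.0, 1.0, 6.0)

d_ij = distance(cb_i, cb_j)                    # interresidue distance
omega_ij = dihedral(ca_i, cb_i, cb_j, ca_j)    # interresidue orientation
print(f"distance = {d_ij:.2f} A, dihedral = {math.degrees(omega_ij):.1f} deg")
```

A distance alone is unchanged when one residue is rotated about the Cβ-Cβ axis, whereas the dihedral changes; this is the kind of additional information that orientation restraints contribute alongside distance restraints.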