School of Mathematical Sciences, Nankai University, Tianjin, China.
Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China.
Nat Protoc. 2021 Dec;16(12):5634-5651. doi: 10.1038/s41596-021-00628-9. Epub 2021 Nov 10.
The trRosetta (transform-restrained Rosetta) server is a web-based platform for fast and accurate protein structure prediction, powered by deep learning and Rosetta. Given a protein's amino acid sequence as input, a deep neural network first predicts the inter-residue geometries, including distances and orientations. The predicted geometries are then converted into restraints that guide structure prediction by direct energy minimization, which is implemented within the Rosetta framework. The trRosetta server distinguishes itself from similar structure prediction servers by its rapid and accurate de novo structure prediction. As an illustration, trRosetta was applied to two Pfam families with unknown structures, for which the predicted de novo models were estimated to have high accuracy. In addition, to take advantage of homology modeling, homologous templates are automatically supplied to the network as additional inputs. In general, it takes ~1 h to predict the final structure of a typical protein with ~300 amino acids, using a maximum of 10 CPU cores in parallel on our cluster system. To enable large-scale structure modeling, a downloadable trRosetta package with open-source code is also available, and detailed guidance for using the package is provided in this protocol. The server and the package are available at https://yanglab.nankai.edu.cn/trRosetta/ and https://yanglab.nankai.edu.cn/trRosetta/download/, respectively.
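The step in which predicted geometries become restraints can be illustrated with a minimal sketch. The abstract does not state the functional form of the restraints, so the code below assumes a simple negative log-probability energy over distance bins; the function names, bin centers and probabilities are hypothetical and are not taken from the trRosetta package, which performs the actual minimization within Rosetta.

```python
import math


def restraint_energies(probs, eps=1e-6):
    """Convert a binned distance distribution into per-bin energies.

    Assumption: energy = -ln(p). The real restraint form used by the
    server is not given in the abstract.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Normalize first so the input need not sum exactly to 1;
    # eps keeps the logarithm finite for empty bins.
    return [-math.log(p / total + eps) for p in probs]


def preferred_distance(bin_centers, probs):
    """Return the bin center with the lowest restraint energy."""
    if len(bin_centers) != len(probs):
        raise ValueError("bin_centers and probs must have the same length")
    energies = restraint_energies(probs)
    best = min(range(len(energies)), key=energies.__getitem__)
    return bin_centers[best]


# Hypothetical prediction for one residue pair over four distance bins (in angstroms).
bin_centers = [4.0, 6.0, 8.0, 10.0]
probs = [0.05, 0.70, 0.20, 0.05]

energies = restraint_energies(probs)
best = preferred_distance(bin_centers, probs)
```

In this toy case the most probable bin has the lowest energy, so a minimizer acting on such a restraint would pull the residue pair toward that distance. A full pipeline would combine restraints of this kind for all residue pairs, together with orientation terms, before minimizing.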