Science for Life Laboratory, Stockholm University, Solna SE-171 21, Sweden.
Department of Biochemistry and Biophysics, Stockholm University, Stockholm SE-106 91, Sweden.
Bioinformatics. 2021 Nov 5;37(21):3959-3960. doi: 10.1093/bioinformatics/btab353.
Predicting contacts within a protein has recently become a viable route to accurate protein structure prediction. In many cases, using predicted distance distributions has been shown to be superior to using only a binary contact annotation. Predicted interprotein distances have also been shown to be sufficient to dock some protein dimers.
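To illustrate the difference between a binary contact annotation and a predicted distance distribution, the following minimal sketch collapses a binned distance distribution for one residue pair into a single contact probability and, for comparison, an expected distance. The function names, the bin layout and the 8 Å contact threshold are illustrative assumptions and are not taken from pyconsFold.

```python
from typing import Sequence


def contact_probability(bin_edges: Sequence[float],
                        probs: Sequence[float],
                        threshold: float = 8.0) -> float:
    """Probability mass of all bins whose upper edge is <= threshold.

    bin_edges has one more entry than probs (hypothetical layout, in Angstrom).
    The 8 A threshold is an assumed, commonly used contact cutoff.
    """
    if len(bin_edges) != len(probs) + 1:
        raise ValueError("bin_edges must have len(probs) + 1 entries")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    below = sum(p for upper, p in zip(bin_edges[1:], probs) if upper <= threshold)
    return below / total


def expected_distance(bin_edges: Sequence[float],
                      probs: Sequence[float]) -> float:
    """Mean distance using bin midpoints, weighted by normalised probability."""
    if len(bin_edges) != len(probs) + 1:
        raise ValueError("bin_edges must have len(probs) + 1 entries")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    mids = [(lo + hi) / 2 for lo, hi in zip(bin_edges, bin_edges[1:])]
    return sum(m * p for m, p in zip(mids, probs)) / total


# Made-up distribution for a single residue pair (not real prediction output).
edges = [4.0, 6.0, 8.0, 10.0, 12.0]
probs = [0.1, 0.5, 0.3, 0.1]
p_contact = contact_probability(edges, probs)
mean_dist = expected_distance(edges, probs)
```

A binary annotation would keep only whether `p_contact` exceeds some cutoff, whereas the full distribution also says how far apart the residues are likely to be, which is the extra information a distance-based restraint can exploit.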
Here, we present pyconsFold. Using CNS as its underlying folding mechanism together with predicted contact distances, it outperforms regular contact prediction-based modeling on our dataset of 210 proteins. It performs marginally worse than the state-of-the-art pyRosetta folding pipeline but is on average about 20 times faster per model. More importantly, pyconsFold can also be used as a fold-and-dock protocol, using predicted interprotein contacts/distances to simultaneously fold and dock two protein chains.
pyconsFold is implemented in Python 3 with a strong focus on using as few dependencies as possible for longevity. It is available both as a pip package for Python 3 and as source code on GitHub, and is published under the GPLv3 license. The data underlying this article, together with the source code, are available on GitHub at https://github.com/johnlamb/pyconsfold.
Supplementary data are available at Bioinformatics online.