Hallen Mark A, Donald Bruce R
Department of Computer Science, Duke University, Durham, NC, USA.
Toyota Technological Institute at Chicago, Chicago, IL, USA.
Bioinformatics. 2017 Jul 15;33(14):i5-i12. doi: 10.1093/bioinformatics/btx277.
When proteins mutate or bind to ligands, their backbones often move significantly, especially in loop regions. Computational protein design algorithms must model these motions to optimize protein stability and binding affinity accurately. However, methods for backbone conformational search in design have been much more limited than those for sidechain conformational search. This is especially true for combinatorial protein design algorithms, which aim to search a large sequence space efficiently and thus cannot rely on temporal simulation of each candidate sequence.
We alleviate this difficulty with a new parameterization of backbone conformational space, which represents all degrees of freedom of a specified segment of protein chain that maintain valid bonding geometry (that is, that preserve the original bond lengths, bond angles, and ω dihedrals). To search this space, we present an efficient algorithm, CATS, for computing atomic coordinates as a function of our new continuous backbone internal coordinates. CATS generalizes the iMinDEE and EPIC protein design algorithms, which model continuous flexibility in sidechain dihedrals, to also model continuous, appropriately localized flexibility in the backbone dihedrals ϕ and ψ. Using 81 test cases based on 29 different protein structures, we show that CATS finds sequences and conformations that are significantly lower in energy than those found by methods with less or no backbone flexibility. In particular, we show that CATS can model the viability of an antibody mutation that is known experimentally to increase affinity but that appears sterically infeasible when modeled with less or no backbone flexibility.
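To make the idea of computing atomic coordinates from internal coordinates concrete, the sketch below places one atom from a bond length, a bond angle, and a dihedral, given the three preceding atoms. This is the generic textbook internal-to-Cartesian construction, not the CATS algorithm or its parameterization, and it does not reproduce the localization of backbone motion that CATS provides. The names `place_atom` and `measure_dihedral`, and the coordinates and geometry values in the usage lines, are hypothetical and chosen only for illustration.

```python
import math

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    norm = math.sqrt(dot(u, u))
    return (u[0] / norm, u[1] / norm, u[2] / norm)

def place_atom(a, b, c, length, angle, dihedral):
    """Return atom d such that |c-d| = length, angle(b, c, d) = angle,
    and dihedral(a, b, c, d) = dihedral (angles in radians).
    Assumes a, b, c are not collinear."""
    bc = unit(sub(c, b))               # axis along the b -> c bond
    n = unit(cross(sub(b, a), bc))     # normal to the a-b-c plane
    m = cross(n, bc)                   # completes the local frame
    x = -length * math.cos(angle)
    y = length * math.sin(angle) * math.cos(dihedral)
    z = length * math.sin(angle) * math.sin(dihedral)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))

def measure_dihedral(p0, p1, p2, p3):
    """Signed dihedral p0-p1-p2-p3 in radians, used here as a check."""
    b0 = sub(p0, p1)
    b1 = unit(sub(p2, p1))
    b2 = sub(p3, p2)
    # Remove the components along the central bond before comparing.
    v = sub(b0, tuple(dot(b0, b1) * t for t in b1))
    w = sub(b2, tuple(dot(b2, b1) * t for t in b1))
    return math.atan2(dot(cross(b1, v), w), dot(v, w))

# Illustrative (hypothetical) starting atoms and geometry.
a, b, c = (1.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.5, 0.0, 0.0)
length, angle = 1.33, math.radians(116.0)

# Varying only the dihedral moves d while length and angle stay fixed.
d1 = place_atom(a, b, c, length, angle, math.radians(-60.0))
d2 = place_atom(a, b, c, length, angle, math.radians(120.0))
```

Applying `place_atom` repeatedly along a chain rebuilds coordinates from a list of dihedrals while leaving bond lengths and bond angles untouched, which is the sense in which a dihedral-only parameterization maintains bonding geometry. Built this way, however, a change in one dihedral propagates to every downstream atom, which is why a method that keeps flexibility localized to a chosen segment needs a different construction.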
Our code is available as free software at https://github.com/donaldlab/OSPREY_refactor .
mhallen@ttic.edu or brd+ismb17@cs.duke.edu.
Supplementary data are available at Bioinformatics online.