Emmes Corporation, Rockville, MD, USA.
Department of Biology, Rensselaer Polytechnic Institute, Troy, NY, USA.
BMC Bioinformatics. 2018 Sep 24;19(1):337. doi: 10.1186/s12859-018-2345-5.
With increasing interest in ab initio protein design, there is a need to fully explore the design space of insertions and deletions. Nature inserts and deletes residues to optimize energy and function, but allowing variable-length indels in an interactive protein design session presents challenges of speed and accuracy.
Here we present a new module (INDEL) for InteractiveRosetta which allows the user to specify a range of lengths for a desired indel, and which returns a set of low-energy backbones in a matter of seconds. To make the loop search fast, loop anchor points are geometrically hashed using Cα-Cα and Cβ-Cβ distances, and the hash is mapped to start and end points in a pre-compiled random access file of non-redundant protein backbone coordinates. Loops with superposable anchors are filtered for collisions and returned to InteractiveRosetta as poly-alanine for display and selective incorporation into the design template. Sidechains can then be added using RosettaDesign tools.
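The anchor hashing described above can be illustrated with a minimal sketch. It is not the INDEL implementation: the 0.5 Å bin width, the in-memory dictionary standing in for the random access file, and the names `anchor_key`, `build_index`, and `find_loops` are all assumptions made for illustration. The idea shown is only that two anchor residues are reduced to a pair of binned distances (Cα-Cα and Cβ-Cβ), and that this key retrieves candidate start and end positions in a backbone database.

```python
import math
from collections import defaultdict

BIN_WIDTH = 0.5  # Angstroms; assumed value, not taken from the paper


def anchor_key(ca_start, cb_start, ca_end, cb_end, bin_width=BIN_WIDTH):
    """Reduce a pair of anchor residues to a (Ca-Ca bin, Cb-Cb bin) key."""
    ca_bin = int(math.dist(ca_start, ca_end) // bin_width)
    cb_bin = int(math.dist(cb_start, cb_end) // bin_width)
    return (ca_bin, cb_bin)


def build_index(fragments, bin_width=BIN_WIDTH):
    """Map each key to (start, end) anchor positions in the database.

    `fragments` is a hypothetical list of dicts holding the anchor atom
    coordinates and the anchors' residue indices in the database.
    """
    index = defaultdict(list)
    for frag in fragments:
        key = anchor_key(frag["ca_start"], frag["cb_start"],
                         frag["ca_end"], frag["cb_end"], bin_width)
        index[key].append((frag["start"], frag["end"]))
    return index


def find_loops(index, ca_start, cb_start, ca_end, cb_end,
               min_len, max_len, bin_width=BIN_WIDTH):
    """Return database (start, end) pairs whose anchors fall in the same
    bin as the query and whose loop length lies in [min_len, max_len].

    Loop length is counted as the residues strictly between the anchors.
    """
    key = anchor_key(ca_start, cb_start, ca_end, cb_end, bin_width)
    return [(s, e) for s, e in index.get(key, [])
            if min_len <= e - s - 1 <= max_len]
```

Because distances are invariant under rotation and translation, the query anchors need not share a coordinate frame with the database. A fuller version would presumably also search neighboring bins, so that near-boundary distances are not missed, and would then superpose and collision-check each hit, as the text describes.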
INDEL was able to find viable loops in 100% of 500 attempts for all lengths from 3 to 20 residues. INDEL has been applied to the task of designing a domain-swapping loop for T7-endonuclease I, changing its specificity from Holliday junctions to paranemic crossover (PX) DNA.