BBK* (Branch and Bound Over K*): A Provable and Efficient Ensemble-Based Protein Design Algorithm to Optimize Stability and Binding Affinity Over Large Sequence Spaces.

Authors

Ojewole Adegoke A, Jou Jonathan D, Fowler Vance G, Donald Bruce R

Affiliations

1 Department of Computer Science, Duke University, Durham, North Carolina.

2 Computational Biology and Bioinformatics Program, Duke University, Durham, North Carolina.

Publication information

J Comput Biol. 2018 Jul;25(7):726-739. doi: 10.1089/cmb.2017.0267. Epub 2018 Mar 13.

DOI:10.1089/cmb.2017.0267

PMID:29641249

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6074059/

Abstract

Computational protein design (CPD) algorithms that compute binding affinity, K*, search for sequences with an energetically favorable free energy of binding. Recent work shows that three principles improve the biological accuracy of CPD: ensemble-based design, continuous flexibility of backbone and side-chain conformations, and provable guarantees of accuracy with respect to the input. However, previous methods that use all three design principles are single-sequence (SS) algorithms, which are very costly: linear in the number of sequences and thus exponential in the number of simultaneously mutable residues. To address this computational challenge, we introduce BBK*, a new CPD algorithm whose key innovation is the multisequence (MS) bound: BBK* efficiently computes a single provable upper bound to approximate K* for a combinatorial number of sequences, and avoids SS computation for all provably suboptimal sequences. Thus, to our knowledge, BBK* is the first provable, ensemble-based CPD algorithm to run in time sublinear in the number of sequences. Computational experiments on 204 protein design problems show that BBK* finds the tightest binding sequences while approximating K* for up to 10-fold fewer sequences than the previous state-of-the-art algorithms, which require exhaustive enumeration of sequences. Furthermore, for 51 protein-ligand design problems, BBK* provably approximates K* up to 1982-fold faster than the previous state-of-the-art iMinDEE/A*/K* algorithm. Therefore, BBK* not only accelerates protein designs that are possible with previous provable algorithms, but also efficiently performs designs that are too large for previous methods.
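The pruning idea in the abstract (one upper bound covering many sequences, so that provably suboptimal sequences are never scored individually) can be illustrated with a minimal best-first branch-and-bound sketch. This is not the BBK* implementation: the additive integer score table `TOY_SCORES`, the function names, and the bound itself are hypothetical stand-ins, whereas the real algorithm bounds K* scores built from partition functions over conformational ensembles.

```python
import heapq
import math

# Hypothetical toy data: one dict per mutable position, mapping each allowed
# amino acid to an additive integer score. Real K* scores are not additive;
# this table only makes the bound easy to compute and check by hand.
TOY_SCORES = [
    {"A": 10, "L": 25, "F": 3},
    {"G": 2, "V": 17, "I": 19},
    {"S": 9, "T": 11, "Y": 30},
    {"D": 4, "E": 6, "K": 22},
]


def upper_bound(prefix, scores):
    """Upper bound on the score of every full sequence that extends `prefix`.

    A partial assignment stands for many sequences at once, so one bound
    covers all of them (a loose analogue of a multisequence bound).
    """
    assigned = sum(scores[i][aa] for i, aa in enumerate(prefix))
    optimistic = sum(max(pos.values()) for pos in scores[len(prefix):])
    return assigned + optimistic


def best_first_search(scores, top_k=1):
    """Return (results, expanded): the top_k sequences in non-increasing
    score order, and the number of partial sequences that were expanded."""
    heap = [(-upper_bound((), scores), ())]  # max-heap via negated bounds
    results, expanded = [], 0
    while heap and len(results) < top_k:
        neg_bound, prefix = heapq.heappop(heap)
        if len(prefix) == len(scores):
            # For a full sequence the bound equals the exact score, and no
            # remaining node can beat it, so it is safe to report.
            results.append(("".join(prefix), -neg_bound))
            continue
        expanded += 1
        for aa in scores[len(prefix)]:
            child = prefix + (aa,)
            heapq.heappush(heap, (-upper_bound(child, scores), child))
    return results, expanded


if __name__ == "__main__":
    total = math.prod(len(pos) for pos in TOY_SCORES)
    results, expanded = best_first_search(TOY_SCORES)
    print(results, f"expanded {expanded} nodes out of {total} sequences")
```

Because the bound never underestimates the best completion of a partial sequence, the first full sequence popped from the queue is optimal, and most of the sequence space is discarded without being scored one sequence at a time.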

Similar articles

BBK* (Branch and Bound Over K*): A Provable and Efficient Ensemble-Based Protein Design Algorithm to Optimize Stability and Binding Affinity Over Large Sequence Spaces.

J Comput Biol. 2018 Jul;25(7):726-739. doi: 10.1089/cmb.2017.0267. Epub 2018 Mar 13.

Minimization-Aware Recursive K*: A Novel, Provable Algorithm that Accelerates Ensemble-Based Protein Design and Provably Approximates the Energy Landscape.

J Comput Biol. 2020 Apr;27(4):550-564. doi: 10.1089/cmb.2019.0315. Epub 2019 Dec 6.

Novel, provable algorithms for efficient ensemble-based computational protein design and their application to the redesign of the c-Raf-RBD:KRas protein-protein interface.

PLoS Comput Biol. 2020 Jun 8;16(6):e1007447. doi: 10.1371/journal.pcbi.1007447. eCollection 2020 Jun.

BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design.

J Comput Biol. 2016 Jun;23(6):413-24. doi: 10.1089/cmb.2015.0194. Epub 2016 Jan 8.

Fast gap-free enumeration of conformations and sequences for protein design.

Proteins. 2015 Oct;83(10):1859-1877. doi: 10.1002/prot.24870. Epub 2015 Aug 24.

comets (Constrained Optimization of Multistate Energies by Tree Search): A Provable and Efficient Protein Design Algorithm to Optimize Binding Affinity and Specificity with Respect to Sequence.

J Comput Biol. 2016 May;23(5):311-21. doi: 10.1089/cmb.2015.0188. Epub 2016 Jan 13.

Dead-end elimination with perturbations (DEEPer): a provable protein design algorithm with continuous sidechain and backbone flexibility.

Proteins. 2013 Jan;81(1):18-39. doi: 10.1002/prot.24150. Epub 2012 Sep 18.

Fast search algorithms for computational protein design.

J Comput Chem. 2016 May 5;37(12):1048-58. doi: 10.1002/jcc.24290. Epub 2016 Feb 2.

A critical analysis of computational protein design with sparse residue interaction graphs.

PLoS Comput Biol. 2017 Mar 30;13(3):e1005346. doi: 10.1371/journal.pcbi.1005346. eCollection 2017 Mar.

CATS (Coordinates of Atoms by Taylor Series): protein design with backbone flexibility in all locally feasible directions.

Bioinformatics. 2017 Jul 15;33(14):i5-i12. doi: 10.1093/bioinformatics/btx277.

Cited by

Evaluation of Physics-Based Protein Design Methods for Predicting Single Residue Effects on Peptide Binding Specificities.

J Comput Chem. 2025 Jun 30;46(17):e70160. doi: 10.1002/jcc.70160.

Protocol for Designing Noncanonical Peptide Binders in OSPREY.

J Comput Biol. 2024 Oct;31(10):965-974. doi: 10.1089/cmb.2024.0669. Epub 2024 Oct 4.

DexDesign: an OSPREY-based algorithm for designing de novo D-peptide inhibitors.

Protein Eng Des Sel. 2024 Jan 29;37. doi: 10.1093/protein/gzae007.

Improved HIV-1 neutralization breadth and potency of V2-apex antibodies by in silico design.

Cell Rep. 2023 Jul 25;42(7):112711. doi: 10.1016/j.celrep.2023.112711. Epub 2023 Jul 11.

Protocol for predicting drug-resistant protein mutations to an ERK2 inhibitor using RESISTOR.

STAR Protoc. 2023 Apr 27;4(2):102170. doi: 10.1016/j.xpro.2023.102170.

Resistor: An algorithm for predicting resistance mutations via Pareto optimization over multistate protein design and mutational signatures.

Cell Syst. 2022 Oct 19;13(10):830-843.e3. doi: 10.1016/j.cels.2022.09.003.

RESISTOR: A New OSPREY Module to Predict Resistance Mutations.

J Comput Biol. 2022 Dec;29(12):1346-1352. doi: 10.1089/cmb.2022.0254. Epub 2022 Sep 13.

Novel, provable algorithms for efficient ensemble-based computational protein design and their application to the redesign of the c-Raf-RBD:KRas protein-protein interface.

PLoS Comput Biol. 2020 Jun 8;16(6):e1007447. doi: 10.1371/journal.pcbi.1007447. eCollection 2020 Jun.

Dynamics, a Powerful Component of Current and Future in Silico Approaches for Protein Design and Engineering.

Int J Mol Sci. 2020 Apr 14;21(8):2713. doi: 10.3390/ijms21082713.

Minimization-Aware Recursive K*: A Novel, Provable Algorithm that Accelerates Ensemble-Based Protein Design and Provably Approximates the Energy Landscape.

J Comput Biol. 2020 Apr;27(4):550-564. doi: 10.1089/cmb.2019.0315. Epub 2019 Dec 6.

References

OSPREY Predicts Resistance Mutations Using Positive and Negative Computational Protein Design.

Methods Mol Biol. 2017;1529:291-306. doi: 10.1007/978-1-4939-6637-0_15.

LUTE (Local Unpruned Tuple Expansion): Accurate Continuously Flexible Protein Design with General Energy Functions and Rigid Rotamer-Like Efficiency.

J Comput Biol. 2017 Jun;24(6):536-546. doi: 10.1089/cmb.2016.0136. Epub 2016 Sep 28.

Algorithms for protein design.

Curr Opin Struct Biol. 2016 Aug;39:16-26. doi: 10.1016/j.sbi.2016.03.006. Epub 2016 Apr 14.

Fast search algorithms for computational protein design.

J Comput Chem. 2016 May 5;37(12):1048-58. doi: 10.1002/jcc.24290. Epub 2016 Feb 2.

comets (Constrained Optimization of Multistate Energies by Tree Search): A Provable and Efficient Protein Design Algorithm to Optimize Binding Affinity and Specificity with Respect to Sequence.

J Comput Biol. 2016 May;23(5):311-21. doi: 10.1089/cmb.2015.0188. Epub 2016 Jan 13.

BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design.

J Comput Biol. 2016 Jun;23(6):413-24. doi: 10.1089/cmb.2015.0194. Epub 2016 Jan 8.

Guaranteed Discrete Energy Optimization on Large Protein Design Problems.

J Chem Theory Comput. 2015 Dec 8;11(12):5980-9. doi: 10.1021/acs.jctc.5b00594. Epub 2015 Nov 25.

Fast gap-free enumeration of conformations and sequences for protein design.

Proteins. 2015 Oct;83(10):1859-1877. doi: 10.1002/prot.24870. Epub 2015 Aug 24.

Compact Representation of Continuous Energy Surfaces for More Efficient Protein Design.

J Chem Theory Comput. 2015 May 12;11(5):2292-306. doi: 10.1021/ct501031m.

Improved energy bound accuracy enhances the efficiency of continuous protein design.

Proteins. 2015 Jun;83(6):1151-64. doi: 10.1002/prot.24808. Epub 2015 May 8.
