BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design.

Authors

Jonathan D. Jou, Swati Jain, Ivelin S. Georgiev, Bruce R. Donald

Affiliations

1 Department of Computer Science, Duke University, Durham, North Carolina.

2 Department of Biochemistry, Duke University Medical Center, Durham, North Carolina.

Publication information

J Comput Biol. 2016 Jun;23(6):413-24. doi: 10.1089/cmb.2015.0194. Epub 2016 Jan 8.

DOI: 10.1089/cmb.2015.0194

PMID: 26744898

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4904165/

Abstract

Sparse energy functions that ignore long-range interactions between residue pairs are frequently used by protein design algorithms to reduce computational cost. Current dynamic programming algorithms that fully exploit the optimal substructure produced by these energy functions compute only the global minimum energy conformation (GMEC). This disproportionately favors the sequence of a single, static conformation and overlooks better binding sequences with multiple low-energy conformations. Provable, ensemble-based algorithms such as A* avoid this problem, but A* cannot guarantee better performance than exhaustive enumeration. We propose a novel, provable, dynamic programming algorithm called Branch-Width Minimization* (BWM*) to enumerate a gap-free ensemble of conformations in order of increasing energy. Given a branch-decomposition of branch-width w for an n-residue protein design with at most q discrete side-chain conformations per residue, BWM* returns the sparse GMEC in O([Formula: see text]) time and enumerates each additional conformation in merely O([Formula: see text]) time. We define a new measure, Total Effective Search Space (TESS), which can be computed efficiently before BWM* or A* is run. We ran BWM* on 67 protein design problems and found that TESS discriminated between BWM*-efficient and A*-efficient cases with 100% accuracy. As predicted by TESS and validated experimentally, BWM* outperforms A* in 73% of the cases and computes the full ensemble or a close approximation faster than A*, enumerating each additional conformation in milliseconds. Unlike A*, the performance of BWM* can be predicted in polynomial time before running the algorithm, which gives protein designers the power to choose the most efficient algorithm for their particular design problem.
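
To make the terms "sparse GMEC" and "gap-free ensemble" concrete, here is a minimal Python sketch. It is not BWM*: it uses exhaustive enumeration, the baseline the abstract compares against, on a hypothetical four-residue problem. The names `SINGLE`, `PAIR`, `sparse_energy`, and `enumerate_by_energy`, and every energy value, are invented for illustration and do not come from the paper.

```python
from itertools import product

# Hypothetical toy problem: 4 residues, 2 rotamers each (all numbers invented).
# SINGLE[i][a] is the self energy of rotamer a at residue i.
SINGLE = [
    [0.0, 1.0],
    [0.5, 0.0],
    [0.0, 0.2],
    [0.3, 0.0],
]

# Sparse interaction graph: only neighbouring residue pairs keep a pairwise
# term; long-range pairs such as (0, 3) are simply absent.
# PAIR[(i, j)][a][b] is the energy of rotamer a at residue i with rotamer b at j.
PAIR = {
    (0, 1): [[0.0, 2.0], [1.0, 0.0]],
    (1, 2): [[0.4, 0.0], [0.0, 0.6]],
    (2, 3): [[0.0, 0.1], [0.9, 0.0]],
}


def sparse_energy(conf):
    """Sum self energies plus pairwise energies over the sparse graph only."""
    energy = sum(SINGLE[i][rot] for i, rot in enumerate(conf))
    energy += sum(table[conf[i]][conf[j]] for (i, j), table in PAIR.items())
    return energy


def enumerate_by_energy(k=None):
    """Return conformations in order of increasing sparse energy.

    Every conformation is scored, so no conformation is skipped between
    consecutive entries (the list is gap-free). This is brute force over
    all rotamer combinations, not the paper's dynamic programming method.
    """
    confs = product(*(range(len(rotamers)) for rotamers in SINGLE))
    ranked = sorted(confs, key=lambda c: (sparse_energy(c), c))
    return ranked if k is None else ranked[:k]


ensemble = enumerate_by_energy(k=3)  # three lowest-energy conformations
gmec = ensemble[0]                   # the sparse GMEC is the first entry
```

With n residues and at most q conformations per residue, this brute-force approach scores up to q^n conformations, which is why it is only usable at toy sizes.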
