Multiobjective heuristic algorithm for protein design in a quantified continuous sequence space.

Authors

Li Rui-Xiang, Zhang Ning-Ning, Wu Bin, OuYang Bo, Shen Hong-Bin

Affiliations

Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China.

State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 201203, China.

Publication information

Comput Struct Biotechnol J. 2021 Apr 25;19:2575-2587. doi: 10.1016/j.csbj.2021.04.046. eCollection 2021.

DOI: 10.1016/j.csbj.2021.04.046

PMID: 34025944

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8114120/

Abstract

Protein design usually involves a sequence search process and a set of evaluation criteria. Commonly used methods rely on Monte Carlo or simulated annealing with a single energy function to find ideal solutions, an approach that is often highly time-consuming and limited by the accuracy of that energy function. In this report, we introduce Hydra, a multiobjective algorithm for protein design that optimizes solutions against two different energy functions simultaneously and exploits the latent quantitative relationships between amino acid types to facilitate the search. The framework uses two kinds of prior information to transform the original disordered, discrete sequence space into a relatively ordered space, and decoy sequences are searched in this ordered space with a multiobjective swarm intelligence algorithm. The algorithm combines high accuracy with a fast search process. Our method was tested on 40 targets covering different fold classes, which were computationally verified to be well folded, and a design for the 1UBQ fold was solved experimentally by NMR, in excellent agreement with the native structure (backbone RMSD of 1.074 Å). The Hydra software package can be downloaded for academic use from http://www.csbio.sjtu.edu.cn/bioinf/HYDRA/.
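The two ideas named in the abstract, searching an ordered continuous encoding of sequences and keeping solutions that trade off two energy scores, can be illustrated with a minimal Python sketch. This is not Hydra's implementation: the amino-acid ordering `ORDERED_AA` (roughly by hydropathy), the `decode` mapping, and the helper names are all assumptions made for illustration, and the energy functions themselves are left out.

```python
from typing import List, Sequence, Tuple

# Hypothetical 1-D ordering of the 20 amino acids, roughly hydrophobic -> polar.
# Hydra derives its ordered space from prior information; this list is an assumption.
ORDERED_AA = "IVLFCMAGTSWYPHEQDNKR"


def decode(position: Sequence[float]) -> str:
    """Map each continuous coordinate in [0, 1] to an amino-acid bin."""
    n = len(ORDERED_AA)
    residues = []
    for x in position:
        x = min(max(x, 0.0), 1.0)        # clamp out-of-range coordinates
        idx = min(int(x * n), n - 1)     # x == 1.0 falls in the last bin
        residues.append(ORDERED_AA[idx])
    return "".join(residues)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (lower energy is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: List[Tuple[float, ...]]) -> List[int]:
    """Return indices of score tuples that no other tuple dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]
```

In a swarm-style search, each particle would carry a `position` vector, be decoded to a sequence, scored by the two energy functions, and the non-dominated set returned by `pareto_front` would guide the next move. Because neighbouring bins hold similar residues under the assumed ordering, a small step in position tends to produce a conservative substitution.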
