MMpred: a distance-assisted multimodal conformation sampling for de novo protein structure prediction.

Authors

Zhao Kai-Long, Liu Jun, Zhou Xiao-Gen, Su Jian-Zhong, Zhang Yang, Zhang Gui-Jun

Affiliations

College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China.

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA.

Publication details

Bioinformatics. 2021 Dec 7;37(23):4350-4356. doi: 10.1093/bioinformatics/btab484.

Abstract

MOTIVATION

Because energy force fields are imperfect, the mathematically optimal solution in computational protein folding simulations does not always correspond to the native structure. It is therefore necessary to search for more diverse suboptimal solutions in order to identify states close to the native structure. We propose a novel multimodal optimization protocol to improve the conformation sampling efficiency and modeling accuracy of de novo protein structure folding simulations.

RESULTS

A distance-assisted multimodal optimization sampling algorithm, MMpred, is proposed for de novo protein structure prediction. The protocol consists of three stages. The first is a modal exploration stage, in which a structural similarity evaluation model, DMscore, is designed to control the diversity of conformations, generating a population of diverse structures in different low-energy basins. The second is a modal maintaining stage, in which an adaptive clustering algorithm, MNDcluster, is proposed to divide the population and merge the modes by adjusting the annealing temperature, so as to locate the promising basins. The last is a modal exploitation stage, in which a greedy search strategy is used to accelerate the convergence of the modes. Distance constraint information is used to construct the conformation scoring model that guides sampling. MMpred is tested on a large set of 320 non-redundant proteins, where it obtains models with TM-score ≥ 0.5 in 291 cases, which is 28% higher than Rosetta guided by the same set of distance constraints. In addition, on the 320 benchmark proteins, the enhanced version of MMpred (E-MMpred) outperforms trRosetta on 167 targets when the best of five models is evaluated. The average TM-score of the best model of E-MMpred is 0.732, which is comparable to that of trRosetta (0.730).
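
The three-stage idea (explore diverse basins, identify the modes, then refine each greedily) can be illustrated on a toy problem. The sketch below is not the MMpred implementation: it assumes a hypothetical one-dimensional energy with three basins in place of a protein force field, uses plain Euclidean distance where MMpred uses DMscore, swaps in a simple crowding rule and a distance-based seed selection for the diversity control and MNDcluster, and omits the annealing-temperature adjustment and distance constraints entirely. All function names and parameter values are illustrative assumptions.

```python
import random

# Hypothetical toy landscape: three basins with different depths.
CENTERS = [-2.0, 0.5, 3.0]
DEPTHS = [0.3, 0.0, 0.1]

def energy(x):
    """Lowest of three parabolas; each parabola is one basin."""
    return min((x - c) ** 2 + d for c, d in zip(CENTERS, DEPTHS))

def explore(pop, rng, steps=3000, sigma=0.5):
    """Stage 1 stand-in (crowding): a trial may only replace the member
    most similar to it, so distant basins keep their occupants."""
    pop = list(pop)
    for _ in range(steps):
        trial = rng.choice(pop) + rng.gauss(0.0, sigma)
        nearest = min(range(len(pop)), key=lambda i: abs(pop[i] - trial))
        if energy(trial) < energy(pop[nearest]):
            pop[nearest] = trial
    return pop

def pick_seeds(pop, radius=1.0):
    """Stage 2 stand-in: walk the population from low to high energy and
    keep a member as a mode seed if no earlier seed lies within radius."""
    seeds = []
    for x in sorted(pop, key=energy):
        if all(abs(x - s) > radius for s in seeds):
            seeds.append(x)
    return seeds

def exploit(x, rng, steps=500, sigma=0.1):
    """Stage 3 stand-in: greedy search that accepts only downhill moves."""
    for _ in range(steps):
        trial = x + rng.gauss(0.0, sigma)
        if energy(trial) < energy(x):
            x = trial
    return x

def merge(modes, tol=0.5):
    """Collapse refined seeds that ended in the same basin,
    keeping the lower-energy one of each group."""
    kept = []
    for x in sorted(modes):
        if kept and abs(x - kept[-1]) < tol:
            if energy(x) < energy(kept[-1]):
                kept[-1] = x
        else:
            kept.append(x)
    return kept

def find_modes(seed=0, size=30):
    rng = random.Random(seed)
    # Start from an even grid on [-4, 5] so every basin is populated.
    pop = [-4.0 + 9.0 * i / (size - 1) for i in range(size)]
    pop = explore(pop, rng)
    refined = [exploit(s, rng) for s in pick_seeds(pop)]
    return merge(refined)

if __name__ == "__main__":
    for x in find_modes():
        print(f"mode at x = {x:.2f}, energy = {energy(x):.3f}")
```

In this toy setting `find_modes()` should return one position near each entry of `CENTERS`, including the two basins that are not the global minimum. This mirrors the motivation above: a suboptimal basin is retained rather than discarded, in case it is the one closest to the native state.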

AVAILABILITY AND IMPLEMENTATION

The source code and executable are freely available at https://github.com/iobio-zjut/MMpred.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
