Sequential search leads to faster, more efficient fragment-based de novo protein structure prediction.

Affiliations

Department of Statistics, University of Oxford, Oxford OX1 3LB, UK.

Department of Informatics, UCB Pharma, Slough SL1 3WE, UK.

Publication information

Bioinformatics. 2018 Apr 1;34(7):1132-1140. doi: 10.1093/bioinformatics/btx722.

Abstract

MOTIVATION

Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resources. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work showing that many proteins fold cotranslationally.

RESULTS

We have investigated whether a pseudo-greedy search approach, which begins sequentially from one of the termini, can improve the performance and accuracy of de novo protein structure prediction. We observed that our sequential approach converges when fewer than 20 000 decoys have been produced, fewer than commonly expected. Using our software, SAINT2, we also compared the run time and quality of models produced in a sequential fashion against a standard, non-sequential approach. Sequential prediction produces an individual decoy 1.5-2.5 times faster than non-sequential prediction. When considering the quality of the best model, sequential prediction led to a better model being produced for 31 out of 41 soluble protein validation cases and for 18 out of 24 transmembrane protein cases. Correct models (TM-Score > 0.5) were produced for 29 of these cases by the sequential mode and for only 22 by the non-sequential mode. Our comparison reveals that a sequential search strategy can be used to drastically reduce computational time of de novo protein structure prediction and improve accuracy.
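
To make the contrast between the two search modes concrete, the following is a toy sketch only. The abstract does not describe SAINT2's internals, so everything here is an assumption made for illustration: the three-letter state alphabet, the mismatch-count score, the Metropolis acceptance rule, the growth schedule, and the names `energy`, `fragment_move` and `sample_decoy`. In sequential mode the chain starts as a short segment at one terminus and is extended residue by residue while fragment replacements are applied; in non-sequential mode the full-length chain is sampled from the start.

```python
import math
import random

STATES = "HEC"  # hypothetical per-residue states (helix, strand, coil)


def energy(conf, target):
    """Toy score: mismatches against a hidden target over the built part of the chain."""
    return sum(1 for a, b in zip(conf, target) if a != b)


def fragment_move(conf, fragments, rng):
    """Return a copy of conf with one fragment substituted at a random position."""
    frag = rng.choice(fragments)
    if len(frag) > len(conf):
        return list(conf)  # fragment does not fit yet; leave the chain unchanged
    start = rng.randrange(len(conf) - len(frag) + 1)
    new = list(conf)
    new[start:start + len(frag)] = list(frag)
    return new


def sample_decoy(target, fragments, moves, sequential, rng,
                 temperature=1.0, start_len=3):
    """Produce one decoy by fragment replacement with Metropolis acceptance.

    sequential=True grows the chain from one terminus during the first half
    of the run; sequential=False samples the full-length chain throughout.
    """
    n = len(target)
    length = min(start_len, n) if sequential else n
    conf = [rng.choice(STATES) for _ in range(length)]
    # Spread the extensions evenly over the first half of the moves.
    grow_every = max(1, (moves // 2) // max(1, n - length))
    for step in range(moves):
        if sequential and len(conf) < n and step % grow_every == 0:
            conf.append(rng.choice(STATES))  # extend the nascent chain by one residue
        candidate = fragment_move(conf, fragments, rng)
        delta = energy(candidate, target) - energy(conf, target)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            conf = candidate
    # Guard for very short runs: finish the chain if growth did not complete.
    while len(conf) < n:
        conf.append(rng.choice(STATES))
    return conf


if __name__ == "__main__":
    target = "CHHHHHHCCEEEECCEEEEC"                      # made-up target
    fragments = ["HHH", "EEE", "CCC", "CHH", "HCC", "CEE", "EEC"]  # made-up library
    for mode in (True, False):
        rng = random.Random(0)
        decoy = sample_decoy(target, fragments, 400, mode, rng)
        print("sequential" if mode else "non-sequential", energy(decoy, target))
```

In this sketch the sequential mode evaluates shorter chains for much of the run, which is one plausible reason a sequential decoy could be cheaper to produce; the toy says nothing about the 1.5-2.5 times speed-up or the accuracy figures reported above, which come from the authors' benchmarks.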

AVAILABILITY AND IMPLEMENTATION

Data are available for download from: http://opig.stats.ox.ac.uk/resources. SAINT2 is available for download from: https://github.com/sauloho/SAINT2.

CONTACT

saulo.deoliveira@dtc.ox.ac.uk.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
