A novel method for accurate one-dimensional protein structure prediction based on fragment matching.

Affiliation

Division of Structural Chemistry, Stockholm University, Stockholm SE-106 91, Sweden.

Publication information

Bioinformatics. 2010 Feb 15;26(4):470-7. doi: 10.1093/bioinformatics/btp679. Epub 2009 Dec 9.

Abstract

MOTIVATION

The precise prediction of one-dimensional (1D) protein structure, as represented by the protein secondary structure and by a 1D string of discrete dihedral-angle states (i.e. Shape Strings), is a prerequisite for the successful prediction of three-dimensional (3D) structure as well as of protein-protein interactions. We have developed a novel 1D structure prediction method, called Frag1D, based on a straightforward fragment matching algorithm, and demonstrated its success in the prediction of three sets of 1D structural alphabets: the classical three-state secondary structure, and three- and eight-state Shape Strings.

RESULTS

By exploiting the vast amount of protein sequence and protein structure data available, we have brought secondary-structure prediction closer to the expected theoretical limit. When tested by leave-one-out cross-validation on a non-redundant set of 5860 PDB protein chains cut at 30% sequence identity, the overall per-residue accuracy for secondary-structure prediction, i.e. Q3, is 82.9%. The overall per-residue accuracies for three- and eight-state Shape Strings are 85.1% and 71.5%, respectively. We also benchmarked our program against the latest version of PSIPRED for secondary-structure prediction: our program scored 0.3% higher in Q3 when tested on 2241 chains with the same training set. For Shape Strings, we compared our method with a recently published method, using the same dataset and definition as that method; our program was 2.2% more accurate for three-state Shape Strings. By quantitatively investigating the effect of database size on 1D structure prediction, we show that the accuracy increases by approximately 1% with every doubling of the database size.
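The two quantitative notions used above, per-residue accuracy (Q3) and a gain of roughly 1% per doubling of the database, can be illustrated with a minimal Python sketch. This is not the Frag1D implementation: the function names `q3_accuracy` and `extrapolate_accuracy`, the toy state strings, and the assumption that the reported trend follows a simple base-2 logarithm of the database size ratio are all hypothetical and introduced here only for illustration.

```python
import math


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state equals the observed state.

    Both arguments are strings over a state alphabet (e.g. H/E/C),
    one character per residue. Hypothetical helper, not from the paper.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def extrapolate_accuracy(base_accuracy: float, base_size: int,
                         new_size: int,
                         gain_per_doubling: float = 1.0) -> float:
    """Accuracy under an ASSUMED log2 trend in database size.

    Encodes "about 1% per doubling" as
    base_accuracy + gain_per_doubling * log2(new_size / base_size).
    Illustrative only; the abstract does not state a closed-form model.
    """
    if base_size <= 0 or new_size <= 0:
        raise ValueError("database sizes must be positive")
    return base_accuracy + gain_per_doubling * math.log2(new_size / base_size)


if __name__ == "__main__":
    # Toy strings: 9 of 10 residues agree.
    print(q3_accuracy("HHHHEEECCC", "HHHEEEECCC"))
    # One doubling of a hypothetical database adds gain_per_doubling points.
    print(extrapolate_accuracy(80.0, 1000, 2000))
```

Under this assumed model, each doubling adds a fixed increment, so the gain grows only logarithmically with database size. Any value the function returns for sizes beyond those studied is an extrapolation, not a result reported in the abstract.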
