Marin Antoine, Pothier Joël, Zimmermann Karel, Gibrat Jean-François
Mathématique, Informatique et Génome, Centre de Recherche de Versailles, INRA, Route de St Cyr, 78026 Versailles, Cedex, France.
Proteins. 2002 Dec 1;49(4):493-509. doi: 10.1002/prot.10231.
To assess the reliability of fold assignments to protein sequences, we developed a fold recognition method called FROST (Fold Recognition-Oriented Search Tool), based on a series of filters, together with a database specifically designed to benchmark this new method under realistic conditions. This benchmark database consists of proteins for which at least one other protein with an extensively similar 3D structure exists in a database of representative 3D structures (i.e., more than 65% of the residues in both proteins can be structurally aligned). Because our method must be tested under conditions similar to those of real fold recognition experiments, the benchmark database includes no protein pair whose sequence similarity is detectable with standard sequence comparison methods such as FASTA. Using FROST, we achieved a coverage of 60% at an error rate of 1%. To obtain a baseline for our method, we used PSI-BLAST and 3D-PSSM. Under the same conditions, at a 1% error rate, the coverages of PSI-BLAST and 3D-PSSM were 33% and 56%, respectively.
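The "coverage at a given error rate" figures quoted above can be illustrated with a minimal sketch. The abstract does not define either quantity precisely, so the definitions here are assumptions: the error rate is taken as the fraction of accepted fold assignments that are wrong, and coverage as the fraction of benchmark queries that receive a correct accepted assignment. The function name `coverage_at_error_rate`, its parameters, and the toy scores are hypothetical and do not come from FROST or from the paper.

```python
from typing import List, Tuple


def coverage_at_error_rate(
    hits: List[Tuple[float, bool]],
    n_queries: int,
    max_error_rate: float = 0.01,
) -> float:
    """Return the best coverage reachable without exceeding max_error_rate.

    hits: one (score, is_correct) pair per query that produced a top hit;
          a higher score is assumed to mean a more confident assignment.
    n_queries: total number of benchmark queries, including any with no hit.
    """
    if n_queries <= 0:
        raise ValueError("n_queries must be positive")

    # Rank assignments from most to least confident.
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)

    correct = wrong = best_correct = 0
    for i, (score, is_correct) in enumerate(ranked):
        if is_correct:
            correct += 1
        else:
            wrong += 1
        # A score threshold can only fall between distinct scores,
        # so tied hits are accepted or rejected together.
        at_boundary = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if at_boundary and wrong / (correct + wrong) <= max_error_rate:
            best_correct = correct

    return best_correct / n_queries


# Toy data, made up for illustration only (not results from the paper).
toy_hits = [(9.0, True), (8.0, True), (7.0, False), (6.0, True)]
print(coverage_at_error_rate(toy_hits, n_queries=5, max_error_rate=0.0))
print(coverage_at_error_rate(toy_hits, n_queries=5, max_error_rate=0.25))
```

Sweeping the threshold over the ranked list, rather than fixing a single score cutoff in advance, is what allows methods with different score scales (here FROST, PSI-BLAST and 3D-PSSM) to be compared at the same error rate.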