Zarei Roghayeh, Arab Shahriar, Sadeghi Mehdi
Khatam University, Tehran, Iran.
Comput Biol Chem. 2007 Oct;31(5-6):384-8. doi: 10.1016/j.compbiolchem.2007.08.006. Epub 2007 Aug 19.
Prediction of protein accessibility from sequence, like prediction of protein secondary structure, is an intermediate step in predicting the structures, and consequently the functions, of proteins. Most currently used methods predict one residue at a time, using either statistical means or evolutionary information, and assign an accessibility state to the central residue of a window. Taking advantage of the expansion of databases of proteins with known 3D structures, we extracted information on pairwise residue types and the conformational states of the pairs simultaneously. To resolve the ambiguity in state prediction that arises when the window slides by one residue, we used a dynamic programming algorithm to find the path with the maximum score. The three-state overall per-residue accuracy, Q(3), of this method in a jackknife test on a dataset of known proteins is more than 65%, which is an improvement on the results of methods based on evolutionary information.
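The path search described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so everything specific here is an assumption made for illustration: the state labels B/I/E (buried, intermediate, exposed), the `pair_score` table keyed by two adjacent residue types and their two states, and the names `best_state_path` and `q3`. Each adjacent pair of residues proposes a state for both of its members, so consecutive pairs can disagree about the shared residue; a Viterbi-style dynamic program resolves this by choosing the single state sequence whose summed pair scores are highest.

```python
from typing import Dict, List, Tuple

# Assumed three-state labels: buried, intermediate, exposed.
STATES = ("B", "I", "E")

# Hypothetical key: (residue_i, residue_i+1, state_i, state_i+1).
PairKey = Tuple[str, str, str, str]


def best_state_path(
    sequence: str,
    pair_score: Dict[PairKey, float],
    default: float = 0.0,
) -> Tuple[str, float]:
    """Return the state string with the maximum summed pair score."""
    if not sequence:
        return "", 0.0

    # best[s]: highest score of any path that ends in state s
    # at the current position.
    best = {s: 0.0 for s in STATES}
    back: List[Dict[str, str]] = []  # back-pointers, one dict per pair

    for i in range(1, len(sequence)):
        a, b = sequence[i - 1], sequence[i]
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in STATES:
            # Pick the previous state that maximizes the running score.
            prev = max(
                STATES,
                key=lambda p: best[p] + pair_score.get((a, b, p, cur), default),
            )
            new_best[cur] = best[prev] + pair_score.get((a, b, prev, cur), default)
            pointers[cur] = prev
        best = new_best
        back.append(pointers)

    # Trace back from the best final state.
    last = max(STATES, key=lambda s: best[s])
    total = best[last]
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path)), total


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy scores, invented purely for illustration.
toy_scores = {
    ("A", "G", "B", "E"): 2.0,
    ("G", "A", "E", "B"): 3.0,
}
toy_path, toy_total = best_state_path("AGA", toy_scores)
```

In a real setting the score table would be estimated from pair frequencies in proteins of known structure, for example as log-odds values; that choice is likewise an assumption and is not stated in the abstract. The search costs O(L * S^2) time for a sequence of length L and S states.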