Liu Song, Zhang Chi, Liang Shide, Zhou Yaoqi
Howard Hughes Medical Institute Center for Single Molecule Biophysics, Department of Physiology and Biophysics, State University of New York at Buffalo, Buffalo, New York 14214, USA.
Proteins. 2007 Aug 15;68(3):636-45. doi: 10.1002/prot.21459.
Recognizing structural similarity in the absence of significant sequence identity (fold recognition) is key to bridging the gap between the number of known protein sequences and the number of solved structures. Previously, we developed a fold-recognition method called SP(3), which combines sequence-derived sequence profiles, secondary-structure profiles, and residue-depth-dependent, structure-derived sequence profiles. The use of residue-depth-dependent profiles made SP(3) one of the best automatic predictors in CASP 6. Because residue depth (RD) and solvent-accessible surface area (solvent accessibility) are complementary in describing the exposure of a residue to solvent, we tested whether incorporating solvent-accessibility profiles into SP(3) could further increase the accuracy of fold recognition. The resulting method, called SP(4), was tested on the SALIGN benchmark for alignment accuracy and on the Lindahl benchmark, LiveBench 8, and CASP7 blind prediction for fold-recognition sensitivity and model-structure accuracy. For remote homologs, SP(4) consistently improves over SP(3) in the accuracy of sequence alignment and predicted structural models as well as in the sensitivity of fold recognition. Our results suggest that RD and solvent accessibility can be used concurrently to improve the accuracy and sensitivity of fold recognition. The SP(4) server and its local usage package are available at http://sparks.informatics.iupui.edu/SP4.
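The idea of adding a solvent-accessibility term to an existing set of profile terms can be illustrated with a minimal sketch. This is not the SP(4) scoring function: the abstract does not give the functional form or any weights, so the weighted sum below, the `Weights` class, the `combined_match_score` function, and all numeric values are hypothetical and serve only to show how one extra term enters a combined per-position match score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical term weights; the abstract reports no numeric values."""
    sequence: float = 1.0
    secondary_structure: float = 0.5
    residue_depth: float = 0.5
    solvent_accessibility: float = 0.5


def combined_match_score(seq: float, ss: float, depth: float, sa: float,
                         w: Weights = Weights()) -> float:
    """Weighted sum of per-position match terms (an assumed, illustrative form).

    seq   -- sequence-profile match term
    ss    -- secondary-structure match term
    depth -- residue-depth-dependent profile match term
    sa    -- solvent-accessibility profile match term
    """
    return (w.sequence * seq
            + w.secondary_structure * ss
            + w.residue_depth * depth
            + w.solvent_accessibility * sa)


# Setting the solvent-accessibility weight to zero leaves only the three
# term types the abstract attributes to SP(3).
three_term = combined_match_score(1.0, 1.0, 1.0, 1.0,
                                  Weights(solvent_accessibility=0.0))
four_term = combined_match_score(1.0, 1.0, 1.0, 1.0)
print(three_term, four_term)
```

In a real fold-recognition method, a score of this kind would feed an alignment procedure between a query and each template, and the weights would be fitted on benchmark data rather than chosen by hand.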