Vullo Alessandro, Bortolami Oscar, Pollastri Gianluca, Tosatto Silvio C E
School of Computer Science and Informatics, University College Dublin, Ireland.
Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W164-8. doi: 10.1093/nar/gkl166.
Intrinsically disordered proteins contain long stretches of polypeptide chain that, in the absence of binding partners, do not adopt a single native structure composed of stable secondary and tertiary structure. Predicting intrinsically disordered regions from sequence is attracting growing interest as many such regions are discovered in complete genome sequences and important functional roles are associated with them. We have developed Spritz, a machine learning approach based on two support vector machines (SVMs) that discriminates disordered regions from sequence. The SVMs are trained and benchmarked on two sets, representing long and short disordered regions. A preliminary version of Spritz performed consistently well at the recent biennial CASP-6 experiment [Critical Assessment of Techniques for Protein Structure Prediction (CASP), 2004]. The fully developed Spritz method is freely available as a web server at http://distill.ucd.ie/spritz/ and http://protein.cribi.unipd.it/spritz/.