Uziela Karolis, Menéndez Hurtado David, Shu Nanjiang, Wallner Björn, Elofsson Arne
Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Solna, Sweden.
Bioinformatics Short-term Support and Infrastructure (BILS), Science for Life Laboratory, Solna, Sweden.
Bioinformatics. 2017 May 15;33(10):1578-1580. doi: 10.1093/bioinformatics/btw819.
Protein quality assessment is a long-standing problem in bioinformatics. For more than a decade we have developed state-of-the-art predictors by carefully selecting and optimising the inputs to a machine learning method. The Pearson correlation has increased from 0.60 in ProQ to 0.81 in ProQ2 and 0.85 in ProQ3, mainly by adding a large set of carefully tuned descriptions of a protein. Here, we show that a substantial improvement can be obtained by using exactly the same inputs as in ProQ2 or ProQ3 but replacing the support vector machine with a deep neural network. This improves the Pearson correlation to 0.90 (0.85 using ProQ2 input features).
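As a reminder of what the reported figures measure, the following is a minimal sketch of the Pearson correlation between two equal-length lists of scores. It is not taken from ProQ3D; the function name `pearson` and the sample values are assumptions made purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores, invented for illustration only (not data from the paper).
predicted_quality = [0.42, 0.55, 0.61, 0.78, 0.90]
reference_quality = [0.40, 0.50, 0.70, 0.75, 0.95]
print(round(pearson(predicted_quality, reference_quality), 3))
```

A value close to 1 means the predicted scores rank and scale the models much as the reference scores do; a value near 0 means there is no linear relationship.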
ProQ3D is freely available both as a webserver and a stand-alone program at http://proq3.bioinfo.se/.
Supplementary data are available at Bioinformatics online.