Mizianty Marcin J, Peng Zhenling, Kurgan Lukasz
Department of Electrical and Computer Engineering; University of Alberta; Edmonton, AB Canada.
Intrinsically Disord Proteins. 2013 Apr 1;1(1):e24428. doi: 10.4161/idp.24428. eCollection 2013 Jan-Dec.
Intrinsically disordered proteins (IDPs) are either entirely disordered or contain disordered regions in their native state. IDPs are abundant in complex organisms and are implicated in numerous cellular processes. Experimental annotation of disorder lags behind the rapid growth of protein databases, so computational methods are used to close this gap and to investigate disorder. MFDp2 is a novel, content-rich and user-friendly web server for sequence-based prediction of protein disorder that builds upon our residue-level disorder predictor MFDp and chain-level disorder content predictor DisCon. It applies novel post-processing filters and uses sequence alignment to improve predictive quality. On a new benchmark data set with reduced sequence identity to the corresponding training data sets, MFDp2 provides competitive predictive quality when compared with MFDp and a comprehensive set of 13 other state-of-the-art predictors, including publicly available versions of the top predictors from CASP9. Our server obtains the highest Matthews Correlation Coefficient (MCC) and the second-best Area Under the receiver operating characteristic Curve (AUC). In addition to the disorder predictions, the server outputs well-described sequence-derived information that allows profiling of the predicted disorder. It visualizes sequence conservation, predicted secondary structure, relative solvent accessibility and alignments to chains with annotated disorder. Predictions can be run for multiple proteins at the same time, and each prediction can be downloaded as a text-based (parsable) file. The web server, which includes help pages and a tutorial, is freely available at biomine.ece.ualberta.ca/MFDp2/.
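For readers unfamiliar with the MCC reported above, the following is a minimal sketch of how it can be computed for residue-level binary disorder predictions. It is not the authors' evaluation code: the function names and the 'D' (disordered) / 'O' (ordered) per-residue string encoding are assumptions made for illustration only.

```python
import math


def confusion_counts(true_labels, predicted_labels, positive="D"):
    """Count TP, TN, FP, FN over two equal-length per-residue label strings.

    The 'D'/'O' encoding is a hypothetical convention for this sketch.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example: 8 residues, 4 annotated as disordered.
native = "DDDDOOOO"
predicted = "DDDOOOOD"
counts = confusion_counts(native, predicted)
mcc = matthews_corrcoef(*counts)
```

MCC ranges from -1 to 1 and, unlike plain accuracy, remains informative when disordered residues are a minority class, which is typical of disorder benchmark data sets.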