Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), 85748 Garching, Germany.
Department of Informatics, Bioinformatics-I12, TUM, 85748 Garching, Germany.
Nucleic Acids Res. 2014 Jul;42(Web Server issue):W350-5. doi: 10.1093/nar/gku396. Epub 2014 May 21.
The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy of Q18 = 80 ± 3% for eukaryotes and a six-state accuracy of Q6 = 89 ± 4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. The response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome, not counting the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3.
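To make the reported accuracy figures concrete, the following is a minimal sketch of how a multi-state accuracy such as Q18 or Q6 can be computed: the percentage of proteins whose predicted localization class matches the observed one. The abstract does not state how the ± values were obtained, so the bootstrap spread shown here is an assumption chosen for illustration, not the authors' procedure. The function names `q_accuracy` and `bootstrap_spread` and the toy class labels are hypothetical.

```python
import random
import statistics


def q_accuracy(observed, predicted):
    """Percentage of proteins whose predicted class equals the observed class."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one protein is required")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def bootstrap_spread(observed, predicted, n_resamples=1000, seed=0):
    """Standard deviation of the accuracy over bootstrap resamples.

    Assumption: resampling proteins with replacement is one common way to
    attach an error estimate to an accuracy; the source does not confirm
    that its +/- values were derived this way.
    """
    rng = random.Random(seed)
    n = len(observed)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(
            q_accuracy([observed[i] for i in idx], [predicted[i] for i in idx])
        )
    return statistics.stdev(scores)


# Toy data: five proteins, four predicted correctly.
observed = ["nucleus", "cytoplasm", "mitochondrion", "nucleus", "secreted"]
predicted = ["nucleus", "cytoplasm", "nucleus", "nucleus", "secreted"]

accuracy = q_accuracy(observed, predicted)
spread = bootstrap_spread(observed, predicted)
print(f"Q = {accuracy:.0f} +/- {spread:.0f}%")
```

The number of states (18, six or three) does not enter the formula directly; it only determines how many distinct labels can appear in `observed` and `predicted`.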