SLPred: a multi-view subcellular localization prediction tool for multi-location human proteins.

Author information

Özsarı Gökhan, Rifaioglu Ahmet Sureyya, Atakan Ahmet, Doğan Tunca, Martin Maria Jesus, Çetin Atalay Rengül, Atalay Volkan

Affiliations

Department of Computer Engineering, Middle East Technical University, Ankara 06800, Turkey.

Department of Computer Engineering, Niğde Ömer Halisdemir University, Niğde 51240, Turkey.

Publication information

Bioinformatics. 2022 Sep 2;38(17):4226-4229. doi: 10.1093/bioinformatics/btac458.

Abstract

SUMMARY

Accurate prediction of the subcellular locations (SLs) of proteins is a critical topic in protein science. In this study, we present SLPred, an ensemble-based, multi-view and multi-label protein subcellular localization prediction tool. For a query protein sequence, SLPred provides predictions for nine main SLs using independent machine-learning models trained for each location. We used UniProtKB/Swiss-Prot human protein entries and their curated SL annotations as our source data. We connected all disjoint terms in the UniProt SL hierarchy based on the corresponding term relationships in the cellular component category of Gene Ontology, and used the re-organized hierarchy to construct a training dataset that is both reliable and large-scale. We tested SLPred on multiple benchmarking datasets, including our in-house sets, and compared its performance against six state-of-the-art methods. Results indicated that SLPred outperforms other tools in the majority of cases.
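As an illustration only, the multi-label, multi-view idea described above can be sketched as one independent ensemble per location, where each ensemble combines scores from several feature views of the same sequence. Everything in this sketch is an assumption made for illustration: the location names, the toy residue-fraction scorers, the plain averaging across views, and the 0.5 threshold are hypothetical stand-ins and are not taken from SLPred's trained models or code.

```python
from typing import Callable, Dict, List

# A "view" is one feature representation of the sequence; each view scorer
# returns a score in [0, 1]. All names below are hypothetical.
Scorer = Callable[[str], float]


def fraction_of(residues: str) -> Scorer:
    """Toy scorer: fraction of the sequence made up of the given residues."""
    def score(sequence: str) -> float:
        if not sequence:
            return 0.0
        return sum(1 for aa in sequence if aa in residues) / len(sequence)
    return score


# One independent ensemble per location, one scorer per view.
# These are illustrative stand-ins, not SLPred's actual models or locations.
MODELS: Dict[str, List[Scorer]] = {
    "nucleus": [fraction_of("KR"), fraction_of("KRH")],
    "membrane": [fraction_of("LIVF"), fraction_of("LIVFAM")],
}


def predict_locations(sequence: str, threshold: float = 0.5) -> Dict[str, float]:
    """Return every location whose averaged view score reaches the threshold."""
    predictions: Dict[str, float] = {}
    for location, scorers in MODELS.items():
        # Assumed combination rule: unweighted mean over the views.
        mean_score = sum(scorer(sequence) for scorer in scorers) / len(scorers)
        if mean_score >= threshold:
            predictions[location] = mean_score
    return predictions
```

Because each location is decided independently, a single sequence can receive zero, one, or several labels, which is what makes the setup suitable for multi-location proteins.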

AVAILABILITY AND IMPLEMENTATION

SLPred is available as both an open-access, user-friendly web server (https://slpred.kansil.org) and a stand-alone tool (https://github.com/kansil/SLPred). All datasets used in this study are also available at https://slpred.kansil.org.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
