TransportTP: a two-phase classification approach for membrane transporter prediction and characterization.

Affiliation

Plant Biology Division, The Samuel Roberts Noble Foundation, Inc, Ardmore, OK 73401, USA.

Publication information

BMC Bioinformatics. 2009 Dec 14;10:418. doi: 10.1186/1471-2105-10-418.

Abstract

BACKGROUND

Membrane transporters play crucial roles in living cells. Experimental characterization of transporters is costly and time-consuming. Current computational methods for transporter characterization still require extensive curation efforts, especially for eukaryotic organisms. We developed a novel genome-scale transporter prediction and characterization system called TransportTP that combined homology-based and machine learning methods in a two-phase classification approach. First, traditional homology methods were employed to predict novel transporters based on sequence similarity to known classified proteins in the Transporter Classification Database (TCDB). Second, machine learning methods were used to integrate a variety of features to refine the initial predictions. A set of rules based on transporter features was developed by machine learning using well-curated proteomes as guides.
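The two-phase idea described above can be illustrated with a minimal sketch. This is not the TransportTP implementation: the `Candidate` fields, the e-value cutoff, and the transmembrane-helix rule are all hypothetical stand-ins, and the second phase uses a hand-written rule where the paper describes rules learned from well-curated proteomes.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    protein_id: str
    best_hit_evalue: float  # hypothetical: best e-value against a TCDB-like reference set
    tm_helices: int         # hypothetical feature: predicted transmembrane helices


def phase_one(candidates, max_evalue=1e-5):
    """Homology phase: keep proteins with a strong enough hit (cutoff is an assumption)."""
    return [c for c in candidates if c.best_hit_evalue <= max_evalue]


def phase_two(candidates, min_tm_helices=2):
    """Refinement phase: a hand-written stand-in for a learned feature rule."""
    return [c for c in candidates if c.tm_helices >= min_tm_helices]


def predict_transporters(candidates):
    """Run both phases in order and return the surviving protein IDs."""
    return [c.protein_id for c in phase_two(phase_one(candidates))]


# Toy proteome: P2 passes homology but fails refinement; P3 fails homology.
proteins = [
    Candidate("P1", 1e-30, 12),
    Candidate("P2", 1e-8, 0),
    Candidate("P3", 0.5, 10),
]
predicted = predict_transporters(proteins)
```

The point of the ordering is that the homology phase proposes candidates broadly, and the feature-based phase then removes likely false positives from that set.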

RESULTS

In a cross-validation using the yeast proteome for training and the proteomes of ten other organisms for testing, TransportTP achieved an equivalent recall and precision of 81.8%, based on TransportDB, a manually annotated transporter database. In an independent test using the Arabidopsis proteome for training and four recently sequenced plant proteomes for testing, it achieved a recall of 74.6% and a precision of 73.4%, according to our manual curation.
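For readers unfamiliar with the two metrics reported above, the sketch below shows how recall and precision are computed from a set of predictions and a curated reference set. The protein IDs are made up for illustration and are unrelated to the TransportDB evaluation.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted IDs against a curated reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: 4 predictions, 5 curated transporters, 3 in common.
precision, recall = precision_recall(
    {"P1", "P2", "P3", "P4"},
    {"P1", "P2", "P3", "P5", "P6"},
)
```

Precision is the fraction of predicted transporters that appear in the reference annotation, while recall is the fraction of reference transporters that were predicted.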

CONCLUSIONS

TransportTP is the most effective tool for eukaryotic transporter characterization to date.
