Computational comparative study of tuberculosis proteomes using a model learned from signal peptide structures.

Affiliation

Institute of Information Science, Academia Sinica, Taipei, Taiwan.

Publication information

PLoS One. 2012;7(4):e35018. doi: 10.1371/journal.pone.0035018. Epub 2012 Apr 9.

Abstract

Secretome analysis is important in pathogen studies. A fundamental and convenient way to identify secreted proteins is to first predict signal peptides, which are essential for protein secretion. However, signal peptides are highly complex functional sequences that are easily confused with transmembrane domains, and such confusion affects the discovery of secreted proteins. Transmembrane proteins are important drug targets, but very few transmembrane protein structures have been determined experimentally; hence, prediction of the structures is essential. In the field of structure prediction, researchers do not make assumptions about organisms, so there is a need for a general signal peptide predictor.

To improve signal peptide prediction without prior knowledge of the associated organisms, we present a machine-learning method, called SVMSignal, which uses biochemical properties as features, as well as features acquired from a novel encoding, to capture biochemical profile patterns for learning the structures of signal peptides directly.

We tested SVMSignal and five popular methods on two benchmark datasets, one from the SPdb database and one from the UniProt/Swiss-Prot database. Although SVMSignal was trained on an old dataset, it performed well, and the results demonstrate that learning the structures of signal peptides directly is a promising approach. We also used SVMSignal to analyze the proteomes in the entire HAMAP microbial database. Finally, we conducted a comparative secretome analysis of seven tuberculosis-related strains selected from the HAMAP database. We identified ten potential secreted proteins, two of which are drug resistant and four of which are potential transmembrane proteins.

SVMSignal is publicly available at http://bio-cluster.iis.sinica.edu.tw/SVMSignal. It provides user-friendly interfaces and visualizations, and the prediction results are available for download.
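To make the idea of a "biochemical profile" concrete, the following is a minimal sketch of how one biochemical property could be turned into numeric features for a classifier. It is not the SVMSignal encoding, which the abstract does not specify. The choice of the Kyte-Doolittle hydropathy scale, the 30-residue N-terminal region, the window size, and all function names are assumptions made for illustration only.

```python
# Illustrative only: one possible way to derive a biochemical profile from a
# protein sequence. The scale, window size, and names are assumptions and are
# not taken from the SVMSignal paper.

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=5):
    """Return the sliding-window mean hydropathy along a sequence.

    Residues missing from the scale (for example 'X') are scored as 0.0,
    which is an arbitrary neutral choice for this sketch.
    """
    values = [KD_HYDROPATHY.get(aa, 0.0) for aa in seq.upper()]
    if len(values) < window:
        return []
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def n_terminal_features(seq, n_term=30, window=5):
    """Summarize the hydropathy profile of the N-terminal region.

    Signal peptides sit at the N-terminus and contain a hydrophobic core,
    so the peak and mean of the profile are plausible (hypothetical) features.
    """
    profile = hydropathy_profile(seq[:n_term], window)
    if not profile:
        return {"max_hydropathy": 0.0, "mean_hydropathy": 0.0, "peak_position": -1}
    peak = max(profile)
    return {
        "max_hydropathy": peak,
        "mean_hydropathy": sum(profile) / len(profile),
        "peak_position": profile.index(peak),
    }


if __name__ == "__main__":
    # A made-up sequence with a hydrophobic stretch near the N-terminus.
    example = "MKKTLLALLALLAVSAQAADEGKTRSDE"
    print(n_terminal_features(example))
```

Feature dictionaries of this kind could be converted to vectors and passed to any support vector machine implementation. A hydrophobic transmembrane helix would produce a similar peak, which is one way to see why the two classes are easy to confuse and why a single property is unlikely to separate them.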
