Lipoprotein computational prediction in spirochaetal genomes.

Authors

Setubal João C, Reis Marcelo, Matsunaga James, Haake David A

Affiliations

Virginia Bioinformatics Institute, Virginia Tech, Bioinformatics 1, Box 0477, Blacksburg, VA 24060-0477, USA.

Laboratório de Bioinformática, Instituto de Computação, Universidade Estadual de Campinas, Caixa Postal 6076, Campinas, SP 13084-071, Brazil.

Publication

Microbiology (Reading). 2006 Jan;152(Pt 1):113-121. doi: 10.1099/mic.0.28317-0.

Abstract

Lipoproteins are of great interest in understanding the molecular pathogenesis of spirochaetes. Because spirochaete lipobox sequences exhibit more plasticity than those of other bacteria, application of existing prediction algorithms to emerging sequence data has been problematic. In this paper a novel lipoprotein prediction algorithm is described, designated SpLip, constructed as a hybrid of a lipobox weight matrix approach supplemented by a set of lipoprotein signal peptide rules allowing for conservative amino acid substitutions. Both the weight matrix and the rules are based on a training set of 28 experimentally verified spirochaetal lipoproteins. The performance of the SpLip algorithm was compared to that of the hidden Markov model-based LipoP program and the rules-based algorithm Psort for all predicted protein-coding genes of Leptospira interrogans sv. Copenhageni, L. interrogans sv. Lai, Borrelia burgdorferi, Borrelia garinii, Treponema pallidum and Treponema denticola. Psort sensitivity (13-35 %) was considerably less than that of SpLip (93-100 %) or LipoP (50-84 %) due in part to the requirement of Psort for Ala or Gly at the -1 position, a rule based on E. coli lipoproteins. The percentage of false-positive lipoprotein predictions by the LipoP algorithm (8-30 %) was greater than that of SpLip (0-1 %) or Psort (4-27 %), due in part to the lack of rules in LipoP excluding unprecedented amino acids such as Lys and Arg in the -1 position. This analysis revealed a higher number of predicted spirochaetal lipoproteins than was previously known. The improved performance of the SpLip algorithm provides a more accurate prediction of the complete lipoprotein repertoire of spirochaetes. The hybrid approach of supplementing weight matrix scoring with rules based on knowledge of protein secretion biochemistry may be a general strategy for development of improved prediction algorithms.
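The hybrid idea described in the abstract, a lipobox weight matrix whose hits are then filtered by signal peptide rules, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SpLip implementation: the toy training lipoboxes, the uniform background, the score threshold, the scan window and the n-region charge rule are all hypothetical. Only the exclusion of Lys and Arg at the -1 position is taken from the abstract.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical toy lipoboxes (positions -3, -2, -1, +1). These are NOT the
# 28 experimentally verified spirochaetal sequences used to train SpLip.
TOY_LIPOBOXES = ["LAGC", "LSAC", "VSGC", "LTAC", "ISSC", "FSNC"]


def build_weight_matrix(lipoboxes, pseudocount=1.0):
    """Return one {residue: log-odds} dict per lipobox column."""
    background = 1.0 / len(AMINO_ACIDS)  # uniform background (assumption)
    matrix = []
    for column in zip(*lipoboxes):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        matrix.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return matrix


def matrix_score(window, matrix):
    """Sum the per-column log-odds; unknown residues get the column minimum."""
    return sum(col.get(aa, min(col.values())) for aa, col in zip(window, matrix))


def passes_rules(protein, cys_index):
    """Illustrative rule filter applied to a candidate lipobox Cys."""
    # From the abstract: Lys and Arg are unprecedented at the -1 position.
    if protein[cys_index - 1] in "KR":
        return False
    # Assumed rule: a positively charged residue near the N-terminus.
    leader = protein[:cys_index - 3]
    return any(aa in "KR" for aa in leader[:7])


def predict_lipoprotein(protein, matrix, threshold=2.0, min_pos=10, max_pos=40):
    """Return (cys_index, score) of the best accepted lipobox, or None.

    threshold, min_pos and max_pos are hypothetical values for illustration.
    """
    best = None
    for i in range(max(min_pos, 3), min(max_pos, len(protein))):
        if protein[i] != "C" or not passes_rules(protein, i):
            continue
        score = matrix_score(protein[i - 3:i + 1], matrix)
        if score >= threshold and (best is None or score > best[1]):
            best = (i, score)
    return best


if __name__ == "__main__":
    matrix = build_weight_matrix(TOY_LIPOBOXES)
    # Made-up sequences: the second differs only by Lys at the -1 position.
    print(predict_lipoprotein("MKKILSIFLALSLLAGCNSKE", matrix))
    print(predict_lipoprotein("MKKILSIFLALSLLAKCNSKE", matrix))
```

In this toy setup the second sequence still scores above the threshold on the matrix alone and is rejected only by the -1 rule, which is the kind of false positive the abstract attributes to prediction without such rules.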
