Department of Chemistry and Biomolecular Sciences and ARC Centre of Excellence in Bioinformatics, Macquarie University, Sydney NSW 2109, Australia.
BMC Genomics. 2012;13 Suppl 7(Suppl 7):S8. doi: 10.1186/1471-2164-13-S7-S8. Epub 2012 Dec 13.
Helminths are organisms of major socio-economic importance, responsible for parasitic infections in humans, other animals and plants. These infections impose a significant public health and economic burden globally. As exceptions, some helminths such as Caenorhabditis elegans are free-living and serve as model organisms for studying parasitic infections. Excretory/secretory (ES) proteins play an important role in parasitic helminth infections, which makes them attractive therapeutic targets. For helminths, a large volume of expressed sequence tags (ESTs) has been generated to understand parasitism at the molecular level and to predict ES proteins for developing novel strategies to tackle parasitic infections. However, most predicted ES proteins are not available for further analysis, and no repository exists for them. Furthermore, predictions have, in the main, focussed on classical secretory pathways, although it is well established that helminth parasites also utilise non-classical secretory pathways.
We developed a free Helminth Secretome Database (HSD), which serves as a repository for ES proteins predicted to be secreted via classical and non-classical secretory pathways, from EST data for 78 helminth species (64 nematodes, 7 trematodes and 7 cestodes) ranging from parasitic to free-living organisms. Approximately 0.9 million ESTs compiled from the largest EST database, dbEST, were cleaned, assembled and analysed by different computational tools in our bioinformatics pipeline, and the predicted ES proteins were submitted to HSD.
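The idea of collecting both classically and non-classically secreted proteins into one repository can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Prediction` record, its two boolean flags, and the rule that a classical prediction takes precedence over a non-classical one are all assumptions made for illustration, and the actual tools and thresholds are not given in this abstract.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Prediction:
    """Hypothetical per-protein result from upstream prediction tools."""
    seq_id: str
    classical: bool       # assumed flag: predicted classically secreted
    non_classical: bool   # assumed flag: predicted non-classically secreted


def secretion_route(p: Prediction) -> Optional[str]:
    """Return the predicted route, or None if the protein is not predicted ES.

    Assumption: a classical prediction takes precedence when both flags are set.
    """
    if p.classical:
        return "classical"
    if p.non_classical:
        return "non-classical"
    return None


def build_secretome(predictions: Iterable[Prediction]) -> Dict[str, str]:
    """Keep only proteins predicted as ES, mapped to their predicted route."""
    secretome: Dict[str, str] = {}
    for p in predictions:
        route = secretion_route(p)
        if route is not None:
            secretome[p.seq_id] = route
    return secretome


# Species counts as stated in the text; the total should be 78.
SPECIES_COUNTS = {"nematodes": 64, "trematodes": 7, "cestodes": 7}
TOTAL_SPECIES = sum(SPECIES_COUNTS.values())
```

Restricting such a pipeline to the classical route alone would drop every protein for which only the `non_classical` flag is set, which is the gap in earlier predictions that the abstract describes.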
We report the large-scale prediction and analysis of classically and non-classically secreted ES proteins from diverse helminth organisms. All Unigenes (contigs and singletons) and ES protein datasets generated from this analysis are freely available. A BLAST server is available at http://estexplorer.biolinfo.org/hsd for checking the sequence similarity of new protein sequences against the predicted helminth ES proteins.