School of Computer Science and Engineering, Hebrew University of Jerusalem, Israel.
Bioinformatics. 2010 Sep 15;26(18):i482-8. doi: 10.1093/bioinformatics/btq375.
Animal toxins operate by binding to receptors and ion channels. These proteins are short and vary in sequence, structure and function. Sporadic discoveries have also revealed endogenous toxin-like proteins in non-venomous organisms. Viral proteins are the largest group of quickly evolving proteomes. We tested the hypothesis that toxin-like proteins exist in viruses and that they act to modulate functions of their hosts.
We updated and improved a machine-learning classifier for compact proteins that resemble short animal toxins. We applied it in a large-scale setting to identify toxin-like proteins among short viral proteins. Among the approximately 26 000 representatives of such short proteins, 510 sequences were positively identified. We focused on the 19 highest-scoring proteins. Among them, we identified conotoxin-like proteins, growth factor receptor-like proteins and antibacterial peptides. Our predictor was shown to enhance annotation inference for many 'uncharacterized' proteins. We conclude that our protocol can expose toxin-like proteins in unexplored niches, including metagenomics data, and enhance the systematic discovery of novel cell modulators for drug development.
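The screening flow described above (restrict to short proteins, score each one, keep positives, inspect the top of the ranking) can be sketched roughly as follows. This is only an illustration under stated assumptions: the length cutoff, the threshold and the function names are hypothetical, and the cysteine-fraction score is a simple stand-in, not the ClanTox model, whose features are not described in this abstract.

```python
from typing import Callable, Dict, List, Tuple

# Assumed cutoff for a "short" protein; the abstract does not state one.
MAX_LENGTH = 100


def cysteine_fraction(seq: str) -> float:
    """Placeholder score: fraction of cysteine residues (NOT the ClanTox model)."""
    return seq.upper().count("C") / len(seq) if seq else 0.0


def rank_short_proteins(
    proteins: Dict[str, str],
    score: Callable[[str], float] = cysteine_fraction,
    max_length: int = MAX_LENGTH,
    threshold: float = 0.0,
    top_k: int = 19,
) -> List[Tuple[str, float]]:
    """Keep short sequences, score them, and return the top_k positive hits."""
    # Step 1: restrict to short, non-empty sequences.
    short = {name: seq for name, seq in proteins.items() if 0 < len(seq) <= max_length}
    # Step 2: score every remaining sequence with the supplied scorer.
    scored = [(name, score(seq)) for name, seq in short.items()]
    # Step 3: keep only sequences scoring above the threshold ("positives").
    hits = [(name, s) for name, s in scored if s > threshold]
    # Step 4: rank by descending score and keep the highest-scoring entries.
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits[:top_k]


# Toy input; these sequences are invented for illustration only.
toy = {
    "cys_rich": "CCGCCA",
    "no_cys": "MKTAAA",
    "too_long": "C" * 10 + "A" * 200,
}
top_hits = rank_short_proteins(toy, threshold=0.1)
```

Passing the scorer as a parameter keeps the filtering and ranking steps independent of whichever trained classifier is actually used.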
ClanTox is available at http://www.clantox.cs.huji.ac.il.