Faculty of Bioscience Engineering, Laboratory for Bioinformatics and Computational Genomics, Ghent University, Ghent, Belgium.
J Proteome Res. 2012 May 4;11(5):2774-85. doi: 10.1021/pr201114m. Epub 2012 Mar 30.
Many genomes of nonmodel organisms have yet to be annotated. Peptidomics research on these organisms therefore cannot adopt the commonly used database-driven identification strategy, leaving the more difficult de novo sequencing approach as the only alternative. The reported tool uses the growing body of publicly or in-house available fragmentation spectra and sequences of (model) organisms to elucidate the identity of peptides in experimental spectra of nonannotated species. Clustering algorithms are implemented to infer the identity of unknown peak lists from their publicly or in-house available counterparts. The tool, which we call the HomClus-tool, can cope with post-translational modifications and amino acid substitutions. We applied it to LC-MALDI-TOF/TOF datasets from two locust species (Schistocerca gregaria and Locusta migratoria). Compared to a Mascot database search (using the available UniProt-KB proteins of these species), we were able to double the number of peptide identifications for both spectral sets. Known bioactive peptides from Drosophila melanogaster (i.e., fragmentation spectra generated in silico from them) were used as a starting point for clustering, with the aim of revealing their homologous counterparts among the experimental spectra.
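The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea, not the HomClus-tool itself. It assumes singly charged b- and y-ions, unit-width m/z binning, peak presence without intensities, and a simple nearest-reference assignment by cosine similarity in place of a full clustering algorithm. The peptide sequences, the function names, and the 0.4 threshold are hypothetical and chosen purely for illustration.

```python
from math import sqrt

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728   # mass of a proton (Da)
WATER = 18.01056   # mass of H2O (Da)


def theoretical_fragments(peptide):
    """Return sorted m/z values of singly charged b- and y-ions (in silico spectrum)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b-ion: N-terminal fragment
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y-ion: C-terminal fragment
    return sorted(ions)


def binned(mz_values, bin_width=1.0):
    """Convert a peak list into a sparse presence vector keyed by m/z bin index."""
    return {int(mz / bin_width): 1.0 for mz in mz_values}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_match(experimental_mz, reference_peptides, threshold=0.4):
    """Assign an unknown peak list to its most similar reference peptide.

    Returns (peptide, score); peptide is None if no score reaches the
    (hypothetical) threshold.
    """
    query = binned(experimental_mz)
    scored = [
        (cosine(query, binned(theoretical_fragments(p))), p)
        for p in reference_peptides
    ]
    score, peptide = max(scored)
    return (peptide if score >= threshold else None), score


# Hypothetical reference peptides standing in for known bioactive peptides.
references = ["PEPTIDEK", "GASWFYRH"]

# Simulated "experimental" peak list: a homologue of PEPTIDEK carrying one
# amino acid substitution (I -> V at position 5).
unknown_peaks = theoretical_fragments("PEPTVDEK")

match, score = best_match(unknown_peaks, references)
print(match, round(score, 2))
```

In this toy case the substituted peptide still shares the b-ions before the substitution site and the y-ions after it with its reference, which is why a similarity-based grouping can tolerate amino acid substitutions. Real spectra add intensities, noise, and precursor mass shifts, none of which this sketch models.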