Habermann Bianca, Oegema Jeffrey, Sunyaev Shamil, Shevchenko Andrej
Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, 01307 Dresden, Germany.
Mol Cell Proteomics. 2004 Mar;3(3):238-49. doi: 10.1074/mcp.M300073-MCP200. Epub 2003 Dec 26.
Mass spectrometry-driven BLAST (MS BLAST) is a database search protocol for identifying unknown proteins by sequence similarity to homologous proteins available in a database. MS BLAST utilizes redundant, degenerate, and partially inaccurate peptide sequence data obtained by de novo interpretation of tandem mass spectra and has become a powerful tool in functional proteomic research. Using computational modeling, we evaluated the potential of MS BLAST for proteome-wide identification of unknown proteins. We determined how the success rate of protein identification depends on the full-length sequence identity between the queried protein and its closest homologue in a database. We also estimated phylogenetic distances between organisms under study and related reference organisms with completely sequenced genomes that allow substantial coverage of unknown proteomes.