Leo McHugh, Jonathan W. Arthur
Discipline of Medicine and Sydney Bioinformatics, University of Sydney, Sydney, New South Wales, Australia.
PLoS Comput Biol. 2008 Feb;4(2):e12. doi: 10.1371/journal.pcbi.0040012.
Protein identification using mass spectrometry is an indispensable computational tool in the life sciences. A dramatic increase in the use of proteomic strategies to understand the biology of living systems generates an ongoing need for more effective, efficient, and accurate computational methods for protein identification. A wide range of computational methods, each with various implementations, is available to complement different proteomic approaches. A solid knowledge of the range of algorithms available and, more critically, of the accuracy and effectiveness of these techniques is essential to ensure that as many proteins as possible in any particular experiment are correctly identified. Here, we undertake a systematic review of the currently available methods and algorithms for interpreting, managing, and analyzing biological data associated with protein identification. We summarize the advances in computational solutions as they have responded to corresponding advances in mass spectrometry hardware. The evolution of scoring algorithms and metrics for automated protein identification is also discussed, with a focus on the relative performance of different techniques. We also consider the relative advantages and limitations of different techniques in particular biological contexts. Finally, we present our perspective on future developments in computational protein identification by considering the most recent literature on new and promising approaches to the problem, by identifying areas yet to be explored, and by noting the potential application of methods from other areas of computational biology.