Brief Bioinform. 2021 Mar 22;22(2):1620-1638. doi: 10.1093/bib/bbz163.
Statistically accurate protein identification is a cornerstone of proteomics and underpins the understanding and application of this technology across medicine and biology. Proteomics, a branch of biochemistry, has in recent years played a pivotal role in advancing the accurate characterization of the biology and interactions of groups of proteins, or proteomes. Proteomics has relied primarily on mass spectrometry (MS)-based techniques to identify proteins, although other techniques, including affinity-based identification, still play significant roles. Here, we outline the basics of MS to explain how data are generated and which parameters inform the computational tools used in protein identification. We then present a comprehensive analysis of the bioinformatics and computational methodologies used for protein identification in proteomics, including a discussion of the most current community-accepted metrics for validating an identification.