Serang Oliver, Noble William
Department of Genome Sciences, University of Washington, USA.
Stat Interface. 2012;5(1):3-20. doi: 10.4310/sii.2012.v5.n1.a2.
Tandem mass spectrometry has emerged as a powerful tool for the characterization of complex protein samples, an increasingly important problem in biology. The effort to efficiently and accurately perform inference on data from tandem mass spectrometry experiments has resulted in several statistical methods. We use a common framework to describe the predominant methods and discuss them in detail. These methods are classified using the following categories: set cover methods, iterative methods, and Bayesian methods. For each method, we analyze and evaluate the outcome and methodology of published comparisons to other methods; we use this comparison to comment on the qualities and weaknesses, as well as the overall utility, of all methods. We discuss the similarities between these methods and suggest directions for the field that would help unify these similar assumptions in a more rigorous manner and help enable efficient and reliable protein inference.