Biomolecular Medicine, Department of Surgery and Cancer, Faculty of Medicine, Sir Alexander Fleming Building, Imperial College London, London, SW7 2AZ, UK.
J Am Soc Mass Spectrom. 2012 May;23(5):779-91. doi: 10.1007/s13361-012-0340-z. Epub 2012 Feb 29.
The critical importance of employing sound statistical arguments when seeking to draw inferences from inexact measurements is well-established throughout the sciences. Yet fundamental statistical methods such as hypothesis testing can currently be applied to only a small subset of the data analytical problems encountered in LC/MS experiments. The means of inference that are more generally employed are based on a variety of heuristic techniques and a largely qualitative understanding of their behavior. In this article, we attempt to move towards a more formalized approach to the analysis of LC/TOFMS data by establishing some of the core concepts required for a detailed mathematical description of the data. Using arguments that are based on the fundamental workings of the instrument, we derive and validate a probability distribution that approximates that of the empirically obtained data and on the basis of which formal statistical tests can be constructed. Unlike many existing statistical models for MS data, the one presented here aims for rigor rather than generality. Consequently, the model is closely tailored to a particular type of TOF mass spectrometer although the general approach carries over to other instrument designs. Looking ahead, we argue that further improvements in our ability to characterize the data mathematically could enable us to address a wide range of data analytical problems in a statistically rigorous manner.
在寻求从不精确的测量中得出推论时,运用合理的统计论证的至关重要性在整个科学界都得到了充分证实。然而,假设检验等基本统计方法目前只能应用于 LC/MS 实验中遇到的数据分析问题的一小部分。更常用的推理方法基于各种启发式技术和对其行为的定性理解。在本文中,我们试图通过建立详细描述数据所需的一些核心概念,朝着更形式化的 LC/TOFMS 数据分析方法迈进。基于仪器的基本工作原理,我们推导出并验证了一个概率分布,该分布近似于经验获得的数据,并在此基础上可以构建正式的统计检验。与现有的许多 MS 数据统计模型不同,这里提出的模型旨在追求严谨性而不是通用性。因此,该模型虽然可以推广到其他仪器设计,但它是针对特定类型的 TOF 质谱仪量身定制的。展望未来,我们认为,进一步提高我们对数据进行数学描述的能力,将使我们能够以统计学上严格的方式解决广泛的数据分析问题。