Hendriks Margriet M W B, Smit Suzanne, Akkermans Wies L M W, Reijmers Theo H, Eilers Paul H C, Hoefsloot Huub C J, Rubingh Carina M, de Koster Chris G, Aerts Johannes M, Smilde Age K
ABC Metabolomics Centre, University of Utrecht, Utrecht, The Netherlands.
Proteomics. 2007 Oct;7(20):3672-80. doi: 10.1002/pmic.200700046.
SELDI-TOF-MS is rapidly gaining popularity as a screening tool for clinical applications of proteomics. Adequate statistical techniques must be applied at every stage from measurement to information. One statistical method often used in proteomics is classification: the assignment of subjects to discrete categories, for example healthy or diseased. Many new classification methods have been developed recently, often specifically for the analysis of X-omics data. In proteomics studies a good strategy for evaluating classification results is of prime importance, because the number of objects is usually small and it would be wasteful to set aside part of them as a 'mere' test set. The present paper offers such a strategy in the form of a protocol that can be used to choose among different statistical classification methods and to obtain figures of merit for their performance. The paper also illustrates the usefulness of proteomics in a clinical setting, using serum samples from Gaucher disease patients, when combined with an appropriate classification method.
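The abstract does not spell out the protocol itself. As an illustration only, one common way to choose among classification methods and still obtain an honest performance estimate without a separate test set is double (nested) cross-validation: an inner loop picks the method, and an outer loop estimates the error of that whole selection procedure. The sketch below assumes this approach. Every function name, both toy classifiers, and the synthetic data are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_centroid(train_X, train_y, x):
    """Predict the class whose training mean lies closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))


def one_nearest_neighbour(train_X, train_y, x):
    """Predict the label of the single closest training sample."""
    nearest = min(range(len(train_X)), key=lambda i: sq_dist(x, train_X[i]))
    return train_y[nearest]


# Candidate methods to choose among (hypothetical stand-ins).
METHODS = {"nearest_centroid": nearest_centroid, "1-NN": one_nearest_neighbour}


def folds(indices, k):
    """Split indices into k disjoint folds (not stratified by class)."""
    return [indices[i::k] for i in range(k)]


def count_errors(method, X, y, train, test):
    """Train on `train`, return the number of misclassified `test` samples."""
    train_X = [X[i] for i in train]
    train_y = [y[i] for i in train]
    return sum(method(train_X, train_y, X[i]) != y[i] for i in test)


def cv_errors(method, X, y, indices, k):
    """Total misclassifications of one method in k-fold CV over `indices`."""
    total = 0
    for held_out in folds(indices, k):
        train = [i for i in indices if i not in held_out]
        total += count_errors(method, X, y, train, held_out)
    return total


def double_cross_validation(X, y, outer_k=5, inner_k=4, seed=0):
    """Outer loop estimates error; inner loop selects the method."""
    indices = list(range(len(y)))
    random.Random(seed).shuffle(indices)
    wrong, chosen = 0, []
    for test in folds(indices, outer_k):
        rest = [i for i in indices if i not in test]
        # Method selection sees only `rest`, never the outer test fold.
        best = min(METHODS,
                   key=lambda name: cv_errors(METHODS[name], X, y, rest, inner_k))
        chosen.append(best)
        wrong += count_errors(METHODS[best], X, y, rest, test)
    return wrong / len(y), chosen


def make_toy_data(n_per_class=20, n_features=5, shift=2.0, seed=1):
    """Synthetic two-class data; NOT real SELDI-TOF-MS spectra."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += shift * label  # only the first two features
            row[1] += shift * label  # carry class information
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    error, chosen = double_cross_validation(X, y)
    print(f"outer-loop error rate: {error:.2f}")
    print("method chosen per outer fold:", chosen)
```

Because each outer test fold is never used for method selection, the outer-loop error rate serves as a figure of merit for the complete procedure, and every sample is used for both training and testing at some point. With very small or unbalanced groups, stratified folds would be a sensible refinement that this sketch omits.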