Schymanski Emma L, Neumann Steffen
Eawag: Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, Dübendorf CH-8600, Switzerland.
IPB: Leibniz Institute of Plant Biochemistry, Department of Stress and Developmental Biology, Weinberg 3, Halle (Saale) DE-06120, Germany.
Metabolites. 2013 Jun 25;3(3):517-38. doi: 10.3390/metabo3030517.
The Critical Assessment of Small Molecule Identification (CASMI) contest was founded in 2012 to provide scientists with a common open dataset for evaluating their identification methods. This article presents the challenges and solutions for the inaugural CASMI 2012. The contest was split into four categories, corresponding to two tasks (determining the molecular formula and determining the molecular structure), each applied to two measurement types: liquid chromatography-high resolution mass spectrometry (LC-HRMS), where preference was given to high mass accuracy data, and gas chromatography-electron impact-mass spectrometry (GC-MS), i.e., unit accuracy data. The challenges were obtained from plant material, environmental samples and reference standards. It was surprisingly difficult to obtain data suitable for a contest, especially for GC-MS, where existing databases are very large. The level of difficulty of the challenges therefore varies considerably. The challenges and their answers are discussed, and recommendations are given for challenge selection in subsequent CASMI contests.