Analytical Chemistry, Van't Hoff Institute for Molecular Sciences, University of Amsterdam, P.O. Box 94720, 1090 GE Amsterdam, The Netherlands.
Department of Analytics and Statistics, DSM Resolve, 6167 RD Geleen, The Netherlands.
Anal Chem. 2017 Jan 17;89(2):1212-1221. doi: 10.1021/acs.analchem.6b03678. Epub 2016 Dec 29.
In this work, a novel probabilistic untargeted feature detection algorithm for liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) using an artificial neural network (ANN) is presented. The feature detection process is approached as a pattern recognition problem, and the ANN is therefore used as an efficient feature recognition tool. Unlike most existing feature detection algorithms, this approach allows any suspected chromatographic profile (i.e., peak shape) to be incorporated simply by training the network, avoiding computationally expensive regression against specific mathematical models. In addition, we show that the high-resolution raw data can be fully utilized without applying any arbitrary thresholds or data reduction, thereby improving the sensitivity of the method for compound identification purposes. Furthermore, in contrast to existing deterministic (binary) approaches, this method estimates the probability of a feature being present or absent at a given point of interest, so that all data points can be propagated down the data analysis pipeline, weighted by their probability. The algorithm was tested on data sets generated from spiked samples in forensic and food safety contexts and showed promising results, detecting features for all compounds in a computationally reasonable time.
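The idea of treating feature detection as pattern recognition with a probabilistic output can be illustrated with a minimal sketch. This is not the authors' implementation: the window length, the Gaussian peak model, the synthetic training data, the network size, and all names (`TinyNet`, `gaussian_peak`, `make_training_set`) are assumptions chosen only for illustration. A small network is trained on example peak shapes instead of fitting a mathematical model by regression, and it returns a probability that can be carried forward as a weight rather than a yes/no decision.

```python
import math
import random

WINDOW = 9  # points per chromatographic window (assumed for illustration)


def gaussian_peak(amplitude, width=1.5, n=WINDOW):
    """Idealised Gaussian elution profile centred in the window (assumed shape)."""
    centre = (n - 1) / 2
    return [amplitude * math.exp(-0.5 * ((i - centre) / width) ** 2) for i in range(n)]


def make_training_set(rng, n_samples=200, noise_sd=0.05):
    """Synthetic windows: label 1 = peak plus noise, label 0 = noise only."""
    data = []
    for k in range(n_samples):
        label = k % 2
        base = gaussian_peak(rng.uniform(0.5, 1.0)) if label else [0.0] * WINDOW
        data.append(([v + rng.gauss(0.0, noise_sd) for v in base], label))
    return data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class TinyNet:
    """One tanh hidden layer and a sigmoid output, trained by plain SGD."""

    def __init__(self, rng, n_in=WINDOW, n_hidden=3):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        """Estimated probability that the window contains a feature."""
        h = self._hidden(x)
        return sigmoid(sum(w * v for w, v in zip(self.w2, h)) + self.b2)

    def train(self, data, rng, epochs=50, lr=0.2):
        data = list(data)
        for _ in range(epochs):
            rng.shuffle(data)
            for x, y in data:
                h = self._hidden(x)
                p = sigmoid(sum(w * v for w, v in zip(self.w2, h)) + self.b2)
                delta_out = p - y  # cross-entropy gradient at the output pre-activation
                for j, hj in enumerate(h):
                    # Back-propagate through tanh using the not-yet-updated w2[j]
                    delta_h = delta_out * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * delta_out * hj
                    self.b1[j] -= lr * delta_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * delta_h * xi
                self.b2 -= lr * delta_out


rng = random.Random(0)  # fixed seed so the sketch is reproducible
net = TinyNet(rng)
net.train(make_training_set(rng), rng)

p_peak = net.predict(gaussian_peak(0.8))   # clean peak-shaped window
p_noise = net.predict([0.0] * WINDOW)      # flat baseline window

# Probabilistic rather than binary: every point is kept and carries its weight
weighted_apex = p_peak * 0.8
```

Supporting a different peak shape in this sketch would only require changing the training examples produced by `make_training_set`, which mirrors the abstract's point that profiles are incorporated by training rather than by model-specific regression. No window is discarded here: downstream steps would receive each window together with its estimated probability.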