A stochastic approach for parameter optimization of feature detection algorithms for non-target screening in mass spectrometry.

Author information

Sadia Mohammad, Boudguiyer Youssef, Helmus Rick, Seijo Marianne, Praetorius Antonia, Samanipour Saer

Affiliations

Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands.

Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Amsterdam, The Netherlands.

Publication information

Anal Bioanal Chem. 2024 Jul 12. doi: 10.1007/s00216-024-05425-3.

Abstract

Feature detection plays a crucial role in non-target screening (NTS) and requires careful selection of algorithm parameters to minimize false positive (FP) features. In this study, a stochastic approach was employed to optimize the parameter settings of feature detection algorithms used to process high-resolution mass spectrometry data. The approach was demonstrated using four open-source algorithms (OpenMS, SAFD, XCMS, and KPIC2) within the patRoon software platform, applied to extracts from drinking water samples spiked with 46 per- and polyfluoroalkyl substances (PFAS). The method is based on random sampling from the variable space and uses Pearson correlation to assess the impact of each parameter on the number of detected suspect analytes. The optimized parameters improved algorithm performance by increasing suspect hits in the case of SAFD and XCMS, and by reducing the total number of detected features (i.e., minimizing FP) for OpenMS. These improvements were further validated on three different drinking water samples used as a test dataset. The optimized parameters resulted in a lower false discovery rate (FDR%) than the default parameters, effectively increasing the detection of true positive features. This work also highlights the necessity of optimizing algorithm parameters before starting NTS in order to reduce the complexity of such datasets.
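The stochastic strategy described in the abstract (random sampling from the parameter space, then Pearson correlation between each parameter and the number of suspect hits) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the parameter names, their ranges, and the `toy_suspect_hits` function are hypothetical stand-ins, and in practice the hit count would come from an actual feature detection run (for example through patRoon) rather than from a formula.

```python
import math
import random

# Hypothetical parameter ranges for illustration only; these names are
# not taken from OpenMS, SAFD, XCMS, or KPIC2.
PARAM_RANGES = {
    "min_intensity": (100.0, 10000.0),
    "mz_tolerance_ppm": (1.0, 20.0),
    "min_peak_width_s": (2.0, 30.0),
}


def sample_parameters(rng, ranges):
    """Draw one parameter set uniformly at random from the given ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / (sd_x * sd_y)


def toy_suspect_hits(params):
    """Made-up response standing in for a real feature detection run.

    Toy assumption: hits fall linearly as the intensity threshold rises,
    and the other parameters have no effect. The value is left
    continuous to keep the example simple.
    """
    return 46.0 * (1.0 - params["min_intensity"] / 10000.0)


def parameter_impact(run_detection, ranges, n_samples=200, seed=0):
    """Correlate each randomly sampled parameter with the suspect hit count."""
    rng = random.Random(seed)
    samples = [sample_parameters(rng, ranges) for _ in range(n_samples)]
    hits = [run_detection(p) for p in samples]
    return {
        name: pearson([s[name] for s in samples], hits)
        for name in ranges
    }


impact = parameter_impact(toy_suspect_hits, PARAM_RANGES)
for name, r in impact.items():
    print(f"{name}: r = {r:+.3f}")
```

In this toy setup only `min_intensity` drives the response, so it shows a strong negative correlation while the other two parameters stay near zero. With real data, parameters with a large absolute correlation would be the natural candidates for tuning.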
