Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, NC, USA.
Agilent Technologies, Inc., Santa Clara, CA, USA.
Anal Bioanal Chem. 2021 Dec;413(30):7495-7508. doi: 10.1007/s00216-021-03713-w. Epub 2021 Oct 14.
With the increasing availability of high-resolution mass spectrometers, suspect screening and non-targeted analysis are becoming popular compound identification tools for environmental researchers. Samples of interest often contain a large (unknown) number of chemicals spanning the detectable mass range of the instrument. To separate these chemicals before they enter the mass spectrometer, a chromatography method is often used. Numerous types of gas and liquid chromatographs can be coupled to commercially available mass spectrometers. Depending on the instrument used for analysis, the researcher is likely to observe a different subset of compounds, determined by the amenability of those chemicals to the selected experimental techniques and equipment. Predicting this subset before conducting the experiment would help minimize potential false-positive and false-negative identifications. In this work, we use experimental datasets to predict the amenability of chemical compounds to detection with liquid chromatography-electrospray ionization-mass spectrometry (LC-ESI-MS). The assembled dataset totals 5517 unique chemicals, each either explicitly detected or not detected with LC-ESI-MS. The resulting detected/not-detected matrix was modeled using specific molecular descriptors to predict which chemicals are amenable to LC-ESI-MS, and in which ionization mode(s). Random forest models were successfully developed for both the positive and negative modes of the electrospray ionization source, each including a measure of the model's applicability domain. The outcome of this work will help inform future suspect screening and non-targeted analyses by better defining the landscape of chemicals potentially detectable by LC-ESI-MS.
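To make the data layout concrete, the following is a minimal sketch, not the authors' implementation. It shows one way a detected/not-detected matrix with separate positive-mode and negative-mode labels could be represented, and one common way to flag whether a new chemical falls inside a model's applicability domain: its distance to the nearest training chemical in descriptor space. The abstract does not state which applicability-domain measure was used, so the nearest-neighbour rule, the threshold, the chemical names, and the descriptor values here are all assumptions chosen for illustration. The random forest classifier itself is omitted.

```python
import math

# Hypothetical toy records, NOT data from the study:
# (chemical id, descriptor vector, detected in ESI+, detected in ESI-)
TRAINING = [
    ("chem_A", (0.1, 0.2), 1, 0),
    ("chem_B", (0.2, 0.1), 1, 1),
    ("chem_C", (0.9, 0.8), 0, 1),
    ("chem_D", (0.8, 0.9), 0, 0),
]


def ionization_forms(pos, neg):
    """Turn the two binary labels into the list of amenable ionization modes."""
    forms = []
    if pos:
        forms.append("ESI+")
    if neg:
        forms.append("ESI-")
    return forms


def nearest_distance(query, training):
    """Euclidean distance from the query descriptors to the closest training chemical."""
    return min(math.dist(query, desc) for _, desc, _, _ in training)


def in_domain(query, training, threshold=0.3):
    """Assumed applicability-domain rule: inside if the nearest training
    chemical lies within `threshold` (an arbitrary illustrative value)."""
    return nearest_distance(query, training) <= threshold


if __name__ == "__main__":
    for name, _, pos, neg in TRAINING:
        print(name, ionization_forms(pos, neg) or ["not detected"])
    print(in_domain((0.15, 0.15), TRAINING))  # close to chem_A and chem_B
    print(in_domain((0.5, 0.5), TRAINING))    # far from every training chemical
```

In a workflow of this kind, a prediction for a chemical flagged as outside the domain would be treated with less confidence than one for a chemical close to the training set.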