McEachran Andrew D, Mansouri Kamel, Grulke Chris, Schymanski Emma L, Ruttkies Christoph, Williams Antony J
Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Research Triangle Park, NC, 27711, USA.
National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Mail Drop D143-02, 109 T.W. Alexander Dr., Research Triangle Park, NC, 27711, USA.
J Cheminform. 2018 Aug 30;10(1):45. doi: 10.1186/s13321-018-0299-2.
Chemical database searching has become a fixture in many non-targeted identification workflows based on high-resolution mass spectrometry (HRMS). However, the form of a chemical structure observed in HRMS does not always match the form stored in a database (e.g., the neutral form versus a salt; one component of a mixture rather than the mixture form used in a consumer product). Linking the form of a structure observed via HRMS to its related form(s) within a database enables the return of all relevant variants of a structure, as well as the related metadata, in a single query. A Konstanz Information Miner (KNIME) workflow has been developed to produce the structural representations observed using HRMS ("MS-Ready structures") and link them to those stored in a database. These MS-Ready structures, and the associated mappings to the full chemical representations, are surfaced via the US EPA's CompTox Chemistry Dashboard (https://comptox.epa.gov/dashboard/). This article describes the workflow for the generation and linking of ~700,000 MS-Ready structures (derived from ~760,000 original structures), as well as the download, search and export capabilities that support structure identification using HRMS. The importance of this form of structural representation for HRMS is demonstrated with several examples, including integration with the in silico fragmentation software application MetFrag. The structures and the search, download and export functionality are all available through the CompTox Chemistry Dashboard, while the MetFrag implementation can be viewed at https://msbi.ipb-halle.de/MetFragBeta/.
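The linking idea can be illustrated with a minimal Python sketch. It is not the KNIME workflow described above: it only splits multi-component SMILES strings on "." and drops entries from a small, hypothetical counterion list, then maps each remaining component back to the database records it came from. The record IDs, the `COUNTERIONS` set, and both function names are assumptions made for illustration. Real structure standardization (for example charge neutralization or canonicalization) needs a cheminformatics toolkit and is not attempted here.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny counterion list (assumption for illustration);
# a real workflow would rely on a curated list and proper structure handling.
COUNTERIONS = {"[Na+]", "[K+]", "[Cl-]", "[Br-]", "Cl", "Br"}


def ms_ready_components(smiles):
    """Split a SMILES string on '.' and drop the listed counterions.

    Components are compared as plain strings, so two different SMILES
    spellings of the same structure would NOT be merged by this sketch.
    """
    parts = [p for p in smiles.split(".") if p]
    return sorted({p for p in parts if p not in COUNTERIONS})


def build_links(records):
    """Map each remaining component to the set of record IDs that contain it."""
    links = defaultdict(set)
    for record_id, smiles in records.items():
        for component in ms_ready_components(smiles):
            links[component].add(record_id)
    return dict(links)


# Illustrative records: a neutral structure and its hydrochloride salt.
records = {
    "REC-1": "CN(C)CCOC(c1ccccc1)c1ccccc1",
    "REC-2": "CN(C)CCOC(c1ccccc1)c1ccccc1.Cl",
}

links = build_links(records)
for component, record_ids in links.items():
    print(component, sorted(record_ids))
```

Here a single query on the shared component returns both the neutral record and the salt record, which is the behaviour the MS-Ready mapping is meant to provide.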