In silico MS/MS spectra for identifying unknowns: a critical examination using CFM-ID algorithms and ENTACT mixture samples.

Affiliations

Oak Ridge Institute for Science and Education (ORISE) Participant, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27711, USA.

Student Contractor, U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, 109 T.W. Alexander Drive, Research Triangle Park, NC, 27711, USA.

Publication information

Anal Bioanal Chem. 2020 Feb;412(6):1303-1315. doi: 10.1007/s00216-019-02351-7. Epub 2020 Jan 22.

Abstract

High-resolution mass spectrometry (HRMS) enables rapid chemical annotation via accurate mass measurements and matching of experimentally derived spectra with reference spectra. Reference libraries are generated from chemical standards and are therefore limited in size relative to known chemical space. To address this limitation, in silico spectra (i.e., MS/MS or MS2 spectra), predicted via Competitive Fragmentation Modeling-ID (CFM-ID) algorithms, were generated for compounds within the U.S. Environmental Protection Agency's (EPA) Distributed Structure-Searchable Toxicity (DSSTox) database (totaling, at the time of analysis, ~ 765,000 substances). Experimental spectra from EPA's Non-Targeted Analysis Collaborative Trial (ENTACT) mixtures (n = 10) were then used to evaluate the performance of the in silico spectra. Overall, MS2 spectra were acquired for 377 unique compounds from the ENTACT mixtures. Approximately 53% of these compounds were correctly identified using a commercial reference library, whereas up to 50% were correctly identified as the top hit using the in silico library. Together, the reference and in silico libraries were able to correctly identify 73% of the 377 ENTACT substances. When using the in silico spectra for candidate filtering, an examination of binary classifiers showed a true positive rate (TPR) of 0.90 associated with false positive rates (FPRs) of 0.10 to 0.85, depending on the sample and method of candidate filtering. Taken together, these findings show the abilities of in silico spectra to correctly identify true positives in complex samples (at rates comparable to those observed with reference spectra), and efficiently filter large numbers of potential false positives from further consideration. Graphical abstract.
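The binary-classifier evaluation described in the abstract can be illustrated with a minimal sketch. It assumes each candidate structure carries a single match score against an in silico spectrum and that candidates at or above a cutoff are retained. The function name `tpr_fpr`, the scores, the labels, and the cutoff are all hypothetical and are not taken from the study; the actual scoring and filtering methods are not specified in the abstract.

```python
from typing import Sequence, Tuple


def tpr_fpr(scores: Sequence[float], is_true: Sequence[bool],
            cutoff: float) -> Tuple[float, float]:
    """Return (TPR, FPR) when candidates scoring >= cutoff are retained.

    TPR = TP / (TP + FN): fraction of correct candidates that are retained.
    FPR = FP / (FP + TN): fraction of incorrect candidates that are retained.
    """
    tp = fp = fn = tn = 0
    for score, truth in zip(scores, is_true):
        kept = score >= cutoff
        if truth and kept:
            tp += 1
        elif truth:
            fn += 1
        elif kept:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical candidate list: match scores and whether each candidate
# is the correct structure (illustrative values only).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
is_true = [True, True, False, True, False, False, False, False]

tpr, fpr = tpr_fpr(scores, is_true, cutoff=0.35)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Sweeping the cutoff and recording each (FPR, TPR) pair gives the kind of trade-off the abstract reports, where a fixed TPR of 0.90 corresponds to different FPRs depending on the sample and the filtering method.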

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ae6/7021669/105ea83214d9/216_2019_2351_Figa_HTML.jpg