Discovery of False Identification Using Similarity Difference in GC-MS based Metabolomics.

Authors

Kim Seongho, Zhang Xiang

Affiliations

Biostatistics Core, Karmanos Cancer Institute, Department of Oncology, Wayne State University, Detroit, MI, 48201, USA.

Department of Chemistry, University of Louisville, Louisville, KY, 40292, USA.

Publication

J Chemom. 2015 Feb 1;29(2):80-86. doi: 10.1002/cem.2665.

Abstract

Compound identification is a critical process in metabolomics. The widely used approach to compound identification in gas chromatography-mass spectrometry (GC-MS) based metabolomics is spectrum matching, in which the mass spectral similarity between an experimental mass spectrum and each mass spectrum in a reference library is calculated. While various similarity measures have been developed to improve the overall accuracy of compound identification, little attention has been paid to reducing the false discovery rate. We therefore develop an approach for controlling the false identification rate using the distribution of the difference between the first and second highest spectral similarity scores. We further propose a model-based approach to achieving a desired true positive rate. The developed method is applied to the NIST mass spectral library, and its performance is compared with the conventional approach that uses only the maximum spectral similarity score. The results show that the developed method achieves a significantly higher F1 score and positive predictive value than the conventional approach.
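The idea of scoring an identification by the gap between the best and second-best library matches can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which similarity measure or cutoff is used, so cosine similarity, the `min_difference` threshold, the function names, and the toy spectra below are all assumptions chosen for illustration. The model-based selection of the threshold is not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts.

    Assumption: cosine similarity stands in for whatever measure is actually used.
    """
    dot = sum(a[mz] * b[mz] for mz in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(query, library, min_difference):
    """Match a query spectrum against a library and apply the difference rule.

    Returns (best_name, best_score, difference, accepted), where difference is
    the best score minus the second-best score and accepted is True only when
    that difference reaches min_difference (a hypothetical, user-chosen cutoff).
    """
    if len(library) < 2:
        raise ValueError("need at least two library spectra to compute a difference")
    scores = sorted(
        ((cosine_similarity(query, spec), name) for name, spec in library.items()),
        reverse=True,
    )
    (best_score, best_name), (second_score, _) = scores[0], scores[1]
    difference = best_score - second_score
    return best_name, best_score, difference, difference >= min_difference


def ppv_and_f1(tp, fp, fn):
    """Positive predictive value and F1 score from confusion counts."""
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * ppv * recall / (ppv + recall) if (ppv + recall) else 0.0
    return ppv, f1


# Toy library (made-up spectra): compound_a and compound_b are nearly identical.
library = {
    "compound_a": {73: 100, 147: 50, 205: 20},
    "compound_b": {73: 100, 147: 48, 205: 22},
    "compound_c": {57: 100, 71: 60},
}

if __name__ == "__main__":
    query = {73: 100, 147: 50, 205: 20}
    print(identify(query, library, min_difference=0.05))
```

In this toy library the query matches `compound_a` with a very high score, yet the near-identical `compound_b` scores almost as well, so the difference rule flags the match as ambiguous even though a rule based only on the maximum score would accept it.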

