Evaluation of Time-of-Flight Secondary Ion Mass Spectrometry Spectra of Peptides by Random Forest with Amino Acid Labels: Results from a Versailles Project on Advanced Materials and Standards Interlaboratory Study.

Affiliations

Faculty of Science and Technology, Seikei University, Musashino, Tokyo 180-8633, Japan.

National Metrology Institute of Japan (NMIJ), National Institute of Advanced Industrial Science and Technology (AIST), 1-1-1 Umezono, Tsukuba, Ibaraki 305-8568, Japan.

Publication information

Anal Chem. 2021 Mar 9;93(9):4191-4197. doi: 10.1021/acs.analchem.0c04577. Epub 2021 Feb 26.

Abstract

We report the results of a VAMAS (Versailles Project on Advanced Materials and Standards) interlaboratory study on identifying the TOF-SIMS spectra of peptide samples by machine learning. More than 1000 time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of six peptide model samples (one of which was a test sample) were collected using 27 TOF-SIMS instruments at 25 institutes in six countries: the U.S., the U.K., Germany, China, South Korea, and Japan. Peptides were selected as model samples because they have systematic and simple chemical structures. The peak intensities in every TOF-SIMS spectrum were extracted using the same peak list and normalized to the total ion count. The spectra of the test peptide sample were predicted by Random Forest with 20 amino acid labels. The accuracy of the prediction for the test spectra was 0.88. Although the prediction for an unknown peptide was not perfect, the study showed that all of the amino acids in an unknown peptide can be determined from Random Forest prediction and the TOF-SIMS spectra. Moreover, the prediction for peptides included in the training spectra was almost perfect. Among the important features, Random Forest also suggested specific fragment ions from the amino acid residue Q (glutamine), whose TOF-SIMS fragment ions have not previously been reported. This study indicated that analysis using Random Forest, which enables mathematical relationships to be translated into chemical relationships, together with multiple labels representing monomer chemical structures, is useful for predicting the TOF-SIMS spectra of an unknown peptide.
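The two preprocessing ideas in the abstract, normalizing each spectrum to its total ion count and describing each peptide with 20 amino acid labels, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the example counts, the peptide "GQK", and the label-wise accuracy metric are hypothetical and are not taken from the study, which does not state in this abstract how the 0.88 accuracy was computed. The classifier itself is omitted; the resulting feature and label vectors could be passed to any multi-label classifier, such as a Random Forest implementation.

```python
# Illustrative sketch only; peptide sequences and counts below are made up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def normalize_to_total_ion_count(intensities):
    """Divide each peak intensity by the sum over the shared peak list."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no counts")
    return [value / total for value in intensities]


def amino_acid_labels(sequence):
    """Return a 20-element 0/1 vector: 1 if the residue occurs in the peptide."""
    present = set(sequence.upper())
    unknown = present - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unexpected residue codes: {sorted(unknown)}")
    return [1 if aa in present else 0 for aa in AMINO_ACIDS]


def label_accuracy(true_labels, predicted_labels):
    """Fraction of per-amino-acid decisions that match (an assumed metric)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label vectors must have the same length")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


if __name__ == "__main__":
    spectrum = [120, 30, 50]          # hypothetical raw counts for three peaks
    print(normalize_to_total_ion_count(spectrum))
    print(amino_acid_labels("GQK"))   # hypothetical peptide sequence
```

Encoding peptides as presence/absence labels per amino acid, rather than one class per peptide, is what would let a model trained on some peptides say something about a peptide it has never seen, which matches the abstract's description of predicting an unknown test sample.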
