Department of Computer Science, Western Michigan University, MI, USA.
School of Computing and Information Sciences, Florida International University, Miami, FL, USA.
Proteomics. 2018 Oct;18(20):e1800206. doi: 10.1002/pmic.201800206. Epub 2018 Sep 28.
Mass spectrometry (MS)-based proteomics has become an essential tool in the study of proteins. Modern MS instruments generate huge amounts of data, which can only be processed by novel algorithmic tools. However, in the absence of data benchmarks and ground-truth datasets, testing the integrity and reproducibility of these algorithms is a challenging problem. To this end, MaSS-Simulator is presented: an easy-to-use simulator that can be configured to simulate MS/MS datasets for a wide variety of conditions with known ground truths. MaSS-Simulator offers many configuration options that give the user a great degree of control over the test datasets, which enables rigorous, large-scale testing of any proteomics algorithm. MaSS-Simulator is assessed by comparing its output against experimentally generated spectra and spectra obtained from the NIST spectral library collections. The results show that spectra generated by MaSS-Simulator match real spectra closely and have a relative-error distribution centered around 25%. In contrast, the theoretical spectra for the same peptides have a relative-error distribution centered around 150%. MaSS-Simulator will enable developers to highlight the specific capabilities of their algorithms and will provide strong evidence of any pitfalls they might face. Source code, executables, and a user manual for MaSS-Simulator can be downloaded from https://github.com/pcdslab/MaSS-Simulator.
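For readers unfamiliar with the "theoretical spectra" used as the baseline above, the following is a minimal sketch of how such a spectrum is commonly computed. It assumes the usual convention of singly charged b- and y-ions with monoisotopic masses and no intensities or noise; the abstract does not say which convention the authors used, and the function name `theoretical_spectrum` is hypothetical and is not part of MaSS-Simulator.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def theoretical_spectrum(peptide):
    """Return sorted m/z values of singly charged b- and y-ions.

    Assumption: unmodified peptide, charge 1+, b/y ions only.
    This is an illustrative baseline, not the simulator's own model.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    peaks = []
    # Each cleavage site i splits the peptide into an N-terminal
    # fragment (b-ion) and a C-terminal fragment (y-ion).
    for i in range(1, len(masses)):
        b_ion = sum(masses[:i]) + PROTON
        y_ion = sum(masses[i:]) + WATER + PROTON
        peaks.extend([b_ion, y_ion])
    return sorted(peaks)


if __name__ == "__main__":
    for mz in theoretical_spectrum("PEPTIDE"):
        print(f"{mz:.4f}")
```

A spectrum built this way contains only idealized fragment positions, which is one plausible reason it sits further from experimental spectra than a simulated spectrum does.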