Van 't Hoff Institute for Molecular Sciences (HIMS), University of Amsterdam, Science Park 904, Amsterdam 1098 XH, The Netherlands.
Queensland Alliance for Environmental Health Sciences (QAEHS), The University of Queensland, Woolloongabba, Queensland 4102, Australia.
Anal Chem. 2021 Dec 14;93(49):16562-16570. doi: 10.1021/acs.analchem.1c03755. Epub 2021 Nov 29.
Centroiding is one of the major approaches used to reduce the size of the data generated by high-resolution mass spectrometry. During centroiding, performed either during acquisition or as a pre-processing step, each mass peak profile is represented by a single value (i.e., the centroid). While effective in reducing the data size, centroiding also reduces the information density present in the mass peak profile. Moreover, the individual steps of the centroiding process and their consequences for the final results may not be completely clear. Here, we present Cent2Prof, a package containing two algorithms that enable the conversion of centroided data to mass peak profile data and vice versa. The centroiding algorithm uses the resolution-based mass peak width parameter as the first guess and self-adjusts to fit the data. In addition to the m/z values, the centroiding algorithm also generates the measured mass peak widths at half-height, which can be used during feature detection and identification. The mass peak profile prediction algorithm employs a random-forest model to predict mass peak widths, which are subsequently used for mass profile reconstruction. The centroiding results were compared to the outputs of the MZmine-implemented centroiding algorithm. Our algorithm produced false detection rates of ≤5%, whereas the MZmine algorithm produced a 30% false-positive rate and a 3% false-negative rate. The error in profile prediction was ≤56% independent of the mass, ionization mode, and intensity, which was 6 times more accurate than the resolution-based estimated values.
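As an illustration of the two conversions described above, the following is a minimal Python sketch and not the Cent2Prof implementation. It rests on several assumptions that do not come from the abstract: the peak shape is taken to be Gaussian, the peak width comes from the resolution-based first guess (FWHM = (m/z) / R) rather than from a random-forest model, and centroiding is done with a plain intensity-weighted mean. All function names and the example values (m/z 500, R = 50,000) are hypothetical.

```python
import math


def resolution_fwhm(mz: float, resolution: float) -> float:
    """Resolution-based first guess of the peak width: FWHM = (m/z) / R."""
    return mz / resolution


def gaussian_profile(mz, intensity, fwhm, n_points=41, span=3.0):
    """Rebuild a mass peak profile around one centroid.

    Assumes a Gaussian peak shape (an assumption of this sketch) and samples
    it on an even grid covering +/- span * FWHM around the centroid.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    step = 2.0 * span * fwhm / (n_points - 1)
    axis = [mz - span * fwhm + i * step for i in range(n_points)]
    ys = [intensity * math.exp(-0.5 * ((x - mz) / sigma) ** 2) for x in axis]
    return axis, ys


def weighted_centroid(axis, ys):
    """Collapse a profile to its intensity-weighted mean m/z."""
    return sum(x * y for x, y in zip(axis, ys)) / sum(ys)


def half_height_width(axis, ys):
    """Measure the peak width at half height by linear interpolation."""
    apex = max(range(len(ys)), key=ys.__getitem__)
    half = ys[apex] / 2.0

    # Walk left from the apex to the last point still at or above half height.
    i = apex
    while i > 0 and ys[i - 1] >= half:
        i -= 1
    if i == 0:
        left = axis[0]  # profile is truncated before dropping below half height
    else:
        frac = (half - ys[i - 1]) / (ys[i] - ys[i - 1])
        left = axis[i - 1] + frac * (axis[i] - axis[i - 1])

    # Walk right from the apex in the same way.
    j = apex
    while j < len(ys) - 1 and ys[j + 1] >= half:
        j += 1
    if j == len(ys) - 1:
        right = axis[-1]
    else:
        frac = (ys[j] - half) / (ys[j] - ys[j + 1])
        right = axis[j] + frac * (axis[j + 1] - axis[j])

    return right - left


if __name__ == "__main__":
    fwhm_guess = resolution_fwhm(500.0, 50_000.0)
    axis, ys = gaussian_profile(500.0, 1.0e5, fwhm_guess)
    print("centroid m/z:", weighted_centroid(axis, ys))
    print("measured FWHM:", half_height_width(axis, ys))
```

Running the round trip on a synthetic peak should recover the starting m/z and a half-height width close to the resolution-based guess, with a small deviation caused by the coarse sampling grid and the linear interpolation.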