Yang Chi, Hsiao Yung-Chin, Lee Chi-Ching, Yu Jau-Song
Molecular Medicine Research Center, Chang Gung University, Taoyuan 33302, Taiwan.
Graduate Institute of Biomedical Sciences, College of Medicine, Chang Gung University, Taoyuan 33302, Taiwan.
Anal Chem. 2024 Feb 9;96(7):2849-56. doi: 10.1021/acs.analchem.3c03686.
Targeted mass spectrometry is a powerful technique for quantifying specific proteins or metabolites in complex biological samples. Accurate peak picking is a critical step because the absolute abundance of each analyte is determined by integrating the area under the picked peaks. Although automated software exists for this task, manual intervention is often required to correct errors such as misclassification or mis-picking, which can significantly affect quantification accuracy. It is therefore necessary to develop objective scoring functions that evaluate peak-picking results and identify problematic cases for further inspection. In this study, we present the targeted mass spectrometry quality encoder (TMSQE), a data-driven scoring function that summarizes peak quality at three levels: the transition level, the peak group level, and the level of consistency across samples. Through unsupervised learning from large data sets containing 1,703,827 peak groups, TMSQE establishes a reliable standard for systematic and objective evaluation of chromatographic peak quality in targeted mass spectrometry. TMSQE shows a high degree of consistency with expert judgment and efficiently captures problematic cases that remain after automated peak picking. Furthermore, we demonstrate the generalizability of TMSQE by applying it successfully to various data sets, including both peptide and metabolite data sets. The proposed scoring approach provides a reliable solution for consistent and accurate peak quality evaluation, facilitating peak quality control in targeted mass spectrometry.
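To illustrate why peak picking matters for quantification, the following minimal sketch integrates the area under a picked peak with the trapezoidal rule and shows how misplaced boundaries change the result. It is not the TMSQE method and is not taken from the paper; the function name `peak_area`, the boundary arguments, and the chromatogram values are all hypothetical, and no baseline subtraction is assumed.

```python
from typing import Sequence


def peak_area(times: Sequence[float], intensities: Sequence[float],
              start: float, end: float) -> float:
    """Trapezoidal area under the chromatogram points with start <= t <= end.

    Illustrative helper only (hypothetical name); no baseline correction.
    """
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    # Keep only the points that fall inside the picked peak boundaries.
    points = [(t, y) for t, y in zip(times, intensities) if start <= t <= end]
    area = 0.0
    # Sum the trapezoids formed by each pair of consecutive points.
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area


# Hypothetical transition chromatogram: retention time (min) vs. intensity.
rt = [10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6]
signal = [0.0, 5.0, 40.0, 100.0, 40.0, 5.0, 0.0]

# Boundaries that enclose the whole peak.
correct = peak_area(rt, signal, 10.1, 10.5)
# Mis-picked boundaries that start at the apex and miss the leading half.
mispicked = peak_area(rt, signal, 10.3, 10.6)

print(f"correct boundaries:    {correct:.2f}")
print(f"mis-picked boundaries: {mispicked:.2f}")
```

In this toy case the mis-picked boundaries recover only about half of the area, which is the kind of quantification error a peak quality score is intended to flag for inspection.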