Alignstein: Optimal transport for improved LC-MS retention time alignment.

Affiliation

Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097 Warsaw, Poland.

Publication information

Gigascience. 2022 Nov 3;11. doi: 10.1093/gigascience/giac101.

Abstract

BACKGROUND

Reproducibility of liquid chromatography separation is limited by retention time drift. As a result, measured signals lack correspondence across replicates of liquid chromatography-mass spectrometry (LC-MS) experiments. Correcting these errors is called retention time alignment, and it must be performed before further quantitative analysis. Numerous alignment algorithms are available, but their accuracy is limited (e.g., when retention time drift swaps the elution order of analytes).

RESULTS

We present Alignstein, an algorithm for LC-MS retention time alignment. It correctly finds correspondence even for swapped signals. To achieve this, we implemented a generalization of the Wasserstein distance that compares multidimensional features without reducing the information or dimension of the analyzed data. Moreover, Alignstein by design requires neither a reference sample nor prior signal identification. We validate the algorithm on publicly available benchmark datasets, obtaining competitive results. Finally, we show that it can detect the information contained in the tandem mass spectrum using the spatial properties of chromatograms.
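To make the idea of an optimal transport distance concrete, the following is a minimal sketch of the classical one-dimensional Wasserstein-1 distance between two intensity profiles along the retention time axis. It is not the Alignstein implementation: the abstract describes a generalization to multidimensional features, whereas this toy version assumes a single dimension and normalizes both signals to unit mass. The function name `wasserstein_1d` and the example peaks are hypothetical and chosen only for illustration.

```python
def wasserstein_1d(positions_a, weights_a, positions_b, weights_b):
    """Wasserstein-1 distance between two weighted 1D point sets.

    Assumption for this sketch: both signals are normalized to unit
    total mass, so the distance is the integral of |CDF_a - CDF_b|.
    """
    total_a = sum(weights_a)
    total_b = sum(weights_b)
    # Signal a contributes positive mass, signal b negative mass.
    events = [(p, w / total_a) for p, w in zip(positions_a, weights_a)]
    events += [(p, -w / total_b) for p, w in zip(positions_b, weights_b)]
    events.sort()

    distance = 0.0
    cdf_gap = 0.0  # running value of CDF_a - CDF_b
    for (pos, mass), (next_pos, _) in zip(events, events[1:]):
        cdf_gap += mass
        distance += abs(cdf_gap) * (next_pos - pos)
    return distance


# Hypothetical toy peaks: the same peak shape observed in two runs,
# drifted by 3 retention time units.
rt_run1, intensity_run1 = [10, 11, 12], [1, 2, 1]
rt_run2, intensity_run2 = [13, 14, 15], [1, 2, 1]

shift_cost = wasserstein_1d(rt_run1, intensity_run1, rt_run2, intensity_run2)
same_cost = wasserstein_1d(rt_run1, intensity_run1, rt_run1, intensity_run1)
print(shift_cost, same_cost)
```

For identically shaped peaks, the distance equals the size of the retention time shift, and it is zero for identical signals. This is the property that makes transport-based distances a natural measure of how far a signal has drifted.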

CONCLUSIONS

We show that the use of optimal transport effectively overcomes the limitations of existing algorithms for statistical analysis of mass spectrometry datasets. The algorithm's source code is available at https://github.com/grzsko/Alignstein.
