
Deep graph convolutional network for small-molecule retention time prediction.

Affiliations

School of Engineering, Westlake University, Hangzhou, Zhejiang, 310024, China.

School of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu, 210096, China.

Publication information

J Chromatogr A. 2023 Nov 22;1711:464439. doi: 10.1016/j.chroma.2023.464439. Epub 2023 Oct 13.

Abstract

Retention time (RT) is a crucial source of information in liquid chromatography-mass spectrometry (LCMS). A model that accurately predicts the RT of each molecule would make it possible to filter out candidates with similar spectra but different RTs in LCMS-based molecule identification. Recent research shows that graph neural networks (GNNs) outperform traditional machine learning algorithms in RT prediction. However, all of these models use relatively shallow GNNs. This study investigates, for the first time, how depth affects GNN performance on RT prediction. The results demonstrate that a notable improvement can be achieved by pushing GNN depth to 16 layers through the adoption of residual connections. We also find that the graph convolutional network (GCN) model benefits from edge information. The developed deep graph convolutional network, DeepGCN-RT, significantly outperforms the previous state-of-the-art method and achieves the lowest mean absolute percentage error (MAPE) of 3.3% and the lowest mean absolute error (MAE) of 26.55 s on the SMRT test set. We also fine-tune DeepGCN-RT on seven datasets with various chromatographic conditions. The mean MAE across the seven datasets decreases by 30% compared to the previous state-of-the-art method. On the RIKEN-PlaSMA dataset, we also test the effectiveness of DeepGCN-RT in assisting molecular structure identification. By reducing the number of candidate structures by 30%, DeepGCN-RT improves top-1 accuracy by about 11%.
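To make the two reported error metrics and the idea of RT-based candidate filtering concrete, here is a minimal Python sketch. It is an illustration only, not the authors' code: the function names, the toy values, and the fixed tolerance window are all assumptions, since the abstract does not state how the filtering threshold is chosen.

```python
from typing import List, Sequence, Tuple


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the same unit as the inputs (seconds here)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Assumes no true RT is zero."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def filter_candidates(
    candidates: List[Tuple[str, float]],
    observed_rt: float,
    tolerance_s: float,
) -> List[str]:
    """Keep candidates whose predicted RT lies within tolerance_s of the observed RT.

    candidates holds hypothetical (name, predicted_rt_seconds) pairs;
    the fixed tolerance window is an assumption made for this sketch.
    """
    return [name for name, rt in candidates if abs(rt - observed_rt) <= tolerance_s]


# Toy values, invented for illustration only.
true_rt = [100.0, 200.0]
pred_rt = [110.0, 180.0]
print(mae(true_rt, pred_rt), mape(true_rt, pred_rt))

toy_candidates = [("A", 300.0), ("B", 420.0), ("C", 310.0)]
print(filter_candidates(toy_candidates, observed_rt=305.0, tolerance_s=30.0))
```

In this toy case, candidate "B" is discarded because its predicted RT is far from the observed one, which is the sense in which an accurate RT model can shrink the list of candidate structures before spectral matching.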

