Research Institute of Brewing and Malting, Plc., Lípová 511/15, 120 44, Prague 2, Czech Republic.
Research Institute of Brewing and Malting, Plc., Lípová 511/15, 120 44, Prague 2, Czech Republic; Charles University, Faculty of Science, Department of Analytical Chemistry, Albertov 6, 128 43, Prague 2, Czech Republic.
Anal Chim Acta. 2021 Feb 22;1147:64-71. doi: 10.1016/j.aca.2020.12.043. Epub 2020 Dec 29.
Retention indices are an essential tool for reliable analyte identification in gas chromatographic analyses. Many libraries now provide retention indices for a large number of compounds on distinct stationary phase chemistries. However, the situation is complicated for "unknown unknowns", compounds that are absent from such libraries. The importance of identifying these compounds has risen with the rapidly expanding interest in non-targeted analyses over the last decade. Precise in silico prediction of retention indices from a suggested molecular structure is therefore highly desirable. To this end, a predictive model based on deep learning was developed and is presented in this paper. It is designed for user-friendly and accurate prediction of retention indices of compounds in gas chromatography with a semi-standard non-polar stationary phase. The model takes a Simplified Molecular Input Line Entry System (SMILES) string as input. Its architecture consists of 2D convolutional layers together with batch normalization, max pooling, dropout, and three residual connections. The model reaches a median absolute error of retention index prediction of 16.4 and 16.0 units for the validation and test sets, respectively. The median percentage error is lower than or equal to 0.81% for all of the data sets mentioned. Finally, the DeepReI model is provided as an R package, available at https://github.com/TomasVrzal/DeepReI together with a user-friendly graphical user interface.
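To make the input representation and the reported error metrics concrete, the following is a minimal Python sketch and not the DeepReI implementation, which is distributed as an R package. The abstract does not state how SMILES strings are encoded, so the character-level one-hot matrix here is an assumption; a matrix of this kind is one common way to feed a string to 2D convolutional layers. The function names, the alphabet, the maximum length, and the retention index values are all hypothetical and chosen only for illustration.

```python
from statistics import median


def one_hot_smiles(smiles, alphabet, max_len):
    """Encode a SMILES string as a max_len x len(alphabet) matrix of 0/1.

    Assumption: one character per row. Multi-character atoms such as "Cl"
    would be split into two rows, which a real encoder may handle differently.
    """
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    index = {ch: i for i, ch in enumerate(alphabet)}
    matrix = [[0] * len(alphabet) for _ in range(max_len)]
    for row, ch in enumerate(smiles):
        matrix[row][index[ch]] = 1  # raises KeyError for unknown characters
    return matrix  # rows beyond len(smiles) stay all-zero (padding)


def median_absolute_error(observed, predicted):
    """Median of |observed - predicted|, in retention index units."""
    return median(abs(o - p) for o, p in zip(observed, predicted))


def median_percentage_error(observed, predicted):
    """Median of |observed - predicted| / observed, in percent."""
    return median(100.0 * abs(o - p) / o for o, p in zip(observed, predicted))


# Hypothetical alphabet and molecule (acetic acid), for illustration only.
encoded = one_hot_smiles("CC(=O)O", alphabet="CNO()=1", max_len=10)

# Made-up retention indices, not results from the paper.
observed = [1000.0, 1200.0, 1500.0]
predicted = [1010.0, 1188.0, 1530.0]
print(median_absolute_error(observed, predicted))    # 12.0
print(median_percentage_error(observed, predicted))  # 1.0
```

Both metrics use the median rather than the mean, so a few badly predicted compounds do not dominate the reported figure.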