Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria; Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria.
BASF SE, 67056 Ludwigshafen am Rhein, Germany.
Sci Total Environ. 2023 Oct 15;895:165039. doi: 10.1016/j.scitotenv.2023.165039. Epub 2023 Jun 22.
Today, computational tools for the prediction of the metabolite structures of xenobiotics are widely available and employed in small-molecule research. Reflecting the availability of measured data, these in silico tools are trained and validated primarily on drug metabolism data. In this work, we assessed the capacity of five leading metabolite structure predictors to represent the metabolism of agrochemicals observed in rats. More specifically, we tested the ability of SyGMa, GLORY, GLORYx, BioTransformer 3.0, and MetaTrans to correctly predict and rank the experimentally observed metabolites of a set of 85 parent compounds. We found that the models were able to recover about one-third to two-thirds of the experimentally observed first-generation, second-generation, and third-generation metabolites, confirming their value in applications such as metabolite identification. However, precision was low for all investigated tools and did not exceed approximately 18 % for the pool of first-generation metabolites and 2 % for the pool of compounds representing the first three generations of metabolites. The variance in prediction success rates was high across the individual metabolic maps, meaning that outcomes depend strongly on the specific compound under investigation. We also found that the predictions for individual parent compounds differed strongly between the tools, particularly between those built on orthogonal technologies (e.g., rule-based and end-to-end machine learning approaches). This makes ensemble model strategies promising for improving success rates. Overall, the results of this benchmark study show that there is still considerable room for improving metabolite structure predictors. Our discussion points out several avenues for progress. The bottleneck in method development has been, and for the foreseeable future will remain, the limited quantity and quality of available measured data on small-molecule metabolism.
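The recovery rate (recall) and precision discussed in the abstract, and the reason tools with little overlap in their predictions make ensembles attractive, can be illustrated with a minimal sketch. It assumes each metabolite is represented by a canonical identifier string (for example a canonical SMILES or InChIKey), so that a structure match reduces to set membership. The function names, the union-based ensemble, and the toy identifiers are illustrative assumptions and are not taken from the study.

```python
from typing import Iterable, Set, Tuple


def precision_recall(predicted: Iterable[str], observed: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one parent compound.

    Assumption: both inputs hold canonical structure identifiers, so two
    metabolites are counted as identical when their strings are equal.
    """
    pred, obs = set(predicted), set(observed)
    hits = pred & obs  # predicted metabolites that were also observed
    precision = len(hits) / len(pred) if pred else 0.0
    recall = len(hits) / len(obs) if obs else 0.0
    return precision, recall


def union_ensemble(*prediction_sets: Iterable[str]) -> Set[str]:
    """Merge predictions of several tools by simple set union (one possible ensemble)."""
    merged: Set[str] = set()
    for predictions in prediction_sets:
        merged |= set(predictions)
    return merged


# Toy data with hypothetical identifiers: M* were observed, X* were not.
observed = {"M1", "M2", "M3"}
rule_based_tool = {"M1", "X1", "X2", "X3"}
ml_tool = {"M2", "X4"}

p_rule, r_rule = precision_recall(rule_based_tool, observed)
p_ml, r_ml = precision_recall(ml_tool, observed)
p_ens, r_ens = precision_recall(union_ensemble(rule_based_tool, ml_tool), observed)

print(f"rule-based: precision={p_rule:.2f} recall={r_rule:.2f}")
print(f"ML-based:   precision={p_ml:.2f} recall={r_ml:.2f}")
print(f"ensemble:   precision={p_ens:.2f} recall={r_ens:.2f}")
```

In this toy case each tool recovers a different observed metabolite, so the union recovers more of them than either tool alone. A plain union also accumulates the false positives of every tool, so a practical ensemble would likely need some form of ranking or voting to keep precision from dropping further.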