Molecular AI, Discovery Sciences, R&D, AstraZeneca Gothenburg, Pepparedsleden 1, 431 50 Mölndal, Sweden.
Department of Computer Science and Engineering, Chalmers University of Technology, Chalmersplatsen 4, 412 96 Gothenburg, Sweden.
J Chem Inf Model. 2022 May 9;62(9):2093-2100. doi: 10.1021/acs.jcim.1c00777. Epub 2021 Nov 10.
Here, we explore the impact of different graph traversal algorithms on molecular graph generation. We do this by training a graph-based deep molecular generative model to build structures using a node order determined via either a breadth- or depth-first search algorithm. We observe that a breadth-first traversal leads to better coverage of training data features than a depth-first traversal. We quantify these differences on a data set of natural products using a variety of metrics, including percent validity, molecular coverage, and molecular shape. We also observe that, with either a breadth- or depth-first traversal, it is possible to overtrain the generative models, at which point the results with either graph traversal algorithm are identical.
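To illustrate what "node order determined via a breadth- or depth-first search" means in practice, the following minimal sketch computes both orders for a small toy molecular graph. It is not the model or code used in the study: the adjacency list `ADJ`, the function names `bfs_order` and `dfs_order`, the choice of start atom, and the neighbor tie-breaking rule are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy molecule (heavy atoms only), loosely 2-methylpropan-1-ol:
#   0: CH3, 1: CH, 2: CH2, 3: CH3 (branch), 4: OH
# Bonds: 0-1, 1-2, 1-3, 2-4. Neighbors are listed in ascending index order.
ADJ = {
    0: [1],
    1: [0, 2, 3],
    2: [1, 4],
    3: [1],
    4: [2],
}


def bfs_order(adj, start):
    """Return nodes in breadth-first order: all atoms at distance d from
    the start atom are visited before any atom at distance d + 1."""
    order, seen, queue = [start], {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                order.append(nbr)
                queue.append(nbr)
    return order


def dfs_order(adj, start):
    """Return nodes in depth-first (pre-order) order: each branch is
    followed to its end before the traversal backtracks."""
    order, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push neighbors reversed so the lowest-index neighbor is popped first.
        stack.extend(reversed(adj[node]))
    return order


if __name__ == "__main__":
    print("BFS:", bfs_order(ADJ, 0))  # BFS: [0, 1, 2, 3, 4]
    print("DFS:", dfs_order(ADJ, 0))  # DFS: [0, 1, 2, 4, 3]
```

In this toy case the two orders differ only in when the methyl branch (atom 3) is added: the breadth-first order attaches both neighbors of atom 1 before moving outward, whereas the depth-first order completes the chain to the oxygen first. A generative model trained on one order or the other would therefore see the same molecule as a different sequence of construction steps.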