Domschke Nico, Schmidt Bruno J, Gatter Thomas, Golnik Richard, Eisenhuth Paul, Liessmann Fabian, Meiler Jens, Stadler Peter F
Bioinformatics Group, Department of Computer Science, Leipzig University, Härtelstraße 16-18 , 04107, Leipzig, Germany.
Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103, Leipzig, Germany.
J Cheminform. 2025 Jun 17;17(1):97. doi: 10.1186/s13321-025-00958-w.
Genetic algorithms are a powerful method to solve optimization problems with complex cost functions over vast search spaces that rely in particular on recombining parts of previous solutions. Crossover operators play a crucial role in this context. Here, we describe a large class of these operators designed for searching over spaces of graphs. These operators are based on introducing small cuts into graphs and rejoining the resulting induced subgraphs of two parents. This form of cut-and-join crossover can be restricted in a consistent way to preserve local properties such as vertex-degrees (valency), or bond-orders, as well as global properties such as graph-theoretic planarity. In contrast to crossover on strings, cut-and-join crossover on graphs is powerful enough to ergodically explore chemical space even in the absence of mutation operators. Extensive benchmarking shows that the offspring of molecular graphs are again plausible molecules with high probability, while at the same time crossover drastically increases the diversity compared to initial molecule libraries. Moreover, desirable properties such as favorable indices of synthesizability are preserved with sufficient frequency that candidate offsprings can be filtered efficiently for such properties. As an application we utilized the cut-and-join crossover in REvoLd, a GA-based system for computer-aided drug design. In optimization runs searching for ligands binding to four different target proteins we consistently found candidate molecules with binding constants exceeding the best known binders as well as candidates found in make-on-demand libraries.Scientific contributionWe define cut-and-join crossover operators on a variety of graph classes including molecular graphs. This constitutes a mathematically simple and well-characterized approach to recombination of molecules that performed very well in real-life CADD tasks.
遗传算法是一种强大的方法,用于解决在广阔搜索空间中具有复杂代价函数的优化问题,该方法特别依赖于重新组合先前解决方案的各个部分。在这种情况下,交叉算子起着至关重要的作用。在这里,我们描述了一大类为在图空间中搜索而设计的此类算子。这些算子基于在图中引入小的割,并重新连接两个亲本产生的诱导子图。这种割接交叉形式可以以一致的方式进行限制,以保留局部属性,如顶点度(价)或键序,以及全局属性,如图论平面性。与字符串上的交叉不同,图上的割接交叉即使在没有变异算子的情况下也足以遍历性地探索化学空间。广泛的基准测试表明,分子图的后代很可能再次是合理可行的分子,同时与初始分子库相比,交叉极大地增加了多样性。此外,诸如良好的可合成性指标等理想属性被足够频繁地保留下来,以至于可以有效地筛选候选后代的此类属性。作为一个应用,我们在REvoLd中使用了割接交叉,REvoLd是一个基于遗传算法的计算机辅助药物设计系统。在寻找与四种不同靶蛋白结合的配体的优化运行中,我们一致发现结合常数超过最知名结合剂的候选分子,以及在按需生成库中发现的候选分子。
科学贡献
我们在包括分子图在内的各种图类上定义了割接交叉算子。这构成了一种数学上简单且特征明确的分子重组方法,在实际的计算机辅助药物设计任务中表现非常出色。