Department of Pharmaceutical Chemistry, Philipps-University, Marbacher Weg 6, 35032 Marburg, Germany.
J Chem Inf Model. 2010 Sep 27;50(9):1644-59. doi: 10.1021/ci9003305.
In combinatorial chemistry, molecules are assembled according to combinatorial principles, either by linking suitable reagents or by decorating a given scaffold with appropriate substituents drawn from a large chemical space of starting materials. The number of possible combinations often greatly exceeds what can be handled by an in-depth in silico approach, and exceeds even further what can be synthesized experimentally. Therefore, powerful tools to efficiently enumerate large chemical spaces are required. Genetic algorithms, which mimic Darwinian evolution, can provide such tools. GARLig (genetic algorithm using reagents to compose ligands) has been developed to perform subset selection in large chemical compound spaces subject to target-specific 3D-scoring criteria. GARLig uses different scoring schemes, such as AutoDock4 Score, GOLDScore, and DrugScore(CSD), as fitness functions. Its genetic parameters have been optimized to characterize combinatorial libraries with respect to binding to various targets of pharmaceutical interest. A large tripeptide library of 20³ (8,000) members was used to profile amino acid frequencies in putative substrates for trypsin, thrombin, factor Xa, and plasmin. A peptidomimetic scaffold assembled from a selection of 25³ (15,625) building-block combinations was used to test the performance of the evolutionary algorithm in suggesting potent inhibitors of the enzyme cathepsin D. In a final case study, the program was used to characterize and rank a combinatorial drug-like library comprising 33,750 potential thrombin inhibitors. These case studies demonstrate that GARLig finds experimentally confirmed potent leads while processing a subset significantly smaller than the fully enumerated combinatorial library. Furthermore, the amino acid profiles computed by the genetic algorithm match the amino acid frequencies observed when peptide libraries are screened in substrate cleavage assays.
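The subset-selection idea described in the abstract can be illustrated with a minimal genetic-algorithm sketch. This is not GARLig's implementation: the chromosome encoding (one reagent index per variable position), the operators (tournament selection, uniform crossover, point mutation, elitism), all parameter values, and the function names `run_ga` and `toy_score` are assumptions chosen for illustration. In particular, `toy_score` is a hypothetical stand-in for a docking-based fitness function such as AutoDock4 Score, GOLDScore, or DrugScore(CSD).

```python
import random

def toy_score(genome):
    """Hypothetical stand-in for a docking score (higher is better).
    Rewards reagent indices close to an arbitrary 'ideal' per position."""
    ideal = (3, 17, 8)
    return -sum(abs(g - i) for g, i in zip(genome, ideal))

def run_ga(sizes, fitness, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Search a combinatorial library without scoring every member.

    sizes: number of available reagents at each variable position.
    Returns (best_genome, best_fitness, number_of_distinct_genomes_scored).
    """
    rng = random.Random(seed)
    cache = {}  # each distinct genome is scored only once

    def score(genome):
        if genome not in cache:
            cache[genome] = fitness(genome)
        return cache[genome]

    def random_genome():
        return tuple(rng.randrange(n) for n in sizes)

    def tournament(pop, k=3):
        # Pick k random individuals and keep the fittest one.
        return max(rng.sample(pop, k), key=score)

    pop = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(pop, key=score)
        children = [elite]  # elitism: the current best always survives
        while len(children) < pop_size:
            a, b = tournament(pop), tournament(pop)
            # Uniform crossover: each position comes from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Point mutation: occasionally swap in a random reagent.
            for pos, n in enumerate(sizes):
                if rng.random() < mutation_rate:
                    child[pos] = rng.randrange(n)
            children.append(tuple(child))
        pop = children
    best = max(pop, key=score)
    return best, score(best), len(cache)

if __name__ == "__main__":
    sizes = (20, 20, 20)  # e.g. a tripeptide library of 20^3 members
    best, best_fit, n_scored = run_ga(sizes, toy_score)
    print(best, best_fit, f"scored {n_scored} of {20**3} library members")
```

With these settings the sketch scores at most 30 × 41 = 1,230 distinct genomes, a fraction of the 8,000 members of a 20³ library. In a real setting each call to the fitness function would be an expensive docking run, which is why caching scores and limiting the number of evaluations matters.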