Kreutter David, Reymond Jean-Louis
Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
Chem Sci. 2024 Oct 8;15(43):18031-47. doi: 10.1039/d4sc02408g.
Integrating enzymatic reactions into computer-aided synthesis planning (CASP) should help devise more selective, economical, and greener synthetic routes. Herein we report the triple-transformer loop algorithm with biocatalysis (TTLAB) as a new CASP tool for chemoenzymatic multistep retrosynthesis. Single-step retrosyntheses are performed using two triple-transformer loops (TTLs): one trained with chemical reactions from the US Patent Office (USPTO-TTL), and a second obtained by multitask transfer learning that combines the USPTO dataset with preparative biotransformations from the literature (ENZR-TTL). Each TTL performs single-step retrosynthesis independently by tagging potential reactive sites in the product, predicting possible starting materials (T1) and reagents or enzymes (T2) for each site, and validating the predictions with a forward transformer (T3). TTLAB combines the predictions from both TTLs to explore multistep sequences using a heuristic best-first tree search, and proposes short routes from commercial building blocks, including routes with enantioselective biocatalytic steps. TTLAB can be used to assist chemoenzymatic route design.
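The search strategy described above can be illustrated with a minimal sketch. It is not the TTLAB implementation: the two single-step predictors are hypothetical lookup tables standing in for USPTO-TTL and ENZR-TTL, the molecule and reagent names are placeholders, each confidence is assumed to already reflect the forward validation (T3), and the heuristic (the product of step confidences) is an assumption made for illustration only.

```python
import heapq
from typing import Callable, Dict, List, Optional, Set, Tuple

# One single-step prediction: (precursors, reagent or enzyme, confidence in (0, 1]).
Step = Tuple[Tuple[str, ...], str, float]
Predictor = Callable[[str], List[Step]]


def best_first_routes(
    target: str,
    predictors: Dict[str, Predictor],
    building_blocks: Set[str],
    max_steps: int = 3,
    max_iterations: int = 1000,
) -> Optional[Tuple[list, float]]:
    """Return the first route whose open molecules are all building blocks."""
    counter = 0  # tie-breaker so the heap never has to compare routes
    # State: (negative route score, tie-breaker, unsolved molecules, steps so far)
    heap = [(-1.0, counter, (target,), ())]
    for _ in range(max_iterations):
        if not heap:
            break
        neg_score, _, open_mols, steps = heapq.heappop(heap)
        # Molecules that can be purchased need no further disconnection.
        open_mols = tuple(m for m in open_mols if m not in building_blocks)
        if not open_mols:
            return list(steps), -neg_score
        if len(steps) >= max_steps:
            continue
        mol, rest = open_mols[0], open_mols[1:]
        # Pool the predictions of every single-step model for this molecule.
        for model_name, predict in predictors.items():
            for precursors, catalyst, conf in predict(mol):
                counter += 1
                step = (mol, precursors, catalyst, model_name)
                heapq.heappush(
                    heap,
                    (neg_score * conf, counter, rest + precursors, steps + (step,)),
                )
    return None


# Hypothetical lookup tables standing in for the two single-step models.
chem_table = {
    "T": [(("I1",), "reagent_a", 0.6)],
    "I1": [(("B1",), "reagent_c", 0.5)],
    "I2": [(("B1", "B2"), "reagent_b", 0.8)],
}
enz_table = {"T": [(("I2",), "enzyme_x", 0.9)]}

predictors = {
    "USPTO": lambda mol: chem_table.get(mol, []),
    "ENZR": lambda mol: enz_table.get(mol, []),
}

result = best_first_routes("T", predictors, building_blocks={"B1", "B2"})
route, score = result
for product, precursors, catalyst, model_name in route:
    print(f"{product} <= {' + '.join(precursors)} [{catalyst}, {model_name}]")
```

In this toy example the enzymatic disconnection of the target has the higher confidence, so the search expands it first and returns a two-step route that mixes one enzymatic and one chemical step. A priority queue keyed on the cumulative score is one simple way to realise a best-first search; a real planner would add route-length and availability terms to the heuristic.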