State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, West China Medical School, Sichuan University, Sichuan 610041, China.
J Chem Inf Model. 2011 Oct 24;51(10):2768-77. doi: 10.1021/ci100216g. Epub 2011 Sep 21.
This account presents RASA (Retrosynthesis-based Assessment of Synthetic Accessibility), a rapid retrosynthesis-based scoring method for assessing the synthetic accessibility of drug-like molecules. RASA first constructs a synthesis tree for the target molecule by retrosynthetic analysis; during this process, a series of strategies is applied to limit combinatorial explosion of the tree. A scoring function (RASA-score) is then proposed, based on three components: the alternative effective synthetic routes, the complexity of the reactions, and the difficulty of separation/purification associated with the most favorable synthetic route. The contributions of the individual components are calibrated by linear regression against synthetic accessibility estimates for a training set of 100 compounds, given by a group of medicinal chemists (G1). Two external test sets were used to evaluate RASA: TS1, with synthetic accessibility estimates given by the G1 medicinal chemists, and TS2, with estimates given by another group of medicinal chemists (G2, from the literature). The correlation coefficient between the calculated RASA-score values and the scores estimated by medicinal chemists is 0.807 for TS1 and 0.792 for TS2, which supports the validity and reliability of RASA. This validity and reliability, together with the high speed of RASA and its ability to suggest synthetic routes, make it a useful tool in drug discovery.
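The calibration and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the functional form of the RASA-score, so the three numeric descriptors (`route_count`, `reaction_complexity`, `purification_difficulty`), the helper names, and all numbers below are hypothetical. The sketch only shows the general technique named in the abstract, namely fitting component weights by ordinary least squares and comparing calculated scores with chemist estimates through a Pearson correlation coefficient.

```python
from math import sqrt


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(features, targets):
    """Ordinary least squares with an intercept, via the normal equations.

    Returns [intercept, w1, w2, ...].
    """
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, feature):
    """Linear score: intercept plus weighted sum of the descriptors."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], feature))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical descriptors per compound:
# (route_count, reaction_complexity, purification_difficulty)
TOY_FEATURES = [
    (1, 0.2, 0.1),
    (2, 0.5, 0.3),
    (3, 0.1, 0.4),
    (1, 0.8, 0.6),
    (4, 0.3, 0.2),
    (2, 0.9, 0.9),
]
# Made-up weights, used only to generate toy "chemist estimates" so that
# the fit can be checked; they are not values from the paper.
TOY_TRUE_WEIGHTS = [1.0, -0.5, 2.0, 1.5]
TOY_TARGETS = [predict(TOY_TRUE_WEIGHTS, f) for f in TOY_FEATURES]

if __name__ == "__main__":
    weights = fit_weights(TOY_FEATURES, TOY_TARGETS)
    scores = [predict(weights, f) for f in TOY_FEATURES]
    print("fitted weights:", [round(w, 3) for w in weights])
    print("Pearson r:", round(pearson_r(scores, TOY_TARGETS), 3))
```

Because the toy targets are generated from an exact linear rule, the fit recovers the made-up weights and the correlation is essentially 1. With real chemist estimates the agreement would be lower, as the reported values of 0.807 and 0.792 on external test sets indicate.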