Assessing synthetic accessibility of chemical compounds using machine learning methods.

Affiliation

Department of Computer Science and Computer Engineering, University of Minnesota, Minneapolis, Minnesota 55455, USA.

Publication information

J Chem Inf Model. 2010 Jun 28;50(6):979-91. doi: 10.1021/ci900301v.

DOI: 10.1021/ci900301v
PMID: 20536191
Abstract

With de novo rational drug design, scientists can rapidly generate a very large number of potentially biologically active probes. However, many of them may be synthetically infeasible and, therefore, of limited value to drug developers. On the other hand, most of the tools for synthetic accessibility evaluation are very slow and can process only a few molecules per minute. In this study, we present two approaches to quickly predict the synthetic accessibility of chemical compounds by utilizing support vector machines operating on molecular descriptors. The first approach, RSsvm, is designed to identify the compounds that can be synthesized using a specific set of reactions and starting materials and builds its model by training on the compounds identified as synthetically accessible or not by retrosynthetic analysis. The second approach, DRsvm, is designed to provide a more general assessment of synthetic accessibility that is not tied to any set of reactions or starting materials. The training set compounds for this approach are selected from a diverse library based on the number of other similar compounds within the same library. Both approaches have been shown to perform very well in their corresponding areas of applicability with the RSsvm achieving a receiver operator characteristic score of 0.952 in cross-validation experiments and the DRsvm achieving a score of 0.888 on an independent set of compounds. Our implementations can successfully process thousands of compounds per minute.
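The abstract reports model quality as a receiver operating characteristic (ROC) score. As a minimal sketch of how such a score can be computed from classifier outputs, the snippet below uses the pairwise formulation: the fraction of (accessible, inaccessible) compound pairs that the classifier ranks in the right order. The function name `roc_auc`, the labels, and the scores are illustrative assumptions; the abstract does not describe the authors' exact evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: iterable of 1 (positive, e.g. synthetically accessible) or 0.
    scores: classifier outputs; a higher score means "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical decision values for six compounds (made up for illustration,
# not data from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.15, 0.4, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc:.3f}")  # 7 of 9 pairs are ordered correctly
```

A score of 1.0 means every accessible compound outranks every inaccessible one, and 0.5 corresponds to random ranking. The quadratic pair loop is chosen for clarity; a rank-based implementation would be preferable for large compound sets.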

Similar articles

1
Assessing synthetic accessibility of chemical compounds using machine learning methods.
J Chem Inf Model. 2010 Jun 28;50(6):979-91. doi: 10.1021/ci900301v.
2
Heteroaromatic rings of the future.
J Med Chem. 2009 May 14;52(9):2952-63. doi: 10.1021/jm801513z.
3
LEAP into the Pfizer Global Virtual Library (PGVL) space: creation of readily synthesizable design ideas automatically.
Methods Mol Biol. 2011;685:253-76. doi: 10.1007/978-1-60761-931-4_13.
4
Kinase inhibitor data modeling and de novo inhibitor design with fragment approaches.
J Med Chem. 2009 Oct 22;52(20):6456-66. doi: 10.1021/jm901147e.
5
Identification of small molecule aggregators from large compound libraries by support vector machines.
J Comput Chem. 2010 Mar;31(4):752-63. doi: 10.1002/jcc.21347.
6
Classifying 'drug-likeness' with kernel-based learning methods.
J Chem Inf Model. 2005 Mar-Apr;45(2):249-53. doi: 10.1021/ci049737o.
7
Predicting human liver microsomal stability with machine learning techniques.
J Mol Graph Model. 2008 Feb;26(6):907-15. doi: 10.1016/j.jmgm.2007.06.005. Epub 2007 Jun 27.
8
Is chemical synthetic accessibility computationally predictable for drug and lead-like molecules? A comparative assessment between medicinal and computational chemists.
Eur J Med Chem. 2012 Aug;54:679-89. doi: 10.1016/j.ejmech.2012.06.024. Epub 2012 Jun 21.
9
New directions in library design and analysis.
Curr Opin Chem Biol. 2008 Jun;12(3):372-8. doi: 10.1016/j.cbpa.2008.02.015. Epub 2008 Apr 18.
10
Machine learning models for lipophilicity and their domain of applicability.
Mol Pharm. 2007 Jul-Aug;4(4):524-38. doi: 10.1021/mp0700413. Epub 2007 Jul 19.

Cited by

1
ML-based Models as a Strategy to Discover Novel Antiepileptic Drugs Targeting Sodium Receptor Channel.
Curr Top Med Chem. 2025;25(2):209-227. doi: 10.2174/0115680266331755241008061915.
2
The Histone Deacetylase Family: Structural Features and Application of Combined Computational Methods.
Pharmaceuticals (Basel). 2024 May 10;17(5):620. doi: 10.3390/ph17050620.
3
Machine learning for a sustainable energy future.
Nat Rev Mater. 2023;8(3):202-215. doi: 10.1038/s41578-022-00490-5. Epub 2022 Oct 18.
4
An automatic pipeline for the design of irreversible derivatives identifies a potent SARS-CoV-2 Mpro inhibitor.
Cell Chem Biol. 2021 Dec 16;28(12):1795-1806.e5. doi: 10.1016/j.chembiol.2021.05.018. Epub 2021 Jun 25.
5
SAVI, in silico generation of billions of easily synthesizable compounds through expert-system type rules.
Sci Data. 2020 Nov 11;7(1):384. doi: 10.1038/s41597-020-00727-4.
6
Strategies to Support Fragment-to-Lead Optimization in Drug Discovery.
Front Chem. 2020 Feb 18;8:93. doi: 10.3389/fchem.2020.00093. eCollection 2020.
7
Nonpher: computational method for design of hard-to-synthesize structures.
J Cheminform. 2017 Mar 20;9(1):20. doi: 10.1186/s13321-017-0206-2.
8
Neural Networks for the Prediction of Organic Chemistry Reactions.
ACS Cent Sci. 2016 Oct 26;2(10):725-732. doi: 10.1021/acscentsci.6b00219. Epub 2016 Oct 14.
9
Miscellaneous Topics in Computer-Aided Drug Design: Synthetic Accessibility and GPU Computing, and Other Topics.
Curr Pharm Des. 2016;22(23):3555-68. doi: 10.2174/1381612822666160414142547.
10
SCRIPDB: a portal for easy access to syntheses, chemicals and reactions in patents.
Nucleic Acids Res. 2012 Jan;40(Database issue):D428-33. doi: 10.1093/nar/gkr919. Epub 2011 Nov 8.