Hit Dexter: A Machine-Learning Model for the Prediction of Frequent Hitters.

Affiliations

Center for Bioinformatics, Universität Hamburg, Bundesstraße 43, 20146, Hamburg, Germany.

National Infrastructure for Chemical Biology, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, 166 28, Prague 6, Czech Republic.

Publication information

ChemMedChem. 2018 Mar 20;13(6):564-571. doi: 10.1002/cmdc.201700673. Epub 2018 Feb 1.

DOI:10.1002/cmdc.201700673

PMID:29285887

Abstract

False-positive assay readouts caused by badly behaving compounds, including frequent hitters, pan-assay interference compounds (PAINS), aggregators, and others, continue to pose a major challenge to experimental screening. There are only a few in silico methods that allow the prediction of such problematic compounds. We report the development of Hit Dexter, two extremely randomized trees classifiers for the prediction of compounds likely to trigger positive assay readouts either by true promiscuity or by assay interference. The models were trained on a well-prepared dataset extracted from the PubChem Bioassay database, consisting of approximately 311 000 compounds tested for activity on at least 50 proteins. Hit Dexter reached MCC and AUC values of up to 0.67 and 0.96 on an independent test set, respectively. The models are expected to be of high value, in particular to medicinal chemists and biochemists who can use Hit Dexter to identify compounds for which extra caution should be exercised with positive assay readouts. Hit Dexter is available as a free web service at http://hitdexter.zbh.uni-hamburg.de.
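
For readers unfamiliar with the MCC (Matthews correlation coefficient) reported in the abstract, the following is a minimal sketch of how that metric is computed from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, and this is not the authors' evaluation code.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    tp/tn/fp/fn are the counts of true positives, true negatives,
    false positives, and false negatives.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal total is zero,
    # because the coefficient is undefined in that case.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, invented for illustration (not from the paper).
mcc = matthews_corrcoef(tp=80, tn=900, fp=20, fn=40)
print(f"MCC = {mcc:.2f}")
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction). Because it uses all four cells of the confusion matrix, it is often preferred over accuracy when the classes are imbalanced.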

Similar articles

Hit Dexter: A Machine-Learning Model for the Prediction of Frequent Hitters.

ChemMedChem. 2018 Mar 20;13(6):564-571. doi: 10.1002/cmdc.201700673. Epub 2018 Feb 1.

Hit Dexter 2.0: Machine-Learning Models for the Prediction of Frequent Hitters.

J Chem Inf Model. 2019 Mar 25;59(3):1030-1043. doi: 10.1021/acs.jcim.8b00677. Epub 2019 Jan 25.

PAIN(S) relievers for medicinal chemists: how computational methods can assist in hit evaluation.

Future Med Chem. 2018 Jul 1;10(13):1533-1535. doi: 10.4155/fmc-2018-0116. Epub 2018 Jun 29.

Modeling Small-Molecule Reactivity Identifies Promiscuous Bioactive Compounds.

J Chem Inf Model. 2018 Aug 27;58(8):1483-1500. doi: 10.1021/acs.jcim.8b00104. Epub 2018 Jul 23.

Identification of Compounds That Interfere with High-Throughput Screening Assay Technologies.

ChemMedChem. 2019 Oct 17;14(20):1795-1802. doi: 10.1002/cmdc.201900395. Epub 2019 Sep 19.

Nuisance Compounds, PAINS Filters, and Dark Chemical Matter in the GSK HTS Collection.

SLAS Discov. 2018 Jul;23(6):532-545. doi: 10.1177/2472555218768497. Epub 2018 Apr 26.

Benchmarking the mechanisms of frequent hitters: limitation of PAINS alerts.

Drug Discov Today. 2021 Jun;26(6):1353-1358. doi: 10.1016/j.drudis.2021.02.003. Epub 2021 Feb 10.

ChemFH: an integrated tool for screening frequent false positives in chemical biology and drug discovery.

Nucleic Acids Res. 2024 Jul 5;52(W1):W439-W449. doi: 10.1093/nar/gkae424.

New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays.

J Med Chem. 2010 Apr 8;53(7):2719-40. doi: 10.1021/jm901137j.

Identification of small molecule aggregators from large compound libraries by support vector machines.

J Comput Chem. 2010 Mar;31(4):752-63. doi: 10.1002/jcc.21347.

Cited by

Exploring protein inhibitors of through pharmacoinformatic approaches incorporating solubility-enhancing formulation insights.

Front Pharmacol. 2025 Aug 14;16:1630038. doi: 10.3389/fphar.2025.1630038. eCollection 2025.

Integrated Virtual Screening Approach Identifies New CYP19A1 Inhibitors.

J Chem Inf Model. 2025 Apr 14;65(7):3529-3543. doi: 10.1021/acs.jcim.5c00204. Epub 2025 Mar 19.

Integrating natural product research laboratory with artificial intelligence: Advancements and breakthroughs in traditional medicine.

Biomedicine (Taipei). 2024 Dec 1;14(4):1-14. doi: 10.37796/2211-8039.1475. eCollection 2024.

AI in Clinical Trials and Drug Development: Challenges and Potential Advancements.

Curr Drug Discov Technol. 2024 Oct 28. doi: 10.2174/0115701638314252241016165345.

Machine Learning Assisted Hit Prioritization for High Throughput Screening in Drug Discovery.

ACS Cent Sci. 2024 Mar 15;10(4):823-832. doi: 10.1021/acscentsci.3c01517. eCollection 2024 Apr 24.

Tackling assay interference associated with small molecules.

Nat Rev Chem. 2024 May;8(5):319-339. doi: 10.1038/s41570-024-00593-3. Epub 2024 Apr 15.

Molecular dynamics simulations as a guide for modulating small molecule aggregation.

J Comput Aided Mol Des. 2024 Mar 12;38(1):11. doi: 10.1007/s10822-024-00557-1.

Elucidating the Potential Inhibitor against Type 2 Diabetes Mellitus Associated Gene of GLUT4.

J Pers Med. 2023 Apr 12;13(4):660. doi: 10.3390/jpm13040660.

PubChem 2023 update.

Nucleic Acids Res. 2023 Jan 6;51(D1):D1373-D1380. doi: 10.1093/nar/gkac956.

Gains from no real PAINS: Where 'Fair Trial Strategy' stands in the development of multi-target ligands.

Acta Pharm Sin B. 2021 Nov;11(11):3417-3432. doi: 10.1016/j.apsb.2021.02.023. Epub 2021 Mar 4.
