Hit Dexter: A Machine-Learning Model for the Prediction of Frequent Hitters.

Affiliations

Center for Bioinformatics, Universität Hamburg, Bundesstraße 43, 20146, Hamburg, Germany.

National Infrastructure for Chemical Biology, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, 166 28, Prague 6, Czech Republic.

Publication information

ChemMedChem. 2018 Mar 20;13(6):564-571. doi: 10.1002/cmdc.201700673. Epub 2018 Feb 1.

Abstract

False-positive assay readouts caused by badly behaving compounds, such as frequent hitters, pan-assay interference compounds (PAINS), and aggregators, continue to pose a major challenge to experimental screening. Only a few in silico methods can predict such problematic compounds. We report the development of Hit Dexter, a pair of extremely randomized trees classifiers for predicting compounds likely to trigger positive assay readouts, either through true promiscuity or through assay interference. The models were trained on a well-prepared dataset extracted from the PubChem Bioassay database, consisting of approximately 311,000 compounds tested for activity on at least 50 proteins. On an independent test set, Hit Dexter reached Matthews correlation coefficient (MCC) and area under the ROC curve (AUC) values of up to 0.67 and 0.96, respectively. The models are expected to be of high value, in particular to medicinal chemists and biochemists, who can use Hit Dexter to identify compounds whose positive assay readouts warrant extra caution. Hit Dexter is available as a free web service at http://hitdexter.zbh.uni-hamburg.de.
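The MCC and AUC figures quoted in the abstract are standard metrics for binary classifiers. The sketch below shows how each is computed from a set of predictions. It is not the authors' evaluation code: the function names, the labels, the scores, and the 0.5 decision threshold are all invented for illustration and have no connection to the Hit Dexter data.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels (1 = flagged as frequent hitter, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positives, 5 negatives, with made-up scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

mcc = matthews_corrcoef(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"MCC = {mcc:.3f}, AUC = {auc:.3f}")
```

MCC depends on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds. That is why the two values can differ substantially for the same model.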
