Hit Dexter: A Machine-Learning Model for the Prediction of Frequent Hitters.

Author information

Center for Bioinformatics, Universität Hamburg, Bundesstraße 43, 20146, Hamburg, Germany.

National Infrastructure for Chemical Biology, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, 166 28, Prague 6, Czech Republic.

Publication information

ChemMedChem. 2018 Mar 20;13(6):564-571. doi: 10.1002/cmdc.201700673. Epub 2018 Feb 1.

Abstract

False-positive assay readouts caused by badly behaving compounds, such as frequent hitters, pan-assay interference compounds (PAINS), aggregators, and others, continue to pose a major challenge to experimental screening. There are only a few in silico methods that allow the prediction of such problematic compounds. We report the development of Hit Dexter, two extremely randomized trees classifiers for the prediction of compounds likely to trigger positive assay readouts either by true promiscuity or by assay interference. The models were trained on a well-prepared dataset extracted from the PubChem Bioassay database, consisting of approximately 311 000 compounds tested for activity on at least 50 proteins. Hit Dexter reached Matthews correlation coefficient (MCC) and area under the ROC curve (AUC) values of up to 0.67 and 0.96, respectively, on an independent test set. The models are expected to be of high value, in particular to medicinal chemists and biochemists, who can use Hit Dexter to identify compounds for which extra caution should be exercised with positive assay readouts. Hit Dexter is available as a free web service at http://hitdexter.zbh.uni-hamburg.de.
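For readers less familiar with the two metrics quoted in the abstract, the following is a minimal Python sketch of how MCC and AUC can be computed for a binary frequent-hitter classifier. It is not the authors' code. The function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical and chosen only for illustration.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which
    the positive is scored higher; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = frequent hitter, 0 = non-frequent hitter.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]  # assumed model outputs
predicted = [1 if s >= 0.5 else 0 for s in scores]   # assumed 0.5 threshold

print(f"MCC = {mcc(labels, predicted):.3f}")
print(f"AUC = {auc(labels, scores):.3f}")
```

MCC is computed from thresholded predictions and so depends on the chosen cutoff, whereas AUC is computed from the raw scores and summarizes ranking quality across all cutoffs. This is one reason the two values reported for the same model can differ substantially.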
