Developing and validating predictive decision tree models from mining chemical structural fingerprints and high-throughput screening data in PubChem.

Author information

Han Lianyi, Wang Yanli, Bryant Stephen H

Affiliation

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Publication information

BMC Bioinformatics. 2008 Sep 25;9:401. doi: 10.1186/1471-2105-9-401.

Abstract

BACKGROUND

Recent advances in high-throughput screening (HTS) techniques and readily available compound libraries, generated using combinatorial chemistry or derived from natural products, enable the testing of millions of compounds in a matter of days. Because of the amount of information HTS assays produce, mining HTS data for results of potential interest to drug development research is a very challenging task. Computational approaches to analyzing HTS results face great challenges from both the large quantity of information and the significant amount of erroneous data produced.

RESULTS

In this study, decision tree (DT) models were developed to discriminate compound bioactivities using the chemical structure fingerprints provided by the PubChem system (http://pubchem.ncbi.nlm.nih.gov). The DT models were examined for filtering biological activity data contained in four assays deposited in the PubChem BioAssay database, including assays for 5HT1a agonists, 5HT1a antagonists, and HIV-1 RT-RNase H inhibitors. The 10-fold cross-validation (CV) sensitivity, specificity, and Matthews correlation coefficient (MCC) of the models were 57.2% to 80.5%, 97.3% to 99.0%, and 0.4 to 0.5, respectively. A further evaluation was performed on DT models built for two independent bioassays in which inhibitors of the same HIV RNase target were screened using different compound libraries; this experiment yielded enrichment factors of 4.4 and 9.7.
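As an illustration of the metrics reported above, the following minimal Python sketch computes sensitivity, specificity, MCC, and an enrichment factor from binary activity labels. It is not the authors' code: the function names are hypothetical, the labels are toy data, and the enrichment factor uses one common definition (hit rate among predicted actives divided by the hit rate in the whole library), which is an assumption because the abstract does not state the formula used.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Fraction of true actives that the model predicts as active."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true inactives that the model predicts as inactive."""
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def enrichment_factor(tp, fp, n_actives, n_total):
    """Assumed definition: hit rate among predicted actives / library hit rate."""
    selected = tp + fp
    return (tp / selected) / (n_actives / n_total)


# Toy labels (hypothetical, not PubChem data): 4 actives among 10 compounds.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
toy_metrics = {
    "sensitivity": sensitivity(tp, fn),
    "specificity": specificity(tn, fp),
    "mcc": mcc(tp, tn, fp, fn),
    "enrichment_factor": enrichment_factor(tp, fp, sum(y_true), len(y_true)),
}
```

In a 10-fold cross-validation setting, these functions would typically be applied to the held-out predictions of each fold (or to the pooled held-out predictions), which is one plausible reading of how the ranges above arise.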

CONCLUSION

Our results suggest that the designed DT models can be used as a virtual screening technique as well as a complement to traditional approaches for hit selection.
