National Center for Biotechnology Information, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA.
Bioinformatics. 2010 Nov 15;26(22):2881-8. doi: 10.1093/bioinformatics/btq550. Epub 2010 Oct 13.
MOTIVATION: Because the NCI-60 dataset is intrinsically cell-based, most previous data mining studies built on it offer little insight into the molecular targets of the screened compounds. The abundant compound-target association data in PubChem, by contrast, provide extensive experimental evidence of molecular targets for tested compounds. By taking advantage of both public repositories, one can therefore investigate how the bioactivity profiles of small molecules in the NCI-60 dataset (cellular level) correlate with their patterns of interaction with relevant protein targets in PubChem (molecular level).
RESULTS: We investigated a set of 37 small molecules by linking their bioactivity profiles, protein targets and chemical structures. Compounds were hierarchically clustered on the basis of their bioactivity profiles. We found that compounds clustered into groups with similar modes of action, which correlated strongly with their chemical structures. Furthermore, compounds with similar bioactivity profiles also shared similar patterns of interaction with relevant protein targets, especially when their chemical structures were related. This work presents a new strategy for combining and mining the NCI-60 dataset and PubChem. The analysis shows that bioactivity profile comparison can provide insight into modes of action at the molecular level and can thus facilitate the knowledge-based discovery of novel compounds with desired pharmacological properties.
AVAILABILITY: The bioactivity profiling data and the target annotation information are publicly available in the PubChem BioAssay database (ftp://ftp.ncbi.nlm.nih.gov/pubchem/Bioassay/).
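The hierarchical clustering of compounds by bioactivity profile mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance measure or linkage rule, so the choices here (distance = 1 minus the Pearson correlation, average linkage) are assumptions rather than the authors' method. The compound names and activity values are made up for illustration, and `pearson` and `cluster_profiles` are hypothetical helper names.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length activity profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_profiles(profiles, n_clusters):
    """Average-linkage agglomerative clustering with distance 1 - Pearson r.

    `profiles` maps a compound name to its activity values across cell lines.
    Returns a sorted list of clusters, each a sorted list of compound names.
    (Assumed metric and linkage; the abstract does not specify either.)
    """
    names = sorted(profiles)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Made-up activity values across five hypothetical cell lines.
toy_profiles = {
    "cmpd_A": [6.1, 5.2, 7.0, 4.8, 6.5],
    "cmpd_B": [6.0, 5.0, 7.2, 4.9, 6.4],
    "cmpd_C": [4.5, 6.8, 4.9, 7.1, 5.0],
    "cmpd_D": [4.7, 6.9, 5.1, 7.0, 4.8],
}

print(cluster_profiles(toy_profiles, n_clusters=2))
```

In this toy input, compounds whose activity rises and falls across the same cell lines end up in the same cluster. Correlation-based distance is used so that the shape of a profile, rather than its absolute potency, drives the grouping.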