A compound-target pairs dataset: differences between drugs, clinical candidates and other bioactive compounds.

Affiliations

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom.

Paul Leeson Consulting Ltd, Nuneaton, Warwickshire, CV13 6LZ, United Kingdom.

Publication information

Sci Data. 2024 Oct 21;11(1):1160. doi: 10.1038/s41597-024-03582-9.

Abstract

A better understanding of what makes a compound a successful drug candidate is crucial for reducing the high attrition rates in drug discovery. Analyses of the differences between active compounds, clinical candidates and drugs require high-quality datasets. However, most datasets of drug discovery programs are not openly available. This work introduces a dataset of compound-target pairs extracted from the open-source bioactivity database ChEMBL (release 32). Compound-target pairs in the dataset either have at least one measured activity or are part of the manually curated set of known interactions in ChEMBL. Known interactions between drugs or clinical candidates and targets are specifically annotated to facilitate analyses of differences between drugs, clinical candidates, and other active compounds. In total, the dataset comprises 614,594 compound-target pairs, of which 5,109 are known interactions between drugs and targets and 3,932 are known interactions between clinical candidates and targets. The extraction is automated and fully reproducible. We provide not only the datasets but also the code to rerun the analyses with other ChEMBL releases.
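The inclusion rule in the abstract (a pair is kept if it has at least one measured activity or appears in the curated set of known interactions) can be illustrated with a minimal sketch. This is not the authors' extraction code: the identifiers, the toy records, the function names `classify_compound` and `build_pairs`, and the simplified mapping from maximum clinical phase to compound category are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

Pair = Tuple[str, str]  # (compound_id, target_id), hypothetical identifiers


def classify_compound(max_phase: Optional[float]) -> str:
    """Assumed, simplified mapping from max clinical phase to a category."""
    if max_phase is None or max_phase <= 0:
        return "other"
    if max_phase >= 4:
        return "drug"
    return "clinical_candidate"


def build_pairs(
    measured: Iterable[Pair],
    known: Iterable[Pair],
    max_phase: Dict[str, Optional[float]],
) -> Dict[Pair, Dict[str, object]]:
    """Union of measured and curated pairs, annotated per pair."""
    measured_set = set(measured)
    known_set = set(known)
    pairs: Dict[Pair, Dict[str, object]] = {}
    # A pair qualifies if it is in either set (set union).
    for pair in measured_set | known_set:
        compound_id, _target_id = pair
        pairs[pair] = {
            "has_measured_activity": pair in measured_set,
            "known_interaction": pair in known_set,
            "compound_category": classify_compound(max_phase.get(compound_id)),
        }
    return pairs


# Toy data (invented for this sketch, not taken from ChEMBL).
measured = [
    ("CPD_A", "TGT_1"),
    ("CPD_A", "TGT_2"),
    ("CPD_B", "TGT_1"),
    ("CPD_C", "TGT_3"),
]
known = [
    ("CPD_A", "TGT_1"),
    ("CPD_D", "TGT_4"),  # curated interaction without a measured activity
]
max_phase = {"CPD_A": 4, "CPD_B": None, "CPD_C": 2, "CPD_D": 3}

pairs = build_pairs(measured, known, max_phase)

# Known interactions of drugs and of clinical candidates, counted separately.
drug_known = sum(
    1 for a in pairs.values()
    if a["known_interaction"] and a["compound_category"] == "drug"
)
candidate_known = sum(
    1 for a in pairs.values()
    if a["known_interaction"] and a["compound_category"] == "clinical_candidate"
)
print(len(pairs), drug_known, candidate_known)
```

Keeping the two flags separate means a pair such as `("CPD_A", "TGT_2")`, which involves a drug but is not a curated interaction, stays distinguishable from the drug's known target pair.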
