Comparability of mixed IC₅₀ data - a statistical analysis.

Affiliation

Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Basel, Switzerland.

Publication information

PLoS One. 2013 Apr 16;8(4):e61007. doi: 10.1371/journal.pone.0061007. Print 2013.

DOI:10.1371/journal.pone.0061007

PMID:23613770

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3628986/

Abstract

The biochemical half-maximal inhibitory concentration (IC50) is the most commonly used metric for on-target activity in lead optimization. It is used to guide lead optimization and to build large-scale chemogenomics analyses and off-target activity and toxicity models from public data. However, using public biochemical IC50 data is problematic because the values are assay specific and comparable only under certain conditions. For large-scale analyses it is not feasible to check each data entry manually, and it is tempting to mix all available IC50 values from public databases even when assay information is not reported. As previously reported for an analysis of Ki data, we first analyzed the types of errors, the redundancy, and the variability found in the ChEMBL IC50 data. To assess the variability of IC50 data measured independently in two different labs, we searched ChEMBL for cases with at least ten IC50 values for identical protein-ligand systems against the same target. Because too few cases of this type are available, the variability of IC50 data was instead assessed by comparing all pairs of independent IC50 measurements on identical protein-ligand systems. The standard deviation of IC50 data is only 25% larger than that of Ki data, suggesting that mixing IC50 data from different assays, even without knowing the details of the assay conditions, adds only a moderate amount of noise to the overall data. As expected, the standard deviation of public ChEMBL IC50 data was greater than that of in-house intra-laboratory/inter-day IC50 data. Augmenting mixed public IC50 data with public Ki data does not degrade the quality of the mixed IC50 data, provided the Ki values are corrected by an offset. For a broad dataset such as the ChEMBL database, a Ki-IC50 conversion factor of 2 was found to be the most reasonable.
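The pair-based variability estimate and the Ki offset described in the abstract can be illustrated with a minimal Python sketch. It assumes activities are expressed on a negative log molar scale (pIC50, pKi), that the two measurements in a pair are independent with equal variance, and that the conversion factor means IC50 ≈ 2 × Ki. That factor is what the Cheng-Prusoff relation gives for a competitive inhibitor when the substrate concentration equals Km. The function names, the data layout, and the example values are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def pairwise_sigma(measurements):
    """Estimate the per-measurement standard deviation from all pairs.

    `measurements` maps a (target, compound) key to a list of pIC50 values
    reported by independent sources (hypothetical layout).
    """
    diffs = []
    for values in measurements.values():
        # Every pair of independent measurements on the same system
        for a, b in combinations(values, 2):
            diffs.append(a - b)
    if not diffs:
        raise ValueError("need at least one system with two or more values")
    # For independent a, b with equal variance: E[(a - b)^2] = 2 * sigma^2
    mean_sq = sum(d * d for d in diffs) / len(diffs)
    return math.sqrt(mean_sq / 2.0)


def pki_to_pic50(pki, factor=2.0):
    """Shift a pKi onto the pIC50 scale, assuming IC50 = factor * Ki."""
    return pki - math.log10(factor)


if __name__ == "__main__":
    # Made-up illustrative values, not data from ChEMBL or the paper
    example = {
        ("target_A", "compound_1"): [6.0, 7.0],
        ("target_A", "compound_2"): [8.1, 7.9, 8.3],
    }
    print(round(pairwise_sigma(example), 3))
    print(round(pki_to_pic50(7.0), 3))
```

Working on the log scale keeps the offset additive: a factor of 2 in concentration is a constant shift of log10(2), about 0.3 log units, which is small relative to typical inter-laboratory scatter.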

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa29/3628986/9a991178c092/pone.0061007.g001.jpg
