Data Sciences and Quantitative Biology, Discovery Sciences, Biopharmaceuticals R&D, AstraZeneca, Gaithersburg, MD 20878, United States.
Global Statistical Sciences, Eli Lilly, Indianapolis, IN 46285, United States.
Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad683.
The LINCS L1000 project has collected gene expression profiles for thousands of compounds across a wide array of concentrations, cell lines, and time points. However, conventional analysis methods often fall short in capturing the rich information encapsulated within the L1000 transcriptional dose-response data.
We present DOSE-L1000, a database that characterizes the potency and efficacy of compound-gene pairs and the landscape of compound-induced transcriptional changes. We fitted over 140 million generalized additive models and robust linear models, covering all compounds and landmark genes in the LINCS L1000 database. This systematic approach provides quantitative estimates of differential gene expression and of the potency and efficacy of compound-gene pairs across diverse cellular contexts. Through examples, we showcase the application of DOSE-L1000 to tasks such as cell line and compound comparisons, clustering analyses, and prediction of drug-target interactions. DOSE-L1000 fosters applications in drug discovery, accelerating the transition to omics-driven drug development.
DOSE-L1000 is publicly available at https://doi.org/10.5281/zenodo.8286375.
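As an illustration of what fitting a robust linear model to one compound-gene dose-response series can look like, the sketch below fits expression against log10 dose with Huber-weighted iteratively reweighted least squares. It is a minimal sketch under stated assumptions, not the DOSE-L1000 pipeline: the abstract does not specify the model formula or the estimator, the function name `huber_line_fit` and its parameters are hypothetical, and the dose and expression values are synthetic.

```python
import math
from statistics import median


def huber_line_fit(x, y, k=1.345, n_iter=50, tol=1e-10):
    """Fit y = intercept + slope * x by Huber-weighted IRLS.

    Hypothetical helper for illustration only. Assumes x contains at
    least two distinct values. With n_iter=1 it returns the ordinary
    least-squares fit, because all weights start at 1.
    """
    w = [1.0] * len(x)
    intercept = slope = math.inf
    for _ in range(n_iter):
        # Weighted least squares for a straight line (closed form).
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        new_slope = sxy / sxx
        new_intercept = my - new_slope * mx

        resid = [yi - (new_intercept + new_slope * xi) for xi, yi in zip(x, y)]
        # Robust scale estimate: median absolute residual / 0.6745.
        scale = median([abs(r) for r in resid]) / 0.6745

        converged = (abs(new_slope - slope) < tol
                     and abs(new_intercept - intercept) < tol)
        slope, intercept = new_slope, new_intercept
        if converged or scale == 0:
            break

        # Huber weights: 1 inside the threshold, shrinking outside it.
        cut = k * scale
        w = [1.0 if abs(r) <= cut else cut / abs(r) for r in resid]
    return intercept, slope


# Synthetic series (made-up values): expression = 1.0 + 0.5 * log10(dose),
# with one aberrant measurement at the highest dose.
log_dose = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0]
expression = [1.0 + 0.5 * d for d in log_dose]
expression[-1] += 5.0

ols_intercept, ols_slope = huber_line_fit(log_dose, expression, n_iter=1)
rob_intercept, rob_slope = huber_line_fit(log_dose, expression)
print(f"OLS slope:    {ols_slope:.3f}")
print(f"Robust slope: {rob_slope:.3f}")
```

In this toy series the single aberrant point pulls the least-squares slope well away from the underlying trend, while the Huber-weighted fit progressively downweights it. A nonlinear dose-response shape would instead call for a smoother such as a generalized additive model, which this sketch does not attempt.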