Abok Jeremiah I, Edwards Jeremy S, Yang Jeremy J
Department of Chemistry and Chemical Biology, University of New Mexico, Albuquerque, NM, United States.
Department of Internal Medicine, Division of Translational Informatics, University of New Mexico Health Sciences Center, Albuquerque, NM, United States.
Front Bioinform. 2025 Jun 9;5:1579865. doi: 10.3389/fbinf.2025.1579865. eCollection 2025.
Identifying disease-target associations is a pivotal step in drug discovery, offering insights that guide the development and optimization of therapeutic interventions. Clinical trial data serves as a valuable source for inferring these associations. However, issues such as inconsistent data quality and limited interpretability pose significant challenges. To overcome these limitations, an integrated approach is required that consolidates evidence from diverse data sources to support the effective prioritization of biological targets for further research.
We developed a data integration and visualization pipeline to infer and evaluate associations between diseases and both known and potential drug targets. The pipeline integrates clinical trial data with standardized metadata and supports queries in both directions: finding the diseases linked to a given drug target, and finding the drug targets associated with a given disease. Multivariate evidence from multiple studies is aggregated over harmonized datasets to ensure consistency and reliability. Disease-target associations are then systematically ranked and filtered using a rational scoring framework that assigns confidence scores derived from the aggregated statistical metrics.
Our pipeline evaluates disease-target associations by linking protein-coding genes to diseases and incorporates a confidence assessment method based on aggregated evidence. Metrics such as meanRank scores are employed to prioritize associations, enabling researchers to focus on the most promising hypotheses. This systematic approach streamlines the identification and prioritization of biological targets, enhancing hypothesis generation and evidence-based decision-making.
The pipeline provides a scalable approach to hypothesis generation, scoring, and ranking in drug discovery. It is an open-source tool, is supplied with publicly available datasets, and is designed for ease of use by researchers. It supports data-driven prioritization of biological targets and thereby facilitates the discovery of novel therapeutic opportunities.
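The meanRank prioritization described above can be illustrated with a minimal sketch. The abstract does not specify which evidence variables are ranked or how ties are handled, so the variable names (`n_studies`, `n_drugs`), the gene and disease labels, the descending rank order, and the average-rank tie rule below are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean


def rank_descending(values):
    """Return 1-based ranks (1 = largest value); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend the run of tied values starting at position `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def mean_rank(associations, variables):
    """Rank each evidence variable separately, then average the ranks per association."""
    per_variable = [rank_descending([a[v] for a in associations]) for v in variables]
    scored = [
        {**a, "meanRank": mean(r[i] for r in per_variable)}
        for i, a in enumerate(associations)
    ]
    # A lower meanRank means stronger aggregated evidence.
    return sorted(scored, key=lambda a: a["meanRank"])


# Hypothetical evidence counts for three disease-target associations.
associations = [
    {"disease": "D1", "gene": "GENE_A", "n_studies": 12, "n_drugs": 3},
    {"disease": "D1", "gene": "GENE_B", "n_studies": 4, "n_drugs": 5},
    {"disease": "D1", "gene": "GENE_C", "n_studies": 4, "n_drugs": 1},
]
ranked = mean_rank(associations, ["n_studies", "n_drugs"])
for row in ranked:
    print(row["gene"], row["meanRank"])
```

Averaging ranks rather than raw values keeps variables on different scales comparable, which is one plausible reason to use a rank-based score when combining heterogeneous evidence.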