Narykov Oleksandr, Zhu Yitan, Brettin Thomas, Evrard Yvonne A, Partin Alexander, Xia Fangfang, Shukla Maulik, Vasanthakumari Priyanka, Doroshow James H, Stevens Rick L
Computing, Environment and Life Sciences, Argonne National Laboratory, 9700 S Cass Ave, Lemont, IL 60439, United States.
Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, 8560 Progress Drive, Frederick, MD 21702, United States.
Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf134.
Drug response prediction (DRP) methods tackle the complex task of associating the effectiveness of small molecules with the specific genetic makeup of the patient. Anti-cancer DRP is particularly challenging and requires costly experiments, because the underlying pathogenic mechanisms are broad and involve multiple genomic pathways. The scientific community has invested significant effort in generating public drug screening datasets, paving the way for machine learning models that attempt to reason over the complex data space of small compounds and the biological characteristics of tumors. However, the data depth is still lacking compared to application domains such as computer vision or natural language processing, which limits current learning capabilities. To combat this issue and improve the generalizability of DRP models, we explore strategies that explicitly address the imbalance in DRP datasets. We reframe the problem as a multi-objective optimization across multiple drugs to maximize deep learning model performance. We implement this approach by constructing the Multi-Objective Optimization Regularized by Loss Entropy loss function and plugging it into a deep learning model. We demonstrate the utility of the proposed drug discovery methods and make suggestions for further potential applications of the work to achieve desirable outcomes in the healthcare field.
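The abstract does not give the exact form of the loss, so the following is only a minimal sketch of what an entropy-regularized, per-drug objective could look like. It assumes (hypothetically) that each drug contributes its own mean squared error, that these per-drug losses are normalized into a distribution, and that the Shannon entropy of that distribution is rewarded so that no single drug dominates training. The function names, the choice of squared error, and the weight `lam` are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import defaultdict


def per_drug_losses(drug_ids, y_true, y_pred):
    """Mean squared error computed separately for each drug (one objective per drug)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for drug, target, pred in zip(drug_ids, y_true, y_pred):
        sums[drug] += (target - pred) ** 2
        counts[drug] += 1
    return {drug: sums[drug] / counts[drug] for drug in sums}


def loss_entropy(losses, eps=1e-12):
    """Shannon entropy of the per-drug losses after normalizing them to sum to one.

    The entropy is highest when all drugs have equal loss and near zero
    when a single drug accounts for almost all of the loss.
    """
    total = sum(losses) + eps
    probs = [loss / total for loss in losses]
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_regularized_loss(drug_ids, y_true, y_pred, lam=0.1):
    """Average per-drug loss minus lam times the loss entropy.

    Assumption: subtracting the entropy term pushes the optimizer toward
    solutions where the error is spread evenly across drugs.
    """
    losses = list(per_drug_losses(drug_ids, y_true, y_pred).values())
    mean_loss = sum(losses) / len(losses)
    return mean_loss - lam * loss_entropy(losses)


if __name__ == "__main__":
    # Toy data: drug "A" has two measurements, drug "B" has one.
    drugs = ["A", "A", "B"]
    observed = [1.0, 1.0, 0.0]
    predicted = [0.0, 0.0, 1.0]
    print(per_drug_losses(drugs, observed, predicted))
    print(entropy_regularized_loss(drugs, observed, predicted))
```

Averaging per drug rather than per sample means a drug with few measurements carries the same weight as a heavily screened one, which is one simple way to address dataset imbalance. In a deep learning framework the same computation would be written with differentiable tensor operations so that gradients flow through both terms.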