Jayagopal Aishwarya, Walsh Robert J, Hariprasannan Krishna Kumar, Mariappan Ragunathan, Mahapatra Debabrata, Jaynes Patrick William, Lim Diana, Peng Tan David Shao, Tan Tuan Zea, Pitt Jason J, Jeyasekharan Anand D, Rajan Vaibhav
Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore.
Department of Haematology-Oncology, National University Cancer Institute, NUHS Tower Block, Level 7, 1E Kent Ridge Road, Singapore 119228, Singapore.
iScience. 2025 Feb 11;28(3):111992. doi: 10.1016/j.isci.2025.111992. eCollection 2025 Mar 21.
Next-generation sequencing (NGS) is increasingly used in oncological practice; however, only a minority of patients benefit from targeted therapy. Developing drug response prediction (DRP) models is important for the "untargetable" majority. Prior DRP models typically use whole-transcriptome and whole-exome sequencing data, which are clinically unavailable. We aim to develop a DRP model for chemotherapy repurposing that requires only information from clinical-grade NGS (cNGS) panels of restricted gene sets. Data sparsity and limited patient drug response information make this challenging. We first show that existing DRP models perform equally well with whole-exome and cNGS (∼300 genes) data. We then describe Drug IDentifier (DruID), a DRP model for restricted gene sets that uses transfer learning, variant annotations, domain-invariant representation learning, and multi-task learning. DruID outperformed state-of-the-art DRP methods on pan-cancer data and showed robust response classification on two real-world clinical datasets, representing a step toward a clinically applicable DRP tool.