

Self-Supervised Learning for Label Sparsity in Computational Drug Repositioning.

Author information

Yang Xinxing, Yang Genke, Chu Jian

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Sep-Oct;20(5):3245-3256. doi: 10.1109/TCBB.2023.3254163. Epub 2023 Oct 9.

Abstract

Computational drug repositioning aims to discover new uses for marketed drugs, which can accelerate drug development and plays an important role in the existing drug discovery system. However, validated drug-disease associations are scarce relative to the number of drugs and diseases in the real world. With too few labeled samples, a classification model cannot learn effective latent factors of drugs, which results in poor generalization performance. In this work, we propose a multi-task self-supervised learning framework for computational drug repositioning. The framework tackles label sparsity by learning a better drug representation. Specifically, we take drug-disease association prediction as the main task. The auxiliary task uses data augmentation strategies and contrastive learning to mine the internal relationships of the original drug features, so that a better drug representation is learned automatically without supervised labels. Joint training ensures that the auxiliary task improves the prediction accuracy of the main task. More precisely, the auxiliary task improves the drug representation and serves as additional regularization to improve generalization. Furthermore, we design a multi-input decoding network to improve the reconstruction ability of the autoencoder model. We evaluate our model on three real-world datasets. The experimental results demonstrate the effectiveness of the multi-task self-supervised learning framework, and its predictive ability is superior to that of the state-of-the-art model.
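The abstract does not give the loss functions, so the following is only a minimal sketch of how a main-task loss and a contrastive auxiliary loss might be combined in joint training. Everything in it is an assumption made for illustration rather than the authors' method: random feature masking as the augmentation, an InfoNCE-style contrastive loss, binary cross-entropy for the association prediction, and the hypothetical names `mask_features`, `info_nce`, `bce`, `joint_loss`, and `aux_weight`. The encoder is omitted; `view_a` and `view_b` stand for the embeddings of two augmented views of the same batch of drugs.

```python
import math
import random


def mask_features(x, drop_prob, rng):
    """Assumed augmentation: zero each feature independently with drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def info_nce(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss: row i of view_a should match row i of view_b,
    with all other rows of view_b acting as negatives."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(view_a)


def bce(probs, labels, eps=1e-12):
    """Binary cross-entropy for predicted drug-disease association scores."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def joint_loss(probs, labels, view_a, view_b, aux_weight=0.1):
    """Main-task loss plus a weighted auxiliary contrastive loss."""
    return bce(probs, labels) + aux_weight * info_nce(view_a, view_b)


if __name__ == "__main__":
    rng = random.Random(0)
    drugs = [[1.0, 0.0, 2.0, 0.5], [0.0, 3.0, 0.0, 1.0], [2.0, 2.0, 0.0, 0.0]]
    # Two independently masked views of the same drug features.
    view_a = [mask_features(d, 0.25, rng) for d in drugs]
    view_b = [mask_features(d, 0.25, rng) for d in drugs]
    print(joint_loss([0.9, 0.2, 0.6], [1, 0, 1], view_a, view_b))
```

In this sketch the auxiliary term needs no association labels, which is the sense in which it can act as a regularizer when labeled drug-disease pairs are sparse; the weight `aux_weight` is a hypothetical hyperparameter that trades off the two tasks.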

