SSL-VQ: vector-quantized variational autoencoders for semi-supervised prediction of therapeutic targets across diverse diseases.

Author information

Namba Satoko, Li Chen, Yuyama Otani Noriko, Yamanishi Yoshihiro

Affiliations

Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, Kawazu, Iizuka, Fukuoka, 820-8502, Japan.

Department of Complex Systems Science, Graduate School of Informatics, Nagoya University, Chikusa, Nagoya, Aichi, 464-8601, Japan.

Publication information

Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf039.

Abstract

MOTIVATION

Identifying effective therapeutic targets poses a challenge in drug discovery, especially for uncharacterized diseases without known therapeutic targets (e.g. rare diseases, intractable diseases).

RESULTS

This study presents a novel machine learning approach using multimodal vector-quantized variational autoencoders (VQ-VAEs) for predicting therapeutic target molecules across diseases. To address the lack of known therapeutic target-disease associations, we incorporate the information on uncharacterized diseases without known targets or uncharacterized proteins without known indications (applicable diseases) in the semi-supervised learning (SSL) framework. The method integrates disease-specific and protein perturbation profiles with genetic perturbations (e.g. gene knockdowns and gene overexpressions) at the transcriptome level. Cross-cell representation learning, facilitated by VQ-VAEs, was performed to extract informative features from protein perturbation profiles across diverse human cell types. Concurrently, cross-disease representation learning was performed, leveraging VQ-VAE, to extract informative features reflecting disease states from disease-specific profiles. The model's applicability to uncharacterized diseases or proteins is enhanced by considering the consistency between disease-specific and patient-specific signatures. The efficacy of the method is demonstrated across three practical scenarios for 79 diseases: target repositioning for target-disease pairs, new target prediction for uncharacterized diseases, and new indication prediction for uncharacterized proteins. This method is expected to be valuable for identifying therapeutic targets across various diseases.
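As a rough illustration of the vector quantization step that gives VQ-VAEs their name, the sketch below maps each continuous encoder output to its nearest entry in a codebook. It is a minimal, dependency-free toy and not the SSL-VQ implementation (see the repository under Availability and Implementation for the authors' code); the function names `squared_distance` and `quantize`, the toy codebook, and the toy latent vectors are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each encoder output to its nearest codebook vector.

    Returns the chosen codebook indices and the quantized vectors.
    Ties are resolved in favour of the lowest codebook index.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for z in latents:
        # Nearest-neighbour lookup: the defining step of vector quantization.
        k = min(range(len(codebook)), key=lambda i: squared_distance(z, codebook[i]))
        indices.append(k)
        quantized.append(list(codebook[k]))
    return indices, quantized


# Hypothetical toy values: three 2-D "encoder outputs" and a two-entry codebook.
toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
toy_latents = [[0.1, -0.2], [0.9, 1.2], [0.4, 0.4]]
toy_indices, toy_quantized = quantize(toy_latents, toy_codebook)
print(toy_indices)  # [0, 1, 0]
```

In a trained VQ-VAE the codebook is learned jointly with the encoder and decoder, and the discrete indices serve as the compressed representation; here the codebook is fixed by hand so that the lookup itself is easy to follow.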

AVAILABILITY AND IMPLEMENTATION

Code: github.com/YamanishiLab/SSL-VQ and Data: 10.5281/zenodo.14644837.
