Integration of various protein similarities using random forest technique to infer augmented drug-protein matrix for enhancing drug-disease association prediction.

Author information

Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok, Thailand.

Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand.

Publication information

Sci Prog. 2022 Jul-Sep;105(3):368504221109215. doi: 10.1177/00368504221109215.

DOI:10.1177/00368504221109215

PMID:35801312

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10358641/

Abstract

Identifying new therapeutic indications for existing drugs is a major challenge in drug repositioning. Most computational drug repositioning methods focus on known targets. Analyzing multiple aspects of protein associations offers a way to discover additional drug-associated proteins, which can then be used to improve the performance of drug repositioning approaches. In this study, machine learning models were developed to predict protein pairs associated with the same drugs, based on the similarities of diverse biological features: protein interaction, network topology, sequence alignment, and biological function. The crucial set of features was identified, and protein pair prediction achieved high performance, with an area under the curve (AUC) value of more than 93%. Drug similarity levels, computed from drug chemical structures, were then used to quantify the drug-associated proteins inferred from the promising protein pairs. These proteins were employed to build an augmented drug-protein matrix intended to enhance three existing drug repositioning techniques: similarity constrained matrix factorization for drug-disease associations (SCMFDD), an ensemble meta-paths and singular value decomposition (EMP-SVD) model, and a topology similarity and singular value decomposition (TS-SVD) technique. Compared with the original matrix, the augmented matrix improved performance by up to 4% for SCMFDD and EMP-SVD, and by about 1% for TS-SVD. In summary, inferring new protein pairs related to the same drugs increases the opportunity to reveal missing drug-associated proteins that are important for drug development via drug repositioning.
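The augmentation step can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so everything below is an assumption made for illustration: the function name `augment_drug_protein`, the 0.5 threshold, the toy identifiers, and the rule itself (a drug known for one protein of a predicted pair is linked to the other protein, weighted by its highest chemical similarity to any drug already known for that other protein). It is not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple


def augment_drug_protein(
    known: Dict[str, Set[str]],              # drug -> proteins with a known association
    protein_pairs: List[Tuple[str, str]],    # pairs predicted to share drugs (e.g. by a classifier)
    drug_sim: Dict[Tuple[str, str], float],  # symmetric drug-drug similarity in [0, 1]
    threshold: float = 0.5,                  # hypothetical cutoff, not taken from the paper
) -> Dict[str, Dict[str, float]]:
    """Return drug -> {protein: weight}; known links keep weight 1.0."""
    # Reverse index: protein -> drugs known to be associated with it.
    drugs_of: Dict[str, Set[str]] = {}
    for drug, proteins in known.items():
        for protein in proteins:
            drugs_of.setdefault(protein, set()).add(drug)

    def sim(a: str, b: str) -> float:
        # Look the pair up in either order; unknown pairs count as 0.
        if a == b:
            return 1.0
        return drug_sim.get((a, b), drug_sim.get((b, a), 0.0))

    augmented = {d: {p: 1.0 for p in ps} for d, ps in known.items()}
    for p, q in protein_pairs:
        # Propagate in both directions across the predicted pair.
        for src, dst in ((p, q), (q, p)):
            for drug in drugs_of.get(src, ()):
                if dst in known[drug]:
                    continue  # association already known
                # Assumed rule: weight = best similarity between this drug
                # and any drug already known for the destination protein.
                best = max(
                    (sim(drug, other) for other in drugs_of.get(dst, ())),
                    default=0.0,
                )
                if best >= threshold:
                    row = augmented[drug]
                    row[dst] = max(row.get(dst, 0.0), best)
    return augmented


# Toy data (hypothetical identifiers).
known = {"D1": {"P1"}, "D2": {"P2"}, "D3": {"P3"}}
pairs = [("P1", "P2")]
similarity = {("D1", "D2"): 0.8, ("D1", "D3"): 0.2}
result = augment_drug_protein(known, pairs, similarity)
```

In this toy case the predicted pair (P1, P2) links D1 to P2 and D2 to P1 with weight 0.8, while D3 keeps only its known association. The resulting weighted mapping could then be turned into a matrix and passed to a downstream drug-disease predictor in place of the original binary drug-protein matrix.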

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57b0/10358641/85e37ee3dfd5/10.1177_00368504221109215-fig1.jpg

