
A model with deep analysis on a large drug network for drug classification.

Authors

Wu Chenhao, Chen Lei

Affiliation

College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China.

Publication

Math Biosci Eng. 2023 Jan;20(1):383-401. doi: 10.3934/mbe.2023018. Epub 2022 Oct 9.

DOI:10.3934/mbe.2023018
PMID:36650771
Abstract

Drugs are an important means to treat various diseases. They are classified into several classes to indicate their properties and effects. Those in the same class always share some important features. The Kyoto Encyclopedia of Genes and Genomes (KEGG) DRUG recently reported a new drug classification system that classifies drugs into 14 classes. Correct identification of the class for any possible drug-like compound is helpful to roughly determine its effects for a particular type of disease. Experiments could be conducted to confirm such latent effects, thus accelerating the procedures for discovering novel drugs. In this study, this classification system was investigated. A classification model was proposed to assign one of the classes in the system to any given drug for the first time. Different from traditional fingerprint features, which indicated essential drug properties alone and were very popular in investigating drug-related problems, drugs were represented by novel features derived from a large drug network via a well-known network embedding algorithm called Node2vec. These features abstracted the drug associations generated from their essential properties, and they could overview each drug with all drugs as background. As class sizes were of great differences, synthetic minority over-sampling technique (SMOTE) was employed to tackle the imbalance problem. A balanced dataset was fed into the support vector machine to build the model. The 10-fold cross-validation results suggested the excellent performance of the model. This model was also superior to models using other drug features, including those generated by another network embedding algorithm and fingerprint features. Furthermore, this model provided more balanced performance across all classes than that without SMOTE.
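The abstract says SMOTE was used to balance the class sizes before training the support vector machine. The following is a minimal sketch of the SMOTE interpolation idea only, not the authors' implementation: the function name `smote_oversample`, its parameters, and the toy data are all hypothetical, and it uses plain Python lists instead of the Node2vec features described above.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Oversample each minority class up to the size of the largest class.

    Hypothetical helper for illustration; classes with fewer than two
    samples are left unchanged because there is nothing to interpolate with.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            continue
        for _ in range(target - count):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base point, excluding itself
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from base to mate
            X_out.append([b + gap * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): class "A" has 4 samples, class "B" only 2.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [5.0, 5.0], [5.2, 5.1]]
y = ["A", "A", "A", "A", "B", "B"]
X_bal, y_bal = smote_oversample(X, y, k=1)
print(Counter(y_bal))
```

Each synthetic sample lies on the line segment between a minority-class point and one of its nearest same-class neighbours, so the minority class grows without exact duplication. In practice a maintained library implementation would normally be preferred over a hand-written one, and the balanced set would then be passed to the classifier.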


Similar articles

1. A model with deep analysis on a large drug network for drug classification.
Math Biosci Eng. 2023 Jan;20(1):383-401. doi: 10.3934/mbe.2023018. Epub 2022 Oct 9.
2. iATC-NRAKEL: an efficient multi-label classifier for recognizing anatomical therapeutic chemical classes of drugs.
Bioinformatics. 2020 Mar 1;36(5):1391-1396. doi: 10.1093/bioinformatics/btz757.
3. Drug Target Group Prediction with Multiple Drug Networks.
Comb Chem High Throughput Screen. 2020;23(4):274-284. doi: 10.2174/1386207322666190702103927.
4. Classification of toxicity effects of biotransformed hepatic drugs using whale optimized support vector machines.
J Biomed Inform. 2017 Apr;68:132-149. doi: 10.1016/j.jbi.2017.03.002. Epub 2017 Mar 8.
5. Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification.
Int J Environ Res Public Health. 2021 Feb 23;18(4):2197. doi: 10.3390/ijerph18042197.
6. A Network Integration Method for Deciphering the Types of Metabolic Pathway of Chemicals with Heterogeneous Information.
Comb Chem High Throughput Screen. 2018;21(9):670-680. doi: 10.2174/1386207322666181206112641.
7. Inverse free reduced universum twin support vector machine for imbalanced data classification.
Neural Netw. 2023 Jan;157:125-135. doi: 10.1016/j.neunet.2022.10.003. Epub 2022 Oct 15.
8. Prediction of Drug Combinations with a Network Embedding Method.
Comb Chem High Throughput Screen. 2018;21(10):789-797. doi: 10.2174/1386207322666181226170140.
9. DeepStack-DTIs: Predicting Drug-Target Interactions Using LightGBM Feature Selection and Deep-Stacked Ensemble Classifier.
Interdiscip Sci. 2022 Jun;14(2):311-330. doi: 10.1007/s12539-021-00488-7. Epub 2021 Nov 3.
10. LVQ-SMOTE - Learning Vector Quantization based Synthetic Minority Over-sampling Technique for biomedical data.
BioData Min. 2013 Oct 2;6(1):16. doi: 10.1186/1756-0381-6-16.

Cited by

1. Exploring Prognostic Gene Factors in Breast Cancer via Machine Learning.
Biochem Genet. 2024 Dec;62(6):5022-5050. doi: 10.1007/s10528-024-10712-w. Epub 2024 Feb 21.
2. PredictEFC: a fast and efficient multi-label classifier for predicting enzyme family classes.
BMC Bioinformatics. 2024 Jan 30;25(1):50. doi: 10.1186/s12859-024-05665-1.
3. Identification of key gene expression associated with quality of life after recovery from COVID-19.
Med Biol Eng Comput. 2024 Apr;62(4):1031-1048. doi: 10.1007/s11517-023-02988-8. Epub 2023 Dec 21.
4. Patterns of Gene Expression Profiles Associated with Colorectal Cancer in Colorectal Mucosa by Using Machine Learning Methods.
Comb Chem High Throughput Screen. 2024;27(19):2921-2934. doi: 10.2174/0113862073266300231026103844.
5. Identification of Colon Immune Cell Marker Genes Using Machine Learning Methods.
Life (Basel). 2023 Sep 7;13(9):1876. doi: 10.3390/life13091876.
6. Identification of Gene Markers Associated with COVID-19 Severity and Recovery in Different Immune Cell Subtypes.
Biology (Basel). 2023 Jul 2;12(7):947. doi: 10.3390/biology12070947.
7. Identification of Phase-Separation-Protein-Related Function Based on Gene Ontology by Using Machine Learning Methods.
Life (Basel). 2023 May 31;13(6):1306. doi: 10.3390/life13061306.
8. Machine Learning Classification of Time since BNT162b2 COVID-19 Vaccination Based on Array-Measured Antibody Activity.
Life (Basel). 2023 May 31;13(6):1304. doi: 10.3390/life13061304.
9. Using Machine Learning Methods in Identifying Genes Associated with COVID-19 in Cardiomyocytes and Cardiac Vascular Endothelial Cells.
Life (Basel). 2023 Apr 14;13(4):1011. doi: 10.3390/life13041011.
10. Immune responses of different COVID-19 vaccination strategies by analyzing single-cell RNA sequencing data from multiple tissues using machine learning methods.
Front Genet. 2023 Mar 17;14:1157305. doi: 10.3389/fgene.2023.1157305. eCollection 2023.