
Prediction and Screening Model for Products Based on Fusion Regression and XGBoost Classification.

Affiliations

College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China.

Institute of Computer Application, China Academy of Engineering Physics, Mianyang 621900, China.

Publication information

Comput Intell Neurosci. 2022 Jul 31;2022:4987639. doi: 10.1155/2022/4987639. eCollection 2022.

DOI:10.1155/2022/4987639
PMID:35958779
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9357736/
Abstract

Predicting the performance of candidates and screening them by predicted performance are central to product development. For example, performance prediction and screening of equipment components and parts help guarantee the reliability of equipment products, and prediction and screening of drug bioactivity and performance are key to pharmaceutical product development. Pharmaceutical discovery fails mainly because candidate compounds have low bioactivity or show deficiencies in efficacy and safety, which are related to the absorption, distribution, metabolism, excretion, and toxicity (ADMET) of the compounds. It is therefore necessary to predict bioactivity values and evaluate ADMET properties for candidate compounds quickly, effectively, and systematically in the early stage of drug discovery. This paper proposes a data-driven screening and prediction model for pharmaceutical products that selects drug candidates with higher bioactivity values and better ADMET properties. First, a quantitative method for predicting bioactivity values is proposed that uses fusion regression of LGBM and a backpropagation neural network (BP-NN). Then, a method for predicting ADMET properties is proposed that uses XGBoost. Based on the predicted bioactivity values and ADMET properties, the BVAP method is defined to screen the drug candidates. The screening model is validated on a dataset of active compounds that antagonize Er: the mean square error (MSE) of the fusion regression is 1.1496, and the XGBoost prediction accuracy for the ADMET properties is 94.0% for Caco-2, 95.7% for CYP3A4, 89.4% for hERG, 88.6% for HOB, and 96.2% for MN. Compared with methods commonly used for ADMET properties, such as SVM, RF, KNN, LDA, and NB, the XGBoost model in this paper has the highest prediction accuracy and AUC value, so it provides better guidance and can help screen pharmaceutical product candidates with good bioactivity, pharmacokinetic properties, and safety.
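The abstract names three stages (fusion regression, ADMET classification, and screening) but does not spell out the fusion rule or the BVAP criterion. The sketch below is only an illustration under stated assumptions: the fusion is taken to be a weighted average of two regressors' predictions, and the screening rule keeps candidates whose fused bioactivity and number of favorable ADMET predictions both clear a threshold. The names `fuse`, `mse`, `Candidate`, and `screen`, the thresholds, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


def fuse(pred_a: Sequence[float], pred_b: Sequence[float],
         weight_a: float = 0.5) -> List[float]:
    """Weighted average of two regressors' predictions (assumed fusion rule)."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean square error, the regression metric reported in the abstract."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


@dataclass
class Candidate:
    name: str
    bioactivity: float        # fused predicted bioactivity value
    admet: Dict[str, bool]    # True = classifier predicts a favorable outcome


def screen(candidates: Sequence[Candidate], min_bioactivity: float,
           min_favorable: int) -> List[Candidate]:
    """Keep candidates that clear both thresholds (hypothetical rule),
    ordered from highest to lowest predicted bioactivity."""
    kept = [c for c in candidates
            if c.bioactivity >= min_bioactivity
            and sum(c.admet.values()) >= min_favorable]
    return sorted(kept, key=lambda c: c.bioactivity, reverse=True)


# Made-up predictions standing in for the LGBM and BP-NN outputs.
pred_lgbm = [6.0, 7.0, 8.0]
pred_bpnn = [7.0, 7.0, 6.0]
fused = fuse(pred_lgbm, pred_bpnn)          # equal weights by default
error = mse([6.5, 8.0, 7.0], fused)         # made-up reference values

props = ["Caco-2", "CYP3A4", "hERG", "HOB", "MN"]
flags = [
    [True, True, True, True, True],
    [True, False, False, True, False],
    [True, True, False, True, True],
]
candidates = [
    Candidate(f"cmpd_{label}", value, dict(zip(props, row)))
    for label, value, row in zip("ABC", fused, flags)
]
selected = screen(candidates, min_bioactivity=6.0, min_favorable=3)
print([c.name for c in selected], round(error, 4))
```

In practice the two prediction lists would come from trained regressors and the boolean flags from one trained classifier per ADMET property; the fusion weight and both thresholds would need to be tuned on validation data.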
