PharmaBench: Enhancing ADMET benchmarks with large language models.

Affiliations

MindRank AI, Hangzhou, Zhejiang, China.

National Heart and Lung Institute, Imperial College London, London, SW7 2AZ, UK.

Publication information

Sci Data. 2024 Sep 10;11(1):985. doi: 10.1038/s41597-024-03793-0.

DOI: 10.1038/s41597-024-03793-0
PMID: 39256394
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11387650/
Abstract

Accurately predicting ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) properties early in drug development is essential for selecting compounds with optimal pharmacokinetics and minimal toxicity. Existing ADMET-related benchmark sets are limited in utility due to their small dataset sizes and the lack of representation of compounds used in drug discovery projects. These shortcomings hinder their application in model building for drug discovery. To address this issue, we propose a multi-agent data mining system based on Large Language Models that effectively identifies experimental conditions within 14,401 bioassays. This approach facilitates merging entries from different sources, culminating in the creation of PharmaBench. Additionally, we have developed a data processing workflow to integrate data from various sources, resulting in 156,618 raw entries. Through this workflow, we constructed PharmaBench, a comprehensive benchmark set for ADMET properties, which comprises eleven ADMET datasets and 52,482 entries. This benchmark set is designed to serve as an open-source dataset for the development of AI models relevant to drug discovery projects.
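
The abstract says that identifying experimental conditions makes it possible to merge entries from different sources. The following is a minimal sketch of that general idea only, not the authors' pipeline: the record fields (`smiles`, `conditions`, `value`), the function name `merge_entries`, and the choice of the median to combine duplicates are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def merge_entries(records):
    """Group records by (compound, experimental conditions) and combine duplicates.

    Hypothetical record layout, assumed for this sketch:
      {"smiles": str, "conditions": dict, "value": float}
    """
    groups = defaultdict(list)
    for rec in records:
        # Sort the condition items so that dict ordering does not change the key.
        key = (rec["smiles"], tuple(sorted(rec["conditions"].items())))
        groups[key].append(rec["value"])

    # One merged entry per (compound, conditions) pair; the median is an
    # assumed aggregation rule, not one stated in the abstract.
    return [
        {
            "smiles": smiles,
            "conditions": dict(cond),
            "value": median(values),
            "n_records": len(values),
        }
        for (smiles, cond), values in groups.items()
    ]


records = [
    {"smiles": "CCO", "conditions": {"pH": 7.4, "buffer": "PBS"}, "value": 1.0},
    {"smiles": "CCO", "conditions": {"buffer": "PBS", "pH": 7.4}, "value": 3.0},
    {"smiles": "CCO", "conditions": {"pH": 2.0, "buffer": "PBS"}, "value": 5.0},
]
merged = merge_entries(records)
```

In this toy input, the first two records share a compound and conditions and collapse into one entry, while the third stays separate because its pH differs. Measurements taken under different conditions are kept apart rather than averaged together, which is why the conditions have to be extracted before merging.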

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f9b6/11387650/2bb55c14b98c/41597_2024_3793_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f9b6/11387650/3f784fd1e6b6/41597_2024_3793_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f9b6/11387650/88088df2373c/41597_2024_3793_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f9b6/11387650/0026c05b995d/41597_2024_3793_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f9b6/11387650/195eb6300f9a/41597_2024_3793_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f9b6/11387650/8b4bd5d9f053/41597_2024_3793_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f9b6/11387650/594ee07ea417/41597_2024_3793_Fig7_HTML.jpg