Fine-tuning BERT for automatic ADME semantic labeling in FDA drug labeling to enhance product-specific guidance assessment.

Authors

Shi Yiwen, Wang Jing, Ren Ping, ValizadehAslani Taha, Zhang Yi, Hu Meng, Liang Hualou

Affiliations

College of Computing and Informatics, Drexel University, Philadelphia, PA, United States.

Office of Research and Standards, Office of Generic Drugs, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, United States.

Publication

J Biomed Inform. 2023 Feb;138:104285. doi: 10.1016/j.jbi.2023.104285. Epub 2023 Jan 9.

Abstract

Product-specific guidances (PSGs) recommended by the United States Food and Drug Administration (FDA) are instrumental in promoting and guiding generic drug product development. To assess a PSG, an FDA assessor must spend extensive time and effort manually retrieving supportive drug information on absorption, distribution, metabolism, and excretion (ADME) from the reference listed drug labeling. In this work, we leveraged state-of-the-art pre-trained language models to automatically label the ADME paragraphs in the pharmacokinetics section of FDA-approved drug labeling to facilitate PSG assessment. We applied a transfer learning approach, fine-tuning the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to develop a novel application of ADME semantic labeling that automatically retrieves ADME paragraphs from drug labeling in place of manual work. We demonstrate that fine-tuning the pre-trained BERT model outperforms conventional machine learning techniques, achieving up to a 12.5% absolute F1 improvement. To our knowledge, this is the first successful application of BERT to the ADME semantic labeling task. We further assessed the relative contributions of pre-training and fine-tuning to the overall performance of the BERT model on this task using a series of analysis methods, such as attention similarity and layer-based ablations. Our analysis revealed that the information learned via fine-tuning is focused on task-specific knowledge in the top layers of BERT, whereas the benefit of the pre-trained model comes from the bottom layers.
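The abstract frames the task as classifying pharmacokinetics paragraphs into the four ADME categories and compares fine-tuned BERT against conventional machine learning baselines. The following is a minimal sketch of what such a conventional baseline could look like, assuming scikit-learn is available; the paragraphs and pipeline below are illustrative inventions, not the paper's actual dataset, features, or models.

```python
# Hedged sketch of a conventional ML baseline (TF-IDF + logistic regression)
# for ADME paragraph classification. Toy paragraphs stand in for drug-labeling
# pharmacokinetics text; only the four class names come from the abstract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

paragraphs = [
    "The drug is rapidly absorbed after oral administration, with peak plasma concentrations within 2 hours.",
    "Plasma protein binding is approximately 95 percent and the apparent volume of distribution is 10 L/kg.",
    "The compound is extensively metabolized by hepatic CYP3A4 to inactive metabolites.",
    "About 60 percent of the dose is excreted unchanged in the urine within 24 hours.",
] * 3  # duplicate to give the classifier a few examples per class
labels = ["absorption", "distribution", "metabolism", "excretion"] * 3

# Bag-of-words features fed to a linear classifier: the kind of baseline
# the paper reports BERT fine-tuning outperforms.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(paragraphs, labels)

# Classify an unseen paragraph; lexical overlap ("excreted", "urine")
# should pull it toward the excretion class.
pred = model.predict(
    ["Most of the administered dose is excreted in urine as unchanged drug."]
)[0]
print(pred)
```

In the paper, the fine-tuned BERT model replaces this feature-engineering step with contextual representations learned during pre-training, which is where its reported F1 advantage over such baselines comes from.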
