Fine-tuning BERT for automatic ADME semantic labeling in FDA drug labeling to enhance product-specific guidance assessment.

Authors

Shi Yiwen, Wang Jing, Ren Ping, ValizadehAslani Taha, Zhang Yi, Hu Meng, Liang Hualou

Affiliations

College of Computing and Informatics, Drexel University, Philadelphia, PA, United States.

Office of Research and Standards, Office of Generic Drugs, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, United States.

Publication

J Biomed Inform. 2023 Feb;138:104285. doi: 10.1016/j.jbi.2023.104285. Epub 2023 Jan 9.

Abstract

Product-specific guidances (PSGs) recommended by the United States Food and Drug Administration (FDA) are instrumental in promoting and guiding generic drug product development. To assess a PSG, an FDA assessor must spend extensive time and effort manually retrieving supportive drug information on absorption, distribution, metabolism, and excretion (ADME) from the reference listed drug labeling. In this work, we leveraged state-of-the-art pre-trained language models to automatically label the ADME paragraphs in the pharmacokinetics section of FDA-approved drug labeling to facilitate PSG assessment. We applied a transfer learning approach, fine-tuning the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to develop a novel application of ADME semantic labeling that automatically retrieves ADME paragraphs from drug labeling in place of manual work. We demonstrate that fine-tuning the pre-trained BERT model outperforms conventional machine learning techniques, achieving up to a 12.5% absolute F1 improvement. To our knowledge, this is the first successful application of BERT to the ADME semantic labeling task. We further assessed the relative contributions of pre-training and fine-tuning to the overall performance of the BERT model on this task using a series of analysis methods, such as attention similarity and layer-based ablations. Our analysis revealed that the information learned via fine-tuning is focused on task-specific knowledge in the top layers of BERT, whereas the benefit of the pre-trained model comes from the bottom layers.
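The abstract frames the task as classifying pharmacokinetics paragraphs into the four ADME categories and compares fine-tuned BERT against conventional machine learning baselines. The following is a minimal sketch of what such a conventional baseline could look like, assuming scikit-learn is available; the paragraphs and pipeline below are illustrative inventions, not the paper's actual dataset, features, or models.

```python
# Hedged sketch of a conventional ML baseline (TF-IDF + logistic regression)
# for ADME paragraph classification. Toy paragraphs stand in for drug-labeling
# pharmacokinetics text; only the four class names come from the abstract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

paragraphs = [
    "The drug is rapidly absorbed after oral administration, with peak plasma concentrations within 2 hours.",
    "Plasma protein binding is approximately 95 percent and the apparent volume of distribution is 10 L/kg.",
    "The compound is extensively metabolized by hepatic CYP3A4 to inactive metabolites.",
    "About 60 percent of the dose is excreted unchanged in the urine within 24 hours.",
] * 3  # duplicate to give the classifier a few examples per class
labels = ["absorption", "distribution", "metabolism", "excretion"] * 3

# Bag-of-words features fed to a linear classifier: the kind of baseline
# the paper reports BERT fine-tuning outperforms.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(paragraphs, labels)

# Classify an unseen paragraph; lexical overlap ("excreted", "urine")
# should pull it toward the excretion class.
pred = model.predict(
    ["Most of the administered dose is excreted in urine as unchanged drug."]
)[0]
print(pred)
```

In the paper, the fine-tuned BERT model replaces this feature-engineering step with contextual representations learned during pre-training, which is where its reported F1 advantage over such baselines comes from.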
