Using Large Language Models for Advanced and Flexible Labelling of Protocol Deviations in Clinical Development.

Authors

Zou Min, Popko Leszek, Gaudio Michelle

Affiliations

F. Hoffmann-La Roche AG, 4070, Basel, Switzerland.

Hoffmann-La Roche Limited, 7070 Mississauga Road, Mississauga, ON, L5N 5M8, Canada.

Publication information

Ther Innov Regul Sci. 2025 May 13. doi: 10.1007/s43441-025-00785-z.

DOI:10.1007/s43441-025-00785-z

PMID:40360901

Abstract

BACKGROUND

ICH E3 Q&A R1 (International Council for Harmonisation. E3: Structure and content of clinical study reports-questions and answers (R1). 6 July 2012. Available from: https://database.ich.org/sites/default/files/E3_Q%26As_R1_Q%26As.pdf ) defines a protocol deviation (PD) as "any change, divergence, or departure from the study design or procedures defined in the protocol". A problematic area in human subject protection is the wide divergence among institutions, sponsors, investigators and IRBs regarding both the definition of PDs and the procedures for reviewing them. Despite industry initiatives like TransCelerate's holistic approach [Galuchie et al. in Ther Innov Regul Sci 55:733-742, 2021], systematic trending and identification of impactful PDs remain limited. Traditional Natural Language Processing (NLP) methods are often cumbersome to implement, requiring extensive feature engineering and model tuning. The rise of Large Language Models (LLMs), however, has revolutionised text classification, enabling more accurate, nuanced, and context-aware solutions [Nguyen P. Test classification in the age of LLMs. 2024. Available from: https://blog.redsift.com/author/phong/ ]. An automated solution that enables efficient, flexible, and targeted PD classification is currently lacking.

METHODS

We developed a novel approach that uses a large language model, Meta Llama 2 [Meta. Llama 2: Open source, free for research and commercial use. 2023. Available from: https://www.llama.com/llama2/ ], with a tailored prompt to classify free-text PDs from Roche's PD management system. The model outputs were analysed to identify trends and assess risks across clinical programs, supporting human decision-making. This method offers a generalisable framework for developing prompts and integrating data to address similar challenges in clinical development.
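As a rough illustration of the prompt-based workflow described in METHODS, the sketch below builds a classification prompt for one free-text PD, parses the model's answer into a fixed label, and counts flagged PDs per clinical program. It is not the authors' prompt or pipeline: the label names, the prompt wording, the record fields, and the `fake_generate` stand-in (a keyword rule used in place of a real Llama 2 call) are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical label set; the abstract does not list the actual categories.
LABELS = ("AFFECTS_PROGRESSION_ASSESSMENT", "OTHER")

# Hypothetical prompt wording, not the authors' tailored prompt.
PROMPT_TEMPLATE = (
    "You are reviewing clinical trial protocol deviations (PDs).\n"
    "Classify the PD below with exactly one label from: {labels}.\n"
    "Answer with the label only.\n\n"
    "PD description: {text}\n"
    "Label:"
)


def build_prompt(pd_text: str) -> str:
    """Insert one free-text PD description into the prompt template."""
    return PROMPT_TEMPLATE.format(labels=", ".join(LABELS), text=pd_text.strip())


def parse_label(raw_output: str) -> str:
    """Map raw model text to a known label, or 'UNPARSED' if none matches."""
    cleaned = raw_output.strip().upper()
    for label in LABELS:
        if label in cleaned:
            return label
    return "UNPARSED"


def classify(pd_text: str, generate) -> str:
    """Classify one PD; `generate` is any callable mapping prompt -> text."""
    return parse_label(generate(build_prompt(pd_text)))


def fake_generate(prompt: str) -> str:
    """Stand-in for a real LLM call: a keyword rule, for demonstration only."""
    description = prompt.split("PD description:", 1)[1].lower()
    return LABELS[0] if "scan" in description else LABELS[1]


# Toy records, invented for this example.
records = [
    {"program": "PROGRAM_A", "text": "Week 12 tumour assessment scan performed 10 days late."},
    {"program": "PROGRAM_A", "text": "Informed consent signed on an outdated form version."},
    {"program": "PROGRAM_B", "text": "CT scan missed at cycle 4."},
]

# Count flagged PDs per program to support trend review by human experts.
flag_counts = Counter(
    r["program"] for r in records if classify(r["text"], fake_generate) == LABELS[0]
)
print(flag_counts)
```

Passing the model call in as a plain callable keeps the prompt and parsing logic testable without a model, and lets a real inference client be swapped in later. Outputs that match no label are kept as `UNPARSED` rather than silently assigned, so they can be routed to manual review.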

RESULTS

This approach flagged over 80% of PDs potentially affecting disease progression assessment, enabling expert review. Compared to months of manual analysis, this automated method produced actionable insights in minutes. The solution also highlighted gaps in first-line controls, supporting process improvement and better accuracy in disease progression handling during trials.
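One way to read a figure such as "over 80% of PDs flagged" is as the share of expert-identified relevant PDs that the model also flagged. The abstract does not state how the figure was computed, so the sketch below is only an assumed interpretation; the function name and the PD identifiers are invented for illustration.

```python
def flagged_share(expert_relevant_ids, model_flagged_ids) -> float:
    """Fraction of expert-identified relevant PDs that the model also flagged."""
    relevant = set(expert_relevant_ids)
    if not relevant:
        raise ValueError("expert_relevant_ids must not be empty")
    return len(relevant & set(model_flagged_ids)) / len(relevant)


# Toy identifiers, invented for this example.
expert_relevant = ["PD-01", "PD-02", "PD-03", "PD-04", "PD-05"]
model_flagged = ["PD-01", "PD-02", "PD-04", "PD-05", "PD-09"]

share = flagged_share(expert_relevant, model_flagged)
print(f"{share:.0%} of expert-identified PDs were flagged")
```

PDs flagged by the model but not marked by experts (here `PD-09`) do not affect this ratio; they would need a separate precision-style check.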

Similar articles

Eliciting adverse effects data from participants in clinical trials.

Cochrane Database Syst Rev. 2018 Jan 16;1(1):MR000039. doi: 10.1002/14651858.MR000039.pub2.

Antidepressants for pain management in adults with chronic pain: a network meta-analysis.

Health Technol Assess. 2024 Oct;28(62):1-155. doi: 10.3310/MKRT2948.

Impact of residual disease as a prognostic factor for survival in women with advanced epithelial ovarian cancer after primary surgery.

Cochrane Database Syst Rev. 2022 Sep 26;9(9):CD015048. doi: 10.1002/14651858.CD015048.pub2.

Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.

Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.

Magnetic resonance perfusion for differentiating low-grade from high-grade gliomas at first presentation.

Cochrane Database Syst Rev. 2018 Jan 22;1(1):CD011551. doi: 10.1002/14651858.CD011551.pub2.

Interventions for promoting habitual exercise in people living with and beyond cancer.

Cochrane Database Syst Rev. 2018 Sep 19;9(9):CD010192. doi: 10.1002/14651858.CD010192.pub3.

Cost-effectiveness of using prognostic information to select women with breast cancer for adjuvant systemic therapy.

Health Technol Assess. 2006 Sep;10(34):iii-iv, ix-xi, 1-204. doi: 10.3310/hta10340.

Lenvatinib plus pembrolizumab for untreated advanced renal cell carcinoma: a systematic review and cost-effectiveness analysis.

Health Technol Assess. 2024 Aug;28(49):1-190. doi: 10.3310/TRRM4238.

Enhancing Pulmonary Disease Prediction Using Large Language Models With Feature Summarization and Hybrid Retrieval-Augmented Generation: Multicenter Methodological Study Based on Radiology Report.

J Med Internet Res. 2025 Jun 11;27:e72638. doi: 10.2196/72638.

References cited in this article

Protocol Deviations: A Holistic Approach from Defining to Reporting.

Ther Innov Regul Sci. 2021 Jul;55(4):733-742. doi: 10.1007/s43441-021-00269-w. Epub 2021 Mar 29.

Text Classification for Clinical Trial Operations: Evaluation and Comparison of Natural Language Processing Techniques.

Ther Innov Regul Sci. 2021 Mar;55(2):447-453. doi: 10.1007/s43441-020-00236-x. Epub 2020 Oct 30.

New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1).

Eur J Cancer. 2009 Jan;45(2):228-47. doi: 10.1016/j.ejca.2008.10.026.
