Text Classification for Clinical Trial Operations: Evaluation and Comparison of Natural Language Processing Techniques.

Affiliation

Janssen Research & Development, LLC, 1400 McKean Rd, Spring House, PA, 19477, USA.

Publication information

Ther Innov Regul Sci. 2021 Mar;55(2):447-453. doi: 10.1007/s43441-020-00236-x. Epub 2020 Oct 30.

DOI:10.1007/s43441-020-00236-x
PMID:33125616
Abstract

The ability to detect patterns and trends across protocol deviations (PDs) is key to ensure high data quality and sufficient oversight of patient safety. In clinical trial operations, some business processes and work instructions limit efficient protocol deviation trending because a majority of protocol deviations are left unclassified. When this occurs, it restricts clinical teams from determining systemic issues or signals in the data. The unstructured text in protocol deviation descriptions is an important component of trial operation knowledge. Natural language processing (NLP) can make protocol deviation descriptions more accessible and can support information extraction and trending analysis. This paper reviews how the natural language processing techniques of Term-Frequency Inverse-Document-Frequency (TF-IDF) combined with the supervised machine learning model of Support Vector Machines (SVM) and word embedding approaches such as word2vec can be used to categorize/label protocol deviations across multiple therapeutic areas. NLP is a key tool that will lead to more data driven decisions in clinical trial operations.
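
As a rough illustration of the TF-IDF step described in the abstract, the sketch below weights terms in a few protocol deviation descriptions and assigns a category to a new description. Everything in it is an assumption made for illustration: the example descriptions, the category names, and the helper functions are hypothetical, and a cosine-similarity nearest-neighbour lookup stands in for the Support Vector Machine named in the abstract so that the example runs with the Python standard library alone.

```python
import math
from collections import Counter

# Hypothetical labelled protocol deviation descriptions (not from the paper).
TRAIN = [
    ("visit performed outside protocol window", "visit schedule"),
    ("follow-up visit missed by subject", "visit schedule"),
    ("informed consent form signed after screening", "informed consent"),
    ("outdated informed consent version used", "informed consent"),
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()

def fit_idf(docs):
    """Inverse document frequency: log(N / number of docs containing the term)."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    n_docs = len(docs)
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}

def tfidf(text, idf):
    """Sparse TF-IDF vector as a dict; terms unseen during fitting are ignored."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(text, examples, idf):
    """Return the label of the most similar training example (1-nearest neighbour)."""
    vec = tfidf(text, idf)
    best = max(examples, key=lambda ex: cosine(vec, ex[0]))
    return best[1]

idf = fit_idf([text for text, _ in TRAIN])
examples = [(tfidf(text, idf), label) for text, label in TRAIN]

print(classify("subject missed the week 4 visit", examples, idf))
print(classify("consent signed late", examples, idf))
```

In practice the TF-IDF vectors would more likely be passed to a trained classifier such as an SVM, as the abstract describes. The nearest-neighbour step here only shows how the weighting makes rare, category-specific terms count for more than common ones.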

