Comito Carmela, Forestiero Agostino, Macrì Davide, Metlichin Elisabetta, Giusti Gian Domenico, Ramacciati Nicola
Institute for High-Performance Computing and Networking, National Research Council, Italy.
Int J Med Inform. 2025 Nov;203:106002. doi: 10.1016/j.ijmedinf.2025.106002. Epub 2025 Jun 6.
BACKGROUND: Chronic pain is a pervasive healthcare challenge with profound implications for patient well-being, clinical decision-making, and resource allocation. Traditional detection methods often rely on subjective assessments and manual documentation review, which can be time-consuming and inconsistent. Integrating Artificial Intelligence (AI) into healthcare offers a promising approach to enhance chronic pain management through automated and standardized text analysis. This study examines the use of AI in detecting chronic pain from Italian clinical notes. We apply machine learning (ML) and natural language processing (NLP) techniques to better understand how chronic pain is documented, thereby enabling efficient, data-driven solutions in nursing and medical practice.
METHODS & MATERIALS: We trained XGBoost, Gradient Boosting (GBM), and BERT-based models (BioBit, bert-base-italian-xxl) on 1,008 annotated Italian clinical notes. For the tree-based models, input texts were encoded using TF-IDF, Word2Vec, or FastText; for the transformer models, they were tokenized. Although the models were trained on full notes, they were evaluated on fragmented text to simulate realistic usage. Hyperparameter tuning and performance estimation used Bayesian optimization and stratified cross-validation over 30 trials.
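As a rough illustration of the TF-IDF encoding and stratified splitting described above, the following standard-library sketch may help. It is not the study's pipeline: the tokenizer, the smoothed IDF formula, the fold-assignment scheme, the function names, and the four example notes are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (sorted vocabulary, list of sparse {term: weight} dicts).

    Assumed weighting: tf * (ln((1 + N) / (1 + df)) + 1), followed by
    L2 normalization. This is one common smoothed variant, not
    necessarily the one used in the study.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of notes containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        weights = {t: count * idf[t] for t, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return sorted(idf), vectors


def stratified_folds(labels, k):
    """Assign sample indices to k folds, dealing each class round-robin
    so that every fold keeps roughly the overall class ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Made-up example notes (not taken from the study's dataset).
notes = [
    "dolore cronico lombare da mesi",
    "dolore cronico al ginocchio",
    "frattura del polso dopo caduta",
    "febbre e tosse da due giorni",
]
labels = [1, 1, 0, 0]  # 1 = chronic pain, 0 = non-chronic pain

vocab, vectors = tfidf_vectors(notes)
folds = stratified_folds(labels, k=2)
```

In this sketch, terms that occur in fewer notes receive a higher weight, which is why short, keyword-rich notes can still yield informative feature vectors for a tree-based classifier.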
RESULTS: XGBoost with TF-IDF features yielded the best performance, reaching an F1-score of 0.92 ± 0.01, with precision of 94%, sensitivity of 91%, and specificity of 93%. Chronic pain notes contained fewer total words (73.91 vs. 119.86, p = 0.0021) and fewer unique words (57.27 vs. 92.78, p = 0.0006) than non-chronic pain notes, underscoring the significance of concise, keyword-rich clinical documentation.
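For readers less familiar with the reported metrics, the sketch below shows how precision, sensitivity, specificity, and F1-score are derived from a binary confusion matrix. The counts are hypothetical and chosen only for illustration; they are not the study's data and do not reproduce its reported values. The function name is likewise an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one predicted
    positive, one actual positive, and one actual negative).
    """
    precision = tp / (tp + fp)      # predicted positives that are correct
    sensitivity = tp / (tp + fn)    # actual positives that are detected (recall)
    specificity = tn / (tn + fp)    # actual negatives that are correctly rejected
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=95, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

F1 is the harmonic mean of precision and sensitivity, so it stays high only when both are high, which makes it a reasonable headline metric when the two classes are not perfectly balanced.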
CONCLUSIONS: Our findings demonstrate the effectiveness of AI in identifying chronic pain cases from fragmentary clinical notes. By focusing on concise, keyword-oriented text, this work establishes a solid baseline for domain-specific NLP approaches in healthcare. The proposed method reduces the burden of manual review, facilitates real-time decision support, and may help standardize chronic pain assessment. In future work, we plan to explore new embedding techniques designed specifically for short, context-limited clinical notes, where dynamic contextual models (e.g., BERT) often struggle because extended textual context is lacking.