Using natural language processing to automatically classify written self-reported narratives by patients with migraine or cluster headache.

Affiliations

Department of Neurology, Ghent University Hospital, Corneel Heymanslaan 10, 9000, Ghent, Belgium.

Department of Basic and Applied Medical Sciences, Faculty of Medicine and Health Sciences, Ghent University, Corneel Heymanslaan 10, 9000, Ghent, Belgium.

Publication information

J Headache Pain. 2022 Sep 30;23(1):129. doi: 10.1186/s10194-022-01490-0.

Abstract

BACKGROUND

Headache medicine relies largely on detailed history taking, in which physicians analyse patients' descriptions of their headaches. Natural language processing (NLP) structures and processes linguistic data into quantifiable units. In this study, we apply these digital techniques to self-reported narratives by patients with headache disorders to explore the potential of analysing and automatically classifying human-generated text, and of information extraction, in clinical contexts.

METHODS

A prospective cross-sectional clinical trial collected self-reported narratives on headache disorders from participants with either migraine or cluster headache. NLP was applied for the analysis of lexical, semantic and thematic properties of the texts. Machine learning (ML) algorithms were applied to classify the descriptions of headache attacks from individual participants into their correct group (migraine versus cluster headache).
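The classification step described above can be illustrated with a minimal sketch. It is not the study's pipeline: it uses only the Python standard library, a whitespace tokenizer, and a multinomial naïve Bayes classifier with add-one smoothing, chosen here only because it is short to write from scratch. The training sentences are invented English placeholders, not patient data, and the function names (`tokenize`, `train_naive_bayes`, `predict`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would do much more.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior for `text`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Invented placeholder descriptions (NOT data from the study).
train_docs = [
    "stabbing pain behind the eye",
    "pain around one eye comes back every night",
    "throbbing headache with nausea",
    "headache after stress with nausea and light sensitivity",
]
train_labels = ["cluster", "cluster", "migraine", "migraine"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "sharp pain in the eye"))
print(predict(model, "headache and nausea"))
```

The same bag-of-words features could in principle be fed to logistic regression or a support vector machine, the algorithm families the abstract names alongside naïve Bayes.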

RESULTS

One hundred and twenty-one patients (81 participants with migraine and 40 participants with cluster headache) provided a self-reported narrative on their headache disorder. Lexical analysis of this text corpus yielded several specific key words per diagnostic group (cluster headache: Dutch (nl): "oog" | English (en): "eye", nl: "pijn" | en: "pain" and nl: "terug" | en: "back/to come back"; migraine: nl: "hoofdpijn" | en: "headache", nl: "stress" | en: "stress" and nl: "misselijkheid" | en: "nausea"). Thematic and sentiment analysis revealed largely negative sentiment in texts by patients with migraine and by patients with cluster headache. Logistic regression and support vector machine algorithms with different feature groups performed best for the classification of attack descriptions (F1-scores for detecting cluster headache between 0.82 and 0.86), outperforming naïve Bayes classifiers.
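For readers unfamiliar with the metric, the F1-score for detecting one class is the harmonic mean of precision and recall for that class. The sketch below computes it in plain Python. The label lists are invented placeholders, not results from the study, and `f1_score` is a hypothetical helper written for this illustration.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented placeholder labels (NOT results from the study).
y_true = ["cluster", "cluster", "cluster", "migraine", "migraine", "migraine"]
y_pred = ["cluster", "cluster", "migraine", "cluster", "migraine", "migraine"]

# Here tp = 2, fp = 1, fn = 1, so precision = recall = 2/3.
score = f1_score(y_true, y_pred, positive="cluster")
print(round(score, 3))
```

Because the two groups in the study differ in size (81 versus 40 participants), a per-class F1 is more informative than raw accuracy, which a classifier could inflate by favouring the larger group.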

CONCLUSIONS

NLP detects differences in lexical choices between patients with migraine and patients with cluster headache, and these differences are congruent with domain expert knowledge of the disorders. Our research shows that ML algorithms have the potential to classify patients' self-reported narratives of migraine or cluster headache with good performance. NLP can discern relevant linguistic aspects in narratives from patients with different headache disorders and is relevant to clinical information extraction. Future work can investigate whether larger datasets and neural NLP methods improve classification performance.

TRIAL REGISTRATION

This study was registered at ClinicalTrials.gov with ID NCT05377437.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/263b/9524092/5c15bfb78196/10194_2022_1490_Fig1_HTML.jpg
