Hansen Lasse, Enevoldsen Kenneth, Bernstorff Martin, Perfalk Erik, Danielsen Andreas A, Nielbo Kristoffer L, Østergaard Søren D
Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark.
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark.
Acta Neuropsychiatr. 2023 Aug 25;37:e16. doi: 10.1017/neu.2023.46.
Natural language processing (NLP) methods hold promise for improving clinical prediction by utilising information otherwise hidden in the clinical notes of electronic health records. However, clinical practice, as well as the systems and databases in which clinical notes are recorded and stored, changes over time. As a consequence, the content of clinical notes may also change over time, which could degrade the performance of prediction models. Despite its importance, the stability of clinical notes over time has rarely been tested.
The lexical stability of clinical notes from the Psychiatric Services of the Central Denmark Region in the period from January 1, 2011, to November 22, 2021 (a total of 14,811,551 clinical notes describing 129,570 patients) was assessed by quantifying sentence length, readability, syntactic complexity and clinical content. Changepoint detection models were used to estimate potential changes in these metrics.
We find lexical stability of the clinical notes over time, with minor deviations during the COVID-19 pandemic. Out of 2988 data points, 17 possible changepoints (corresponding to 0.6%) were detected. The majority of these were related to the discontinuation of a specific note type.
We find lexical and syntactic stability of clinical notes from psychiatric services over time, which bodes well for the use of NLP for predictive modelling in clinical psychiatry.
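As a rough illustration of the approach described above (computing a text metric per note and then searching the resulting time series for changepoints), the following is a minimal sketch and not the authors' pipeline. The abstract does not name the changepoint model used, so the single mean-shift search below is an assumption chosen for simplicity. The function names `mean_sentence_length` and `best_mean_shift` and the `min_size` parameter are hypothetical, and splitting sentences on punctuation is a simplification of real sentence segmentation.

```python
import re
from statistics import mean


def mean_sentence_length(note: str) -> float:
    """Average number of whitespace-separated tokens per sentence.

    Sentences are split naively on '.', '!' and '?' (an assumption;
    a clinical NLP pipeline would use a proper sentence segmenter).
    """
    sentences = [s for s in re.split(r"[.!?]+", note) if s.strip()]
    if not sentences:
        return 0.0
    return mean(len(s.split()) for s in sentences)


def best_mean_shift(series, min_size=3):
    """Find the single split that most reduces squared error.

    Returns (index, gain), where `index` is the first position of the
    second segment and `gain` is the reduction in the sum of squared
    errors compared with fitting one mean to the whole series.
    Returns (None, 0.0) if no split improves the fit.
    """
    def sse(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs)

    total = sse(series)
    best_idx, best_gain = None, 0.0
    # Each segment must contain at least `min_size` points.
    for k in range(min_size, len(series) - min_size + 1):
        gain = total - (sse(series[:k]) + sse(series[k:]))
        if gain > best_gain:
            best_idx, best_gain = k, gain
    return best_idx, best_gain


if __name__ == "__main__":
    # Hypothetical monthly averages of sentence length (made-up values).
    monthly = [10.0] * 6 + [15.0] * 6
    print(best_mean_shift(monthly))
```

In practice a candidate changepoint would only be reported if its gain exceeded a penalty or threshold, which is what keeps the number of detected changepoints small relative to the number of data points.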