School of Nursing, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.
School of Nursing, Columbia University, New York, New York, USA.
Res Nurs Health. 2021 Dec;44(6):906-919. doi: 10.1002/nur.22190. Epub 2021 Oct 12.
Data-driven characterization of symptom clusters in chronic conditions is essential for shared cluster detection and physiological mechanism discovery. This study aims to computationally describe symptom documentation from electronic nursing notes and compare symptom clusters among patients diagnosed with four chronic conditions: chronic obstructive pulmonary disease (COPD), heart failure, type 2 diabetes mellitus, and cancer. Nursing notes (N = 504,395; 133,977 patients) were obtained for the 2016 calendar year from a single medical center. We used NimbleMiner, a natural language processing (NLP) application, to identify the presence of 56 symptoms. We calculated symptom documentation prevalence by note and by patient for the corpus. Then, we visually compared documentation for a subset of patients (N = 22,657) diagnosed with COPD (n = 3339), heart failure (n = 6587), diabetes (n = 12,139), and cancer (n = 7269) and conducted multiple correspondence analysis and hierarchical clustering to discover underlying groups of patients who have similar symptom profiles (i.e., symptom clusters) for each condition. As expected, pain was the most frequently documented symptom. All conditions had a group of patients characterized by no symptoms. Shared clusters included cardiovascular symptoms for heart failure and diabetes; pain and other symptoms for COPD, diabetes, and cancer; and a newly identified cognitive and neurological symptom cluster for heart failure, diabetes, and cancer. Cancer (gastrointestinal symptoms and fatigue) and COPD (mental health symptoms) each contained a unique cluster. In summary, we report both shared and distinct, as well as established and novel, symptom clusters across chronic conditions. Findings support the use of electronic health record-derived notes and NLP methods to study symptoms and symptom clusters to advance symptom science.
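The prevalence step described above (share of notes and share of patients in which each symptom is documented) can be illustrated with a minimal sketch. This is not the authors' code: the input layout, the function name `documentation_prevalence`, and the toy rows are all assumptions made for illustration, and the symptom flags are presumed to have already been produced by an NLP tool such as NimbleMiner. The multiple correspondence analysis and hierarchical clustering steps are not covered here.

```python
from collections import defaultdict

# Hypothetical toy input: (patient_id, note_id, symptoms flagged in that note).
# These rows are invented for illustration and are not study data.
notes = [
    ("p1", "n1", {"pain"}),
    ("p1", "n2", {"pain", "fatigue"}),
    ("p2", "n3", set()),
    ("p3", "n4", {"nausea"}),
]

def documentation_prevalence(notes):
    """Return {symptom: (share_of_notes, share_of_patients)}."""
    total_notes = len(notes)
    patients = {patient_id for patient_id, _, _ in notes}
    note_counts = defaultdict(int)      # symptom -> number of notes mentioning it
    patients_with = defaultdict(set)    # symptom -> patients with at least one mention
    for patient_id, _, symptoms in notes:
        for symptom in symptoms:
            note_counts[symptom] += 1
            patients_with[symptom].add(patient_id)
    return {
        symptom: (
            note_counts[symptom] / total_notes,
            len(patients_with[symptom]) / len(patients),
        )
        for symptom in note_counts
    }

prevalence = documentation_prevalence(notes)
for symptom, (by_note, by_patient) in sorted(prevalence.items()):
    print(f"{symptom}: {by_note:.2f} of notes, {by_patient:.2f} of patients")
```

The two denominators differ on purpose: a patient with many notes mentioning pain raises the note-level figure but counts only once at the patient level, which is why the two prevalences can diverge.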