Baroutsou Vasiliki, Cerqueira Gonzalez Pena Rodrigo, Schweighoffer Reka, Caiata-Zufferey Maria, Kim Sue, Hesse-Biber Sharlene, Ciorba Florina M, Lauer Gerhard, Katapodi Maria
Department of Clinical Research, University of Basel, Basel, Switzerland.
Center for Data Analytics, University of Basel, Basel, Switzerland.
JMIR Form Res. 2023 Jan 19;7:e38399. doi: 10.2196/38399.
In health care research, patient-reported opinions are a critical element of personalized medicine and contribute to optimal health care delivery. The importance of integrating natural language processing (NLP) methods to extract patient-reported opinions has been increasingly acknowledged in recent years. One form of NLP is sentiment analysis, which extracts and analyzes information by detecting the feelings (thoughts, emotions, attitudes, etc) behind words. Sentiment analysis has become particularly popular following the rise of digital interactions. However, NLP and sentiment analysis in the context of intrafamilial communication about genetic cancer risk remain unexplored. Due to privacy laws, intrafamilial communication is the main avenue for informing at-risk relatives about the pathogenic variant and the possibility of increased cancer risk.
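As a minimal illustration of how lexicon-based sentiment analysis can turn words into a number, the sketch below computes a net sentiment score for a short narrative. The word lists, the `net_sentiment` function, and the scoring formula (positive minus negative hits, divided by total hits) are assumptions made for illustration; they are not the lexicon or the scoring rule used in the study.

```python
import re

# Hypothetical toy lexicon for illustration only; NOT the study's handcrafted lexicon.
POSITIVE = {"support", "open", "relieved", "grateful", "talk"}
NEGATIVE = {"afraid", "fear", "worried", "secret", "avoid"}


def net_sentiment(text: str) -> float:
    """Return (positive hits - negative hits) / total hits, in [-1, 1].

    Returns 0.0 when the text contains no lexicon words.
    """
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


if __name__ == "__main__":
    narrative = "I was afraid, but my sister gave me support and we talk openly."
    print(net_sentiment(narrative))
```

Exact word matching is the simplest possible design; a real pipeline would typically add stemming or lemmatization, negation handling, and language-specific resources.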
The study examined the role of sentiment in predicting openness of intrafamilial communication about genetic cancer risk associated with hereditary breast and ovarian cancer (HBOC) syndrome.
We used narratives derived from 53 in-depth interviews with individuals from families that harbor pathogenic variants associated with HBOC: first, to quantify openness of communication about cancer risk, and second, to examine the role of sentiment in predicting openness of communication. The interviews were conducted between 2019 and 2021 in Switzerland and South Korea using the same interview guide. We used NLP to extract and quantify textual features to construct a handcrafted lexicon about interpersonal communication of genetic testing results and cancer risk associated with HBOC. Moreover, we examined the role of sentiment in predicting openness of communication using a stepwise linear regression model. To test model accuracy, we used a split-validation set. We measured the performance of the model on the training and testing sets using area under the curve, sensitivity, specificity, and root mean square error.
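The four performance measures named above can be sketched in a few lines. The example below is a minimal illustration on invented data, assuming the outcome has been dichotomized (1 = open communication, 0 = not open) and that the model outputs a score between 0 and 1; the function names, the 0.5 threshold, and the data are all hypothetical and are not taken from the study.

```python
import math


def auc(y_true, scores):
    """Area under the ROC curve: the fraction of (positive, negative)
    pairs in which the positive case has the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Classify as positive when score >= threshold (threshold is an assumption)."""
    pairs = list(zip(y_true, scores))
    tp = sum(y == 1 and s >= threshold for y, s in pairs)
    fn = sum(y == 1 and s < threshold for y, s in pairs)
    tn = sum(y == 0 and s < threshold for y, s in pairs)
    fp = sum(y == 0 and s >= threshold for y, s in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def rmse(y_true, scores):
    """Root mean square error between observed labels and predicted scores."""
    return math.sqrt(sum((y - s) ** 2 for y, s in zip(y_true, scores)) / len(y_true))


if __name__ == "__main__":
    # Invented held-out labels and model scores, for illustration only.
    y_test = [1, 0, 1, 1, 0, 0]
    y_score = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
    print("AUC:", auc(y_test, y_score))
    print("Sensitivity, specificity:", sensitivity_specificity(y_test, y_score))
    print("RMSE:", rmse(y_test, y_score))
```

In a split-validation design, the same functions would be applied once to the training portion and once to the held-out portion, so that a large gap between the two sets of values flags overfitting.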
Higher "openness of communication" scores were associated with a higher overall net sentiment score of the narrative, higher fear, being single, having nonacademic education, and higher informational support within the family. Our results demonstrate that NLP was highly effective in analyzing unstructured texts from individuals of different cultural and linguistic backgrounds and could also reliably predict a measure of "openness of communication" (area under the curve=0.72) in the context of genetic cancer risk associated with HBOC.
Our study showed that NLP can facilitate assessment of openness of communication in individuals carrying a pathogenic variant associated with HBOC. Findings provided promising evidence that various features from narratives such as sentiment and fear are important predictors of interpersonal communication and self-disclosure in this context. Our approach is promising and can be expanded in the field of personalized medicine and technology-mediated communication.