Incorporating Unstructured Patient Narratives and Health Insurance Claims Data in Pharmacovigilance: Natural Language Processing Analysis of Patient-Generated Texts About Systemic Lupus Erythematosus.
Affiliations
Real-World Data Science Department, Chugai Pharmaceutical Co Ltd, Tokyo, Japan.
Risk Communication Department, Chugai Pharmaceutical Co Ltd, Tokyo, Japan.
Publication information
JMIR Public Health Surveill. 2021 Jun 29;7(6):e29238. doi: 10.2196/29238.
BACKGROUND
Gaining insights from patients that cannot be obtained from health care databases has become an important topic in pharmacovigilance.
OBJECTIVE
Our objective was to demonstrate a use case, in which patient-generated data were incorporated in pharmacovigilance, to understand the epidemiology and burden of illness in Japanese patients with systemic lupus erythematosus.
METHODS
We used data on systemic lupus erythematosus, an autoimmune disease that substantially impairs quality of life, from 2 independent data sets. To understand the disease's epidemiology, we analyzed a Japanese health insurance claims database. To understand the disease's burden, we analyzed text data collected from Japanese disease blogs (tōbyōki) written by patients with systemic lupus erythematosus. Natural language processing was applied to these texts to identify frequent patient-level complaints, and term frequency-inverse document frequency was used to explore patient burden during treatment. We explored health-related quality of life based on patient descriptions.
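The term frequency-inverse document frequency (TF-IDF) step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `tf_idf`, the smoothed IDF variant, the grouping of text into "before" and "after" treatment documents, and the toy English tokens are all assumptions. Real tōbyōki text is Japanese and would first need word segmentation, which is assumed to have been done already.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF scores for pre-tokenized documents.

    docs: dict mapping a document name to a list of tokens.
    Returns a dict mapping each document name to {term: score}.
    Uses a smoothed IDF, ln((1 + N) / (1 + df)) + 1 (an assumed variant).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        }
    return scores


# Hypothetical toy tokens, invented purely for illustration.
docs = {
    "before_treatment": ["fever", "rash", "fatigue", "fever", "hospital"],
    "after_treatment": ["pain", "pain", "arthralgia", "fatigue", "hospital"],
}

scores = tf_idf(docs)
after = scores["after_treatment"]
top_after = max(after, key=after.get)
print(top_after)
```

Terms that are frequent in one period but absent from the other receive the highest weights, which is the property that lets such an analysis surface words that gain importance after treatment begins.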
RESULTS
We analyzed data from 4694 and 635 patients with systemic lupus erythematosus in the health insurance claims database and tōbyōki blogs, respectively. Based on health insurance claims data, the prevalence of systemic lupus erythematosus was 107.70 per 100,000 persons. Tōbyōki text data analysis showed that pain-related words (eg, pain, severe pain, arthralgia) became more important after starting treatment. We also found an increase in patients' references to mobility and self-care over time, which indicated increased attention to physical disability due to disease progression.
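A prevalence expressed per 100,000 persons is the number of identified cases divided by the population at risk, scaled by 100,000. The sketch below shows only that arithmetic. The function name `prevalence_per_100k` is hypothetical, and the abstract does not report the denominator population, so the inputs here are made-up round numbers rather than study data.

```python
def prevalence_per_100k(cases, population):
    """Return prevalence as cases per 100,000 persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to limit floating-point rounding.
    return cases * 100_000 / population


# Illustrative inputs only; not taken from the study.
example = prevalence_per_100k(50, 100_000)
print(example)
```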
CONCLUSIONS
A classical medical database represents only a part of a patient's entire treatment experience, and analysis using solely such a database cannot represent patient-level symptoms or patient concerns about treatments. This study showed that analysis of tōbyōki blogs can provide added information on patient-level details, advancing patient-centric pharmacovigilance.