Madrid-García Alfredo, Freites-Núñez Dalifer, Merino-Barbancho Beatriz, Pérez Sancristobal Inés, Rodríguez-Rodríguez Luis
Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria San Carlos, Prof. Martin Lagos s/n, Madrid 28040, Spain.
Ther Adv Musculoskelet Dis. 2024 Dec 23;16:1759720X241308037. doi: 10.1177/1759720X241308037. eCollection 2024.
Rheumatology has changed notably in recent decades. New drugs, including biologic agents and Janus kinase (JAK) inhibitors, have emerged. Concepts such as the window of opportunity, arthralgia suspicious for progression, and difficult-to-treat rheumatoid arthritis (RA) have appeared, and new management strategies such as treat-to-target have become popular. Statistical learning methods, gene therapy, telemedicine, and precision medicine are other advances that have gained relevance in the field. To better characterize the research landscape and advances in rheumatology, automatic and efficient approaches based on natural language processing (NLP) should be used.
The objective of this study is to use topic modeling (TM) techniques to uncover key topics and trends in rheumatology research conducted in the last 23 years.
Retrospective study.
This study analyzed 96,004 abstracts published between 2000 and December 31, 2023, in 34 specialized rheumatology journals and retrieved from PubMed. BERTopic, a novel TM approach that considers semantic relationships among words and their context, was used to uncover topics. Up to 30 different models were trained. Two of them were selected based on the number of topics, outliers, and topic coherence score, and their topics were manually labeled by two rheumatologists. Word clouds and hierarchical clustering visualizations were computed. Finally, hot and cold trends were identified using linear regression models.
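The final step, identifying hot and cold trends with linear regression, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each topic is summarized as its yearly share of abstracts, fits an ordinary least-squares line per topic, and labels the topic by the sign of the slope. The function names, the `min_abs_slope` threshold, the topic names, and the toy numbers are all hypothetical; the study may have used a different response variable or a significance test instead of a fixed threshold.

```python
# Illustrative sketch only: the names, threshold, and toy data are assumptions,
# not taken from the study.

def ols_slope(xs, ys):
    """Return the ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify_trends(yearly_share, min_abs_slope=0.0005):
    """Label each topic 'hot', 'cold', or 'stable' from its fitted slope.

    yearly_share maps a topic name to {year: share of abstracts that year}.
    min_abs_slope is a hypothetical cutoff below which a trend is ignored.
    """
    labels = {}
    for topic, series in yearly_share.items():
        years = sorted(series)
        slope = ols_slope(years, [series[y] for y in years])
        if slope > min_abs_slope:
            labels[topic] = ("hot", slope)
        elif slope < -min_abs_slope:
            labels[topic] = ("cold", slope)
        else:
            labels[topic] = ("stable", slope)
    return labels


# Made-up shares for three placeholder topics (not results from the study).
toy_shares = {
    "topic_a": {2020: 0.01, 2021: 0.02, 2022: 0.03, 2023: 0.04},
    "topic_b": {2020: 0.04, 2021: 0.03, 2022: 0.02, 2023: 0.01},
    "topic_c": {2020: 0.02, 2021: 0.02, 2022: 0.02, 2023: 0.02},
}

for name, (label, slope) in classify_trends(toy_shares).items():
    print(f"{name}: {label} (slope per year = {slope:+.4f})")
```

Regressing on shares rather than raw counts is a design choice in this sketch: it keeps the overall growth in the number of published abstracts from making every topic look like it is rising.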
The two selected models classified the abstracts into 45 and 47 topics, respectively. The most frequent topics were RA, systemic lupus erythematosus, and osteoarthritis. Expected topics such as COVID-19 and JAK inhibitors were identified after conducting dynamic TM. Topics such as spinal surgery and bone fractures have gained relevance in recent years, whereas antiphospholipid syndrome and septic arthritis have lost momentum.
Our study utilized advanced NLP techniques to analyze the rheumatology research landscape and identify key themes and emerging trends. The results highlight the dynamic and varied nature of rheumatology research, illustrating how interest in certain topics has shifted over time.