Northwestern University Feinberg School of Medicine, Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Chicago, IL, USA.
Gynecol Oncol. 2023 May;172:41-46. doi: 10.1016/j.ygyno.2023.03.001. Epub 2023 Mar 16.
Little research has examined thematic trends within medical research, although such work may provide insight into how a given field values certain topics. We assessed the feasibility of using a machine learning approach to determine the most common research themes published in Gynecologic Oncology over a thirty-year period and to subsequently evaluate how interest in these topics changed over time.
We retrieved the abstracts of all original research published in Gynecologic Oncology from 1990 to 2020 using PubMed. Abstract text was processed through a natural language processing algorithm and clustered into topical themes using latent Dirichlet allocation (LDA) prior to manual labeling. Topics were investigated for temporal trends.
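To make the clustering step concrete, the following is a minimal sketch of LDA topic modeling on a handful of invented toy abstracts. It is not the authors' pipeline: the abstract does not state which software, preprocessing, or inference method was used. The sketch assumes a simple tokenizer with a small hand-picked stopword list and fits the model with collapsed Gibbs sampling, one common inference scheme for LDA. The names `tokenize`, `fit_lda`, and `toy_abstracts`, and the hyperparameters `alpha` and `beta`, are all hypothetical.

```python
import random

# Hypothetical stopword list; a real pipeline would use a much larger one.
STOPWORDS = {"the", "of", "in", "and", "a", "for", "with", "to", "was", "were"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords (stand-in for the NLP step)."""
    words = "".join(c.lower() if c.isalpha() else " " for c in text).split()
    return [w for w in words if w not in STOPWORDS]


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (doc_topic, topic_word, vocab), where doc_topic[d][k] counts the
    tokens of document d assigned to topic k and topic_word[k][v] counts how
    often vocabulary word v is assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


# Invented toy abstracts, for illustration only (not data from the study).
toy_abstracts = [
    "Chemotherapy response in ovarian cancer with platinum and paclitaxel",
    "Platinum chemotherapy and survival in recurrent ovarian cancer",
    "Colposcopy and cytology screening for cervical dysplasia",
    "Cervical dysplasia screening with cytology and HPV testing",
]

docs = [tokenize(text) for text in toy_abstracts]
n_topics = 2  # the study selected 23 topics; 2 is enough for a toy corpus
doc_topic, topic_word, vocab = fit_lda(docs, n_topics)

# Dominant topic per abstract, and the highest-count words per topic.
dominant = [max(range(n_topics), key=row.__getitem__) for row in doc_topic]
for k in range(n_topics):
    top = sorted(range(len(vocab)), key=lambda v: -topic_word[k][v])[:4]
    print(f"topic {k}:", [vocab[v] for v in top])
print("dominant topic per abstract:", dominant)
```

The printed top words are what a reviewer would read in order to assign a human-readable label to each topic, which corresponds to the manual labeling step described above.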
We retrieved 12,586 original research articles, of which 11,217 were evaluable for subsequent analysis. Twenty-three research topics were selected at the completion of topic modeling. The topics of basic science genetics, epidemiologic methods, and chemotherapy experienced the greatest increase over the time period, while postoperative outcomes, reproductive age cancer management, and cervical dysplasia experienced the greatest decline. Interest in basic science research remained relatively constant. Topics were additionally reviewed for words indicative of either surgical or medical therapy. Both surgical and medical topics saw increasing interest, with surgical topics experiencing a greater increase and representing a higher proportion of published topics.
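One way to quantify a rise or decline in a topic is to compute its share of articles per year and fit a straight line to those shares. The sketch below illustrates that idea under stated assumptions; the abstract does not specify how trends were measured, so this should not be read as the authors' method. The functions `yearly_topic_share` and `slope` and the `toy_records` data are hypothetical, and the toy counts are invented for illustration.

```python
from collections import Counter, defaultdict


def yearly_topic_share(records):
    """Given a list of (year, topic_label) pairs, return {topic: {year: share}}."""
    per_year = Counter(year for year, _ in records)
    per_year_topic = Counter(records)
    topics = {topic for _, topic in records}
    shares = defaultdict(dict)
    for year in per_year:
        for topic in topics:
            shares[topic][year] = per_year_topic[(year, topic)] / per_year[year]
    return shares


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Invented toy records: one (year, dominant topic) pair per article.
toy_records = (
    [(1990, "chemotherapy")] * 1 + [(1990, "cervical dysplasia")] * 3
    + [(2005, "chemotherapy")] * 2 + [(2005, "cervical dysplasia")] * 2
    + [(2020, "chemotherapy")] * 3 + [(2020, "cervical dysplasia")] * 1
)

shares = yearly_topic_share(toy_records)
trends = {}
for topic, by_year in shares.items():
    years = sorted(by_year)
    trends[topic] = slope(years, [by_year[y] for y in years])

# A positive slope means the topic's share of articles grew over the period.
for topic, value in sorted(trends.items(), key=lambda item: -item[1]):
    print(f"{topic}: {value:+.4f} share per year")
```

Using shares rather than raw counts keeps a topic from appearing to grow merely because the journal published more articles overall in later years.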
Topic modeling, a type of unsupervised machine learning, was successfully used to identify trends in research themes. The application of this technique provided insight into how the field of gynecologic oncology values the components of its scope of practice and therefore how it may choose to allocate grant funding, disseminate research, and participate in the public discourse.