Ofer Dan, Kaufman Hadasah, Linial Michal
Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel.
Heliyon. 2023 Dec 15;10(1):e23781. doi: 10.1016/j.heliyon.2023.e23781. eCollection 2024 Jan 15.
Scientific research trends and interests evolve over time. The ability to identify and forecast these trends is vital for educational institutions, practitioners, investors, and funding organizations. In this study, we predict future trends in scientific publications using heterogeneous sources, including historical publication time series from PubMed, research and review articles, pre-trained language models, and patents. We demonstrate that scientific topic popularity levels and changes (trends) can be predicted five years in advance across 40 years and 125 diverse topics, including life-science concepts, biomedicine, anatomy, and other science, technology, and engineering topics. Preceding publications and future patents are leading indicators for emerging scientific topics. We find that the ratio of reviews to original research articles is informative for identifying increasing or declining topics, with declining topics showing an excess of reviews. We also find that language models improve both insight into and prediction of temporal dynamics. In temporal validation, our models substantially outperform the historical baseline. Our findings suggest that similar dynamics apply across other scientific and engineering research topics. We present SciTrends, a user-friendly web tool for predicting future publication trends for any topic covered in PubMed.
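To make the review-to-research signal and the notion of a historical baseline concrete, the following is a minimal Python sketch. It is illustrative only: the abstract does not specify how the ratio is computed or which baseline is used, so the function names, the persistence-style baseline, and the yearly counts are all assumptions and are not taken from the study or from SciTrends.

```python
from typing import Dict, List


def review_to_research_ratio(
    reviews: Dict[int, int], research: Dict[int, int]
) -> Dict[int, float]:
    """Per-year ratio of review articles to original research articles.

    Years with no research articles are skipped to avoid dividing by zero.
    """
    return {
        year: reviews.get(year, 0) / count
        for year, count in sorted(research.items())
        if count > 0
    }


def persistence_forecast(history: List[float], horizon: int = 5) -> List[float]:
    """Naive baseline (an assumption): repeat the last observed value."""
    if not history:
        raise ValueError("history must contain at least one value")
    return [history[-1]] * horizon


# Made-up yearly counts for one hypothetical topic (not data from the study).
research_counts = {2015: 200, 2016: 180, 2017: 150, 2018: 120}
review_counts = {2015: 20, 2016: 27, 2017: 30, 2018: 36}

ratios = review_to_research_ratio(review_counts, research_counts)
baseline = persistence_forecast(list(research_counts.values()), horizon=5)
```

In this toy series, research output falls while the share of reviews rises each year, which is the pattern the abstract associates with declining topics. A learned model would be judged by how much it improves on a simple forecast such as `baseline` over the same five-year horizon.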