Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA.
Department of Health Promotion, Education, and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA.
Int J Med Inform. 2021 Oct;154:104558. doi: 10.1016/j.ijmedinf.2021.104558. Epub 2021 Aug 18.
BACKGROUND: The rapid growth of inherently complex and heterogeneous data in HIV/AIDS research underscores the importance of Big Data Science. Big Data techniques have recently seen increasing uptake in the basic, clinical, and public health fields of HIV/AIDS research. However, no study has systematically described the evolving applications of Big Data in HIV/AIDS research. We sought to explore the emergence and evolution of Big Data Science in HIV/AIDS-related publications funded by US federal agencies.
METHODS: We identified HIV/AIDS and Big Data related publications funded by seven federal agencies from 2000 to 2019 by integrating data from National Institutes of Health (NIH) ExPORTER, MEDLINE, and MeSH. Building on bibliometrics and Natural Language Processing (NLP) methods, we constructed co-occurrence networks from the bibliographic metadata (e.g., countries, institutes, MeSH terms, and keywords) of the retrieved publications. We then detected clusters within the networks and tracked their temporal dynamics, followed by expert evaluation and consideration of clinical implications.
RESULTS: We retrieved nearly 600,000 publications related to HIV/AIDS, of which 19,528 relating to Big Data were included in the bibliometric analysis. Results showed that (1) the number of Big Data publications has been increasing since 2000; (2) US institutes have collaborated closely with China, Canada, and Germany; (3) some institutes (e.g., University of California system, MD Anderson Cancer Center, and Harvard Medical School) are among the most productive and were early adopters of Big Data in HIV/AIDS research; (4) Big Data research was not active in public health disciplines until 2015; and (5) research topics such as genomics, HIV comorbidities, population-based studies, Electronic Health Records (EHR), social media, and precision medicine, and methodologies such as machine learning, Deep Learning, radiomics, and data mining, have emerged rapidly in recent years.
CONCLUSIONS: We identified rapid growth in cross-disciplinary research on HIV/AIDS and Big Data over the past two decades. Our findings reveal patterns and trends in prevailing research topics and Big Data applications in HIV/AIDS research, and point to a number of fast-evolving areas of Big Data Science in HIV/AIDS research, including secondary analysis of EHR, machine learning, Deep Learning, predictive analysis, and NLP.
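The abstract does not specify how the co-occurrence networks or clusters were computed. As a rough illustration of the general idea only, the following minimal Python sketch counts how often pairs of terms appear on the same publication and then groups terms by connected components. The toy records, the function names `cooccurrence_counts` and `components`, and the use of connected components as a stand-in for cluster detection are all assumptions made for this example, not the authors' method.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy records: each set holds the keywords/MeSH terms of one publication.
records = [
    {"HIV Infections", "Machine Learning", "Electronic Health Records"},
    {"HIV Infections", "Machine Learning"},
    {"HIV Infections", "Genomics"},
    {"Social Media", "Data Mining"},
]

def cooccurrence_counts(term_sets):
    """Count how many records each unordered pair of terms shares."""
    counts = Counter()
    for terms in term_sets:
        # Sorting gives every pair a canonical (a, b) order with a < b.
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts

def components(edges, min_weight=1):
    """Group terms into connected components over edges with weight >= min_weight.

    This is a deliberately crude stand-in for a real community detection step.
    """
    adj = defaultdict(set)
    for (a, b), weight in edges.items():
        if weight >= min_weight:
            adj[a].add(b)
            adj[b].add(a)
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adj[node] - group)
        seen |= group
        groups.append(group)
    return groups

edges = cooccurrence_counts(records)
clusters = components(edges, min_weight=1)
```

Raising `min_weight` keeps only term pairs that co-occur repeatedly, which is one simple way to suppress incidental links before grouping. A real analysis would more likely use a dedicated community detection algorithm on the weighted network.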