Consoli Sergio, Coletti Pietro, Markov Peter V, Orfei Lia, Biazzo Indaco, Schuh Lea, Stefanovitch Nicolas, Bertolini Lorenzo, Ceresa Mario, Stilianakis Nikolaos I
European Commission, Joint Research Centre (JRC), Ispra, Italy.
Université catholique de Louvain, Institute of Health and Society (IRSS), Brussels, Belgium.
Sci Data. 2025 Jun 10;12(1):970. doi: 10.1038/s41597-025-05276-2.
The rapid evolution of artificial intelligence (AI), together with the increased availability of social media and news for epidemiological surveillance, marks a pivotal moment in epidemiology and public health research. Harnessing the capabilities of generative AI, we use an ensemble approach that incorporates multiple Large Language Models (LLMs) to extract useful epidemiological information for analysis from the World Health Organization (WHO) Disease Outbreak News (DONs). The DONs are a collection of regular reports, curated by the WHO, on global outbreaks and the decision-making processes adopted to respond to them. The extracted information is made available in a knowledge graph, referred to as eKG, designed to provide a nuanced representation of public health domain knowledge. We provide an overview of this new dataset and describe the structure of eKG, along with the services and tools we are building on top of it to access and use the data. These data resources open new opportunities for epidemiological research and for the analysis and surveillance of disease outbreaks.