Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies.

Author information

Park Hyeoun-Ae, Jung Hyesil, On Jeongah, Park Seul Ki, Kang Hannah

Affiliation

College of Nursing, Seoul National University, Seoul, Korea.

Publication information

Healthc Inform Res. 2018 Oct;24(4):253-262. doi: 10.4258/hir.2018.24.4.253. Epub 2018 Oct 31.

Abstract

OBJECTIVES

We reviewed digital epidemiological studies to characterize how researchers are using digital data by topic domain, study purpose, data source, and analytic method.

METHODS

We reviewed research articles published within the last decade that used digital data to answer epidemiological research questions. Data were abstracted from these articles using a data collection tool that we developed. Finally, we summarized the characteristics of the digital epidemiological studies.

RESULTS

We identified six main topic domains: infectious diseases (58.7%), non-communicable diseases (29.4%), mental health and substance use (8.3%), general population behavior (4.6%), environmental, dietary, and lifestyle (4.6%), and vital status (0.9%). We identified four categories for the study purpose: description (22.9%), exploration (34.9%), explanation (27.5%), and prediction and control (14.7%). We identified eight categories for the data sources: web search query (52.3%), social media posts (31.2%), web portal posts (11.9%), webpage access logs (7.3%), images (7.3%), mobile phone network data (1.8%), global positioning system data (1.8%), and others (2.8%). By analytic method, 50.5% of the studies used correlation analyses, 41.3% regression analyses, 25.6% machine learning, and 19.3% descriptive analyses.
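As a reading aid that is not part of the original abstract, the reported percentages can be summed per grouping. The sketch below uses only the figures quoted above; the dictionary keys and the function name `category_totals` are illustrative choices. A total above 100% suggests that a single study could be counted in more than one category, which is an inference from the arithmetic rather than something the abstract states.

```python
# Percentages copied from the RESULTS paragraph of the abstract.
# Grouping labels are illustrative, not the authors' variable names.
reported = {
    "topic domain": [58.7, 29.4, 8.3, 4.6, 4.6, 0.9],
    "study purpose": [22.9, 34.9, 27.5, 14.7],
    "data source": [52.3, 31.2, 11.9, 7.3, 7.3, 1.8, 1.8, 2.8],
    "analytic method": [50.5, 41.3, 25.6, 19.3],
}


def category_totals(groups):
    """Sum the reported percentages per grouping, rounded to one decimal."""
    return {name: round(sum(values), 1) for name, values in groups.items()}


totals = category_totals(reported)
for name, total in totals.items():
    # A total above 100 hints at overlapping categories (an assumption).
    print(f"{name}: {total}%")
```

Only the study-purpose grouping sums to 100%; the other three exceed it, so their categories should probably not be read as mutually exclusive.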

CONCLUSIONS

Digital data collected for non-epidemiological purposes are being used to study health phenomena in a variety of topic domains. Digital epidemiology requires access to large datasets and advanced analytics. Ensuring open access is clearly at odds with the desire to have as little personal data as possible in these large datasets to protect privacy. Establishment of data cooperatives with restricted access may be a solution to this dilemma.
