Oak Ridge Associated Universities (ORAU), Division of Injury Prevention, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
Division of Injury Prevention, Centers for Disease Control and Prevention, Atlanta, Georgia, USA.
Inj Prev. 2022 Feb;28(1):74-80. doi: 10.1136/injuryprev-2021-044322. Epub 2021 Aug 19.
The purpose of this research is to identify how data science is applied in suicide prevention literature, describe the current landscape of this literature and highlight areas where data science may be useful for future injury prevention research.
We conducted a literature review of injury prevention and data science in April 2020 and January 2021 in three databases.
For the 99 included articles, we extracted the following: (1) author(s) and year; (2) title; (3) study approach; (4) reason for applying data science method; (5) data science method type; (6) study description; (7) data source; and (8) focus on a disproportionately affected population.
The literature on data science and suicide more than doubled from 2019 to 2020, and articles taking individual-level approaches were more prevalent than those taking population-level approaches. Most population-level articles applied data science methods to describe outcomes (n=10), while most individual-level articles identified risk factors (n=27). Machine learning was the most common data science method applied in the studies (n=48). A wide array of data sources was used for suicide research, with most articles (n=45) using social media and web-based behaviour data. Eleven studies demonstrated the value of applying data science to suicide prevention literature for disproportionately affected groups.
Data science techniques proved to be effective tools in describing suicidal thoughts or behaviour, identifying individual risk factors and predicting outcomes. Future research should focus on identifying how data science can be applied in other injury-related topics.