How Search Engine Data Enhance the Understanding of Determinants of Suicide in India and Inform Prevention: Observational Study.

Author information

Adler Natalia, Cattuto Ciro, Kalimeri Kyriaki, Paolotti Daniela, Tizzoni Michele, Verhulst Stefaan, Yom-Tov Elad, Young Andrew

Affiliations

United Nations International Children's Emergency Fund (UNICEF), New York, NY, United States.

ISI Foundation, Torino, Italy.

Publication information

J Med Internet Res. 2019 Jan 4;21(1):e10179. doi: 10.2196/10179.

Abstract

BACKGROUND

India is home to 20% of the world's suicide deaths. Although statistics regarding suicide in India are distressingly high, data and cultural issues likely contribute to a widespread underreporting of the problem. Social stigma and only recent decriminalization of suicide are among the factors hampering official agencies' collection and reporting of suicide rates.

OBJECTIVE

As the product of a data collaborative, this paper leverages private-sector search engine data toward gaining a fuller, more accurate picture of the suicide issue among young people in India. By combining official statistics on suicide with data generated through search queries, this paper seeks to: add an additional layer of information to more accurately represent the magnitude of the problem, determine whether search query data can serve as an effective proxy for factors contributing to suicide that are not represented in traditional datasets, and consider how data collaboratives built on search query data could inform future suicide prevention efforts in India and beyond.

METHODS

We combined official statistics on demographic information with data generated through search queries from Bing to gain insight into suicide rates per state in India as reported by the National Crime Records Bureau of India. We extracted English language queries on "suicide," "depression," "hanging," "pesticide," and "poison." We also collected data on demographic information at the state level in India, including urbanization, growth rate, sex ratio, internet penetration, and population. We modeled the suicide rate per state as a linear function of the queries on each of the 5 topics, treating each topic as an independent variable. A second model was built by adding the demographic information as further independent variables.
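
The two models described above can be sketched as ordinary least squares fits. This is a minimal illustration, not the authors' code: the function name `fit_linear`, the number of states, and all data values are hypothetical placeholders generated at random, and the abstract does not state which fitting routine was used.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns coefficients and R^2."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least squares solution
    pred = A @ coef
    ss_res = np.sum((y - pred) ** 2)               # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)           # total sum of squares
    return coef, 1.0 - ss_res / ss_tot

# Synthetic placeholder data (assumption): one row per state.
rng = np.random.default_rng(0)
n_states = 30
query_frac = rng.random((n_states, 5))    # fraction of queries in each of 5 topics
demographics = rng.random((n_states, 5))  # urbanization, growth rate, sex ratio,
                                          # internet penetration, population
suicide_rate = rng.random(n_states) * 20  # reported suicide rate per state

# Model 1: query topics only.
_, r2_queries = fit_linear(query_frac, suicide_rate)

# Model 2: query topics plus demographic variables.
X_both = np.column_stack([query_frac, demographics])
_, r2_both = fit_linear(X_both, suicide_rate)

print(f"R^2 (queries only): {r2_queries:.3f}")
print(f"R^2 (queries + demographics): {r2_both:.3f}")
```

Because the second model contains every variable of the first, its in-sample fit can never be worse, so comparing the two on real data would need to account for the extra variables.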

RESULTS

Results of the first model fit (R) when modeling the suicide rates from the fraction of queries in each of the 5 topics, as well as the fraction of all suicide methods, show a correlation of about 0.5. This increases significantly with the removal of 3 outliers and improves slightly when 5 outliers are removed. Results for the second model fit using both query and demographic data show that for all categories, if no outliers are removed, demographic data can model suicide rates better than query data. However, when 3 outliers are removed, query data about pesticides or poisons improves the model over using demographic data.
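
The effect of outlier removal on the fit can be illustrated with a refit after dropping a fixed number of states. The abstract does not say how outliers were identified, so the rule below (drop the `k` states with the largest absolute residuals from an initial fit) is an assumption, as are the function names and the synthetic data.

```python
import numpy as np

def fit_r2(X, y):
    """Least squares fit with intercept; returns R^2 and the residuals."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return r2, resid

def refit_without_outliers(X, y, k):
    """Drop the k rows with the largest absolute residuals, then refit.

    Assumed rule for illustration only; not taken from the paper.
    """
    _, resid = fit_r2(X, y)
    keep = np.sort(np.argsort(np.abs(resid))[: len(y) - k])  # indices retained
    r2, _ = fit_r2(X[keep], y[keep])
    return r2, keep

# Synthetic placeholder data (assumption): 30 states, 5 query topics.
rng = np.random.default_rng(1)
X = rng.random((30, 5))
y = X @ rng.random(5) + rng.normal(scale=0.1, size=30)
y[[3, 11, 27]] += 5.0  # inject three artificial outlier states

r2_all, _ = fit_r2(X, y)
r2_trim3, kept3 = refit_without_outliers(X, y, k=3)
r2_trim5, kept5 = refit_without_outliers(X, y, k=5)
print(f"R^2 all states: {r2_all:.3f}")
print(f"R^2 after removing 3: {r2_trim3:.3f}")
print(f"R^2 after removing 5: {r2_trim5:.3f}")
```

The indices missing from `kept3` or `kept5` identify the removed states, which is the kind of information the outlier-rejection step is used for in the conclusions.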

CONCLUSIONS

In this work, we used search data and demographics to model suicide rates. In this way, search data serve as a proxy for unmeasured (hidden) factors corresponding to suicide rates. Moreover, our procedure for outlier rejection serves to single out states where the suicide rates have substantially different correlations with demographic factors and query rates.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f5e/6682304/18e4b86ec492/jmir_v21i1e10179_fig1.jpg
