Mowery Jared, Andrei Amanda, Le Elizabeth, Jian Jing, Ward Megan
The MITRE Corporation.
Online J Public Health Inform. 2016 Dec 28;8(3):e201. doi: 10.5210/ojphi.v8i3.6906. eCollection 2016.
It is challenging to assess the quality of care and detect elder abuse in nursing homes, since patients may be incapable of reporting quality issues or abuse themselves, and resources for sending inspectors are limited.
This study correlates Google reviews of nursing homes with Centers for Medicare and Medicaid Services (CMS) inspection results in the Nursing Home Compare (NHC) data set, to quantify the extent to which the reviews reflect the quality of care and the presence of elder abuse.
A total of 16,160 reviews were collected, spanning 7,170 nursing homes. Two approaches were tested: using the average rating as an overall estimate of the quality of care at a nursing home, and using the average scores from a maximum entropy classifier trained to recognize indications of elder abuse.
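The first approach can be sketched as follows. This is a minimal illustration, not the study's implementation: the provider IDs, star ratings, and inspection scores are invented, and the helper names `average_rating_per_provider` and `pearson_r` are hypothetical. The same averaging would apply to per-review classifier scores in place of star ratings.

```python
from collections import defaultdict
from math import sqrt


def average_rating_per_provider(reviews):
    """Return the mean rating for each provider from (provider_id, rating) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # provider_id -> [sum of ratings, count]
    for provider_id, rating in reviews:
        totals[provider_id][0] += rating
        totals[provider_id][1] += 1
    return {pid: total / count for pid, (total, count) in totals.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical reviews: (provider_id, star rating). Not data from the study.
reviews = [("A", 5), ("A", 4), ("B", 1), ("B", 2), ("C", 3), ("D", 5)]
# Hypothetical inspection-based quality scores for the same providers.
inspection_scores = {"A": 5, "B": 1, "C": 3, "D": 4}

avg = average_rating_per_provider(reviews)
providers = sorted(avg)
r = pearson_r(
    [avg[p] for p in providers],
    [inspection_scores[p] for p in providers],
)
print(f"r = {r:.2f}")
```

With only one or two reviews per provider, as in this toy data, each average is a noisy estimate, which is consistent with the abstract's point that correlations strengthen as the effective number of reviews per provider grows.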
The classifier achieved an F-measure of 0.81, with precision 0.74 and recall 0.89. The correlation for the classifier is weak but statistically significant: r = 0.13, p < .001, 95% confidence interval (0.10, 0.16). The ratings exhibit a slightly higher correlation: r = 0.15, p < .001. Both the classifier and rating correlations approach approximately 0.65 when the effective average number of reviews per provider is increased by aggregating similar providers.
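As a quick arithmetic check, the reported F-measure is consistent with the reported precision and recall, assuming the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used. The function name `f_measure` is illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """General F-beta score; beta=1.0 gives the balanced F1 (an assumption here)."""
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Precision and recall as reported in the abstract.
f1 = f_measure(0.74, 0.89)
print(round(f1, 2))  # 0.81
```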
These results indicate that an analysis of Google reviews of nursing homes can be used to detect indications of elder abuse with high precision and to assess the quality of care, but only when a sufficient number of reviews are available.