Bosch Oriol J, Revilla Melanie
Department of Methodology, The London School of Economics and Political Science, London, UK.
Research and Expertise Centre for Survey Methodology (RECSM), Universitat Pompeu Fabra, Barcelona, Spain.
J R Stat Soc Ser A Stat Soc. 2022 Dec;185(Suppl 2):S408-S436. doi: 10.1111/rssa.12956. Epub 2022 Nov 6.
Metered data, also called web-tracking data, are generally collected from a sample of participants who willingly install or configure, on their devices, technologies that track the digital traces left when people go online (e.g., URLs visited). Since metered data allow online behaviours to be observed unobtrusively, they have been proposed as a useful tool for understanding what people do online and what impact this might have on online and offline phenomena. It is crucial, nevertheless, to understand their limitations. Although some research has explored the potential errors of metered data, a systematic categorisation and conceptualisation of these errors is missing. Inspired by the Total Survey Error framework, we present a Total Error framework for digital traces collected with Meters (TEM). The TEM framework (1) describes the data generation and analysis process for metered data and (2) documents the sources of bias and variance that may arise in each step of this process. Using a case study, we also show how the TEM can be applied in real life to identify, quantify and reduce metered data errors. Results suggest that metered data might indeed be affected by the error sources identified in our framework and, to some extent, biased. This framework can help improve the quality of stand-alone metered data research projects and foster understanding of how and when survey and metered data can be combined.