Macarulla Rodriguez Andrea, Tiberius Christian, van Bree Roel, Geradts Zeno
Netherlands Forensic Institute, Den Haag, The Netherlands.
Delft University of Technology, Delft, The Netherlands.
Forensic Sci Res. 2018 Oct 23;3(3):240-255. doi: 10.1080/20961790.2018.1509187. eCollection 2018.
Google Location Timeline, once activated, tracks a device and saves its locations. This feature may become useful as a source of evidence in investigations, in which case the court would be interested in the reliability of the data. Each position is reported as a pair of coordinates and a radius, so the estimated area for the tracked device is enclosed by a circle. This research assesses the accuracy of the locations given by Google Location History Timeline, identifies which variables affect this accuracy, and takes the initial steps towards a linear multivariate model that could predict the actual error with respect to the true location from environmental variables. The potentially influential variables (configuration of mobile device connectivity, speed of movement and environment) were determined through a series of experiments in which the true position of the device was recorded with a reference Global Positioning System (GPS) device of a higher order of accuracy. Accuracy was assessed by measuring the distance between the position provided by Google and the true position, referred to below as the Google error. If this Google error is less than the radius provided, we define it as a hit. The configuration with the highest hit rate is the one in which the mobile device has GPS available, at 52%. 3G and 2G connections follow with 38% and 33%, respectively. The Wi-Fi connection has a hit rate of only 7%. Regarding the means of transport, when the connection is 2G or 3G, the worst results are for Still, with a hit rate of 9%, and the best for Car, with 57%. For the prediction model, the distances and angles from the position of the device to the three nearest cell towers, together with the categorical (non-numerical) variables Environment and means of transport, were taken as input variables in this initial study.
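The hit definition above can be illustrated with a minimal sketch. It assumes that positions are given as latitude and longitude in decimal degrees, that the radius is in metres, and that the great-circle (haversine) distance on a spherical Earth is an acceptable stand-in for the Google error; the abstract does not state how the distance was actually computed. The function names `haversine_m`, `is_hit` and `hit_rate` are hypothetical.

```python
import math

# Mean Earth radius in metres (spherical approximation, an assumption of this sketch).
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_hit(google_fix, true_pos):
    """Return True if the Google error is smaller than the reported radius.

    google_fix: (lat, lon, radius_m) as reported by the timeline.
    true_pos:   (lat, lon) from the reference receiver.
    """
    lat, lon, radius_m = google_fix
    google_error_m = haversine_m(lat, lon, true_pos[0], true_pos[1])
    return google_error_m < radius_m


def hit_rate(pairs):
    """Fraction of (google_fix, true_pos) pairs that are hits."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("hit_rate needs at least one observation")
    return sum(is_hit(g, t) for g, t in pairs) / len(pairs)
```

Over distances of tens to hundreds of metres the spherical approximation differs from an ellipsoidal computation by far less than typical reported radii, which is why the simpler formula is used here.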
To evaluate the usability of a model, a Model hit is defined as an actual observation that falls within the 95% confidence interval provided by the model. Of the models developed, the best performer was the one predicting the accuracy when the network used is 2G, with 76% Model hits. The second-best model (mobile network set to 3G) reached only 23%.
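The Model hit criterion can be sketched in the same spirit. The example below assumes that, for each observation, the model supplies the lower and upper bounds of its 95% interval in metres and that the bounds are treated as inclusive; neither the way the interval is computed nor the handling of the boundary is specified in the abstract. The names `is_model_hit` and `model_hit_rate` are hypothetical.

```python
def is_model_hit(observed_error_m, lower_m, upper_m):
    """Return True if the observed Google error lies inside the model's interval.

    The bounds are treated as inclusive, which is an assumption of this sketch.
    """
    if lower_m > upper_m:
        raise ValueError("lower bound must not exceed upper bound")
    return lower_m <= observed_error_m <= upper_m


def model_hit_rate(records):
    """Fraction of (observed_error_m, lower_m, upper_m) records that are Model hits."""
    records = list(records)
    if not records:
        raise ValueError("model_hit_rate needs at least one record")
    hits = sum(is_model_hit(obs, lo, hi) for obs, lo, hi in records)
    return hits / len(records)
```

Keeping the interval as an input, rather than deriving it inside the function, leaves the choice of regression method open.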