Alirezaei Mitra, Nguyen Quynh C, Whitaker Ross, Tasdizen Tolga
Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, UT 84112, USA.
Department of Epidemiology and Biostatistics, University of Maryland School of Public Health, College Park, MD 20742, USA.
IEEE Access. 2023;11:73330-73339. doi: 10.1109/access.2023.3295777. Epub 2023 Jul 14.
This paper addresses the challenges of evaluating the impact of neighborhood environments on health outcomes. Google Street View (GSV) images provide a valuable tool for assessing neighborhood environments on a large scale. By annotating GSV images with labels indicating the presence or absence of specific neighborhood features, we can train classifiers that automatically analyze and evaluate the environment. However, labeling GSV images is a time-consuming and labor-intensive task. To overcome this challenge, we propose a multi-task classifier that improves training when supervised GSV data are limited. The multi-task classifier uses readily available, inexpensive online images collected from Flickr as a related classification task. The hypothesis is that a classifier trained on multiple related tasks is less likely to overfit to small amounts of training data and generalizes better to unseen data, which improves its overall performance. Here we show that, with the proposed learning paradigm, predicted labels for GSV test images are more accurate. Across different environment indicators, accuracy, F1 score, and balanced accuracy increase by up to 6% in the multi-task learning framework compared with its single-task learning counterpart. The more accurate labels predicted by the multi-task classifier in turn support a more reliable and precise regression analysis of the correlation between predicted built environment indicators and health outcomes. The R² values calculated for different health outcomes improve by up to 4% when the indicators detected with multi-task learning are used.
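For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, F1 score, and balanced accuracy can be computed for a single binary environment indicator. It is an illustration only, not the authors' evaluation code: the function name `binary_metrics` and the toy labels are hypothetical, and treating the reported score as the F1 score is an assumption made here.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, F1 and balanced accuracy for 0/1 labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Balanced accuracy averages the per-class recalls.
    balanced_accuracy = (recall + specificity) / 2

    return {"accuracy": accuracy, "f1": f1,
            "balanced_accuracy": balanced_accuracy}


# Toy, made-up labels: the feature is present in 2 of 10 images,
# and the classifier finds only one of them.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))
```

On these toy labels the accuracy is 0.9 while the balanced accuracy is only 0.75, which illustrates why reporting balanced accuracy alongside plain accuracy is informative when a neighborhood feature is rare.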