Department of Bioengineering, Stanford University, Palo Alto, CA, 94305, USA.
Pac Symp Biocomput. 2021;26:14-25.
Crowd-powered telemedicine has the potential to revolutionize healthcare, especially during times that require remote access to care. However, sharing private health data with strangers from around the world is incompatible with data privacy standards, so a stringent filtering process is needed to recruit reliable and trustworthy workers who can complete the proper training and security steps. The key challenge, then, is to identify capable, trustworthy, and reliable workers through high-fidelity evaluation tasks without exposing any sensitive patient data during the evaluation process. We contribute a set of experimentally validated metrics for assessing the trustworthiness and reliability of crowd workers tasked with providing behavioral feature tags for unstructured videos of children with autism and matched neurotypical controls. The workers are blinded both to diagnosis and to the goal of using the features to diagnose autism. These behavioral labels are fed as categorical feature vectors into a previously validated binary logistic regression classifier for detecting autism cases. Although the metrics do not incorporate any ground truth labels of child diagnosis, linear regression using the three correlative metrics as input can predict each worker's mean probability of the correct class with a mean absolute error of 7.51% for performance on the same set of videos and 10.93% for performance on a distinct, balanced video set with different children. These results indicate that crowd workers can be recruited based largely on behavioral metrics from a crowdsourced task, enabling an affordable way to filter crowd workforces into a trustworthy and reliable diagnostic workforce.
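The evaluation described above can be illustrated with a minimal sketch. It is not the authors' pipeline. The abstract does not name the three metrics or give any data, so the feature columns, the synthetic values, the train/test split, and the helper names (`mean_prob_correct_class`, `fit_linear_regression`, `predict`, `mean_absolute_error`) are all assumptions made for illustration. The sketch shows how a worker's mean probability of the correct class could be computed from classifier outputs, and how a linear regression on three per-worker metrics could be scored by mean absolute error on held-out workers.

```python
import numpy as np

def mean_prob_correct_class(p_autism, is_autism):
    """Average, over videos, of the probability the classifier gave to the true class."""
    p_autism = np.asarray(p_autism, dtype=float)
    is_autism = np.asarray(is_autism, dtype=bool)
    return float(np.mean(np.where(is_autism, p_autism, 1.0 - p_autism)))

def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept term; returns [intercept, w1, w2, w3]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the fitted coefficients to a matrix of per-worker metrics."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef

def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between targets and predictions."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Synthetic stand-in data (assumption): 40 workers, 3 unnamed metrics in [0, 1].
rng = np.random.default_rng(0)
n_workers = 40
metrics = rng.uniform(0.0, 1.0, size=(n_workers, 3))

# Synthetic target (assumption): a noisy linear function of the metrics,
# standing in for each worker's mean probability of the correct class.
target = 0.4 + metrics @ np.array([0.3, 0.2, 0.1]) + rng.normal(0.0, 0.03, n_workers)
target = np.clip(target, 0.0, 1.0)

# Fit on the first 30 workers, evaluate on the remaining 10.
coef = fit_linear_regression(metrics[:30], target[:30])
mae = mean_absolute_error(target[30:], predict(coef, metrics[30:]))
print(f"held-out mean absolute error: {mae:.3f}")
```

Note that the diagnosis labels enter only through the regression target, never through the three input metrics, which mirrors the abstract's point that the metrics themselves use no ground truth diagnosis.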