NAPS Fusion: A framework to overcome experimental data limitations to predict human performance and cognitive task outcomes.

Author information

Napoli Nicholas J, Stephens Chad L, Kennedy Kellie D, Barnes Laura E, Juarez Garcia Ezequiel, Harrivel Angela R

Affiliations

Human Informatics and Predictive Performance Optimization Laboratory, Electrical and Computer Engineering, University of Florida, Gainesville, FL, 32611, USA.

National Institute of Aerospace, Hampton, VA 23666, USA.

Publication information

Inf Fusion. 2023 Mar;91:15-30. doi: 10.1016/j.inffus.2022.09.016. Epub 2022 Sep 27.

Abstract

In human performance and cognitive research, machine learning (ML) problems become increasingly complex due to limitations in experimental design, which leads to poor predictive models. More specifically, experimental study designs produce very few data instances, have large class imbalances and conflicting ground truth labels, and generate wide data sets because of the diverse range of sensors. From an ML perspective, these problems are further exacerbated in anomaly detection cases, where class imbalances occur and there are almost always more features than samples. Typically, dimensionality reduction methods (e.g., PCA, autoencoders) are used to handle the issues that arise from wide data sets. However, these methods do not always map to a lower-dimensional space appropriately, and they capture noise or irrelevant information. In addition, when new sensor modalities are incorporated, the entire ML paradigm has to be remodeled because of the new dependencies introduced by the new information. Remodeling these ML paradigms is time-consuming and costly because the paradigm design lacks modularity. Furthermore, human performance research experiments at times create ambiguous class labels because subject-matter experts' annotations cannot agree on the ground truth, making the ML paradigm nearly impossible to model. This work draws on insights from Dempster-Shafer theory (DST), stacking of ML models, and bagging to address uncertainty and ignorance in multi-class ML problems caused by ambiguous ground truth, low sample counts, subject-to-subject variability, class imbalances, and wide data sets. Based on these insights, we propose a probabilistic model fusion approach, Naive Adaptive Probabilistic Sensor (NAPS), which combines ML paradigms built around bagging algorithms to overcome these experimental data concerns while maintaining a modular design for future sensors (new feature integration) and conflicting ground truth data. We demonstrate significant overall performance improvements using NAPS (an accuracy of 95.29%) in detecting human task errors (a four-class problem) caused by impaired cognitive states, and a negligible drop in performance in the case of ambiguous ground truth labels (an accuracy of 93.93%), when compared to other methodologies (an accuracy of 64.91%). This work potentially sets the foundation for other human-centric modeling systems that rely on human state prediction modeling.
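The abstract names Dempster-Shafer theory as one ingredient but does not spell out how evidence is combined. The sketch below illustrates only the textbook Dempster rule of combination for two sources over a four-class frame; it is not the NAPS algorithm itself. The class labels ("A" to "D"), the two "sensor" mass functions, and the function name `combine_dempster` are hypothetical and chosen purely for illustration. Mass placed on a set such as {A, B} stands for an ambiguous label, and mass on the full frame stands for ignorance.

```python
from itertools import product

# Hypothetical four-class frame of discernment (labels are illustrative only).
FRAME = frozenset({"A", "B", "C", "D"})


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of class labels (a focal element)
    to a non-negative mass, and the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Renormalise by 1 - K so the fused masses sum to 1 again.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Assumed evidence from two hypothetical sensor-specific models.
sensor_1 = {
    frozenset({"A"}): 0.6,
    frozenset({"A", "B"}): 0.3,  # ambiguous label: "A or B"
    FRAME: 0.1,                  # ignorance: mass on the whole frame
}
sensor_2 = {
    frozenset({"A"}): 0.5,
    frozenset({"B"}): 0.2,
    FRAME: 0.3,
}

fused = combine_dempster(sensor_1, sensor_2)
for focal, mass in sorted(fused.items(), key=lambda item: -item[1]):
    print(sorted(focal), round(mass, 4))
```

Because the rule is associative and commutative, evidence from a further source can be folded in by calling `combine_dempster` on the fused result, without recomputing the earlier combination. This loosely mirrors the modularity the abstract argues for.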

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ca7/10266717/ec1533a66bf8/nihms-1849972-f0001.jpg
