Interactive Machine Learning by Visualization: A Small Data Solution.

Authors

Li Huang, Fang Shiaofen, Mukhopadhyay Snehasis, Saykin Andrew J, Shen Li

Affiliations

Department of Computer & Information Science, Indiana University Purdue University Indianapolis.

Indiana University School of Medicine.

Publication information

Proc IEEE Int Conf Big Data. 2018 Dec;2018:3513-3521. doi: 10.1109/BigData.2018.8621952. Epub 2019 Jan 24.

Abstract

Machine learning algorithms and traditional data mining processes usually require a large volume of data to train algorithm-specific models, with little or no user feedback during model building. Such a "big data" based automatic learning strategy is sometimes unrealistic for applications where data collection or processing is very expensive or difficult, such as clinical trials. Furthermore, expert knowledge can be very valuable in the model building process in fields such as the biomedical sciences. In this paper, we propose a new visual analytics approach to interactive machine learning and visual data mining. In this approach, multi-dimensional data visualization techniques are employed to facilitate user interaction with the machine learning and mining process. This allows dynamic user feedback in different forms, such as data selection, data labeling, and data correction, to enhance the efficiency of model building. In particular, this approach can significantly reduce the amount of data required to train an accurate model, and can therefore be highly impactful for applications where large amounts of data are hard to obtain. The proposed approach is tested on two application problems: handwriting recognition (a classification problem) and human cognitive score prediction (a regression problem). Both experiments show that visualization-supported interactive machine learning and data mining can match the accuracy of an automatic process while using much smaller training data sets.
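
The feedback loop described in the abstract (the user selects and labels data while the model is being built) can be illustrated with a small sketch. This is not the authors' system: the visualization step is replaced by a simple ambiguity score, the "user" is simulated by a lookup into known labels, the model is a nearest-centroid classifier, and the data are synthetic. All function names (`make_points`, `interactive_training`, `ask_user`, and so on) are hypothetical and chosen only for illustration.

```python
import math
import random


def make_points(n_per_class, rng):
    """Two well-separated synthetic 2-D clusters (toy data, not from the paper)."""
    points = []
    for label, center in enumerate([(0.0, 0.0), (6.0, 6.0)]):
        for _ in range(n_per_class):
            x = center[0] + rng.uniform(-1.0, 1.0)
            y = center[1] + rng.uniform(-1.0, 1.0)
            points.append(((x, y), label))
    return points


def centroids(labeled):
    """Mean position of the labeled points in each class."""
    sums = {}
    for (x, y), label in labeled:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(point, cents):
    """Assign the class of the nearest centroid."""
    return min(cents, key=lambda label: math.dist(point, cents[label]))


def ambiguity(point, cents):
    """Gap between the two smallest centroid distances; small means uncertain."""
    d = sorted(math.dist(point, c) for c in cents.values())
    return d[1] - d[0]


def interactive_training(pool, ask_user, seed_indices, rounds):
    """Human-in-the-loop sketch (an assumption, not the paper's algorithm):
    each round, the most ambiguous unlabeled point is presented to the user,
    who supplies its label, and the model is refit on all labels so far."""
    labeled = {i: ask_user(i) for i in seed_indices}
    for _ in range(rounds):
        cents = centroids([(pool[i], y) for i, y in labeled.items()])
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        pick = min(unlabeled, key=lambda i: ambiguity(pool[i], cents))
        labeled[pick] = ask_user(pick)  # stand-in for labeling in a visual interface
    return labeled, centroids([(pool[i], y) for i, y in labeled.items()])


# Usage: 40 points in the pool, but only 8 are ever labeled by the simulated user.
rng = random.Random(0)
data = make_points(20, rng)
pool = [p for p, _ in data]
truth = [y for _, y in data]
labeled, cents = interactive_training(
    pool, lambda i: truth[i], seed_indices=[0, 20], rounds=6
)
accuracy = sum(predict(p, c_y) == y for p, y, c_y in
               ((p, y, cents) for p, y in zip(pool, truth))) / len(pool)
```

On this toy data the clusters are far enough apart that a handful of labels suffices. The sketch only shows the shape of a select-label-refit loop and says nothing about the accuracy or data savings reported in the paper.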
