Interactive Machine Learning by Visualization: A Small Data Solution.

Authors

Li Huang, Fang Shiaofen, Mukhopadhyay Snehasis, Saykin Andrew J, Shen Li

Affiliations

Department of Computer & Information Science, Indiana University Purdue University Indianapolis.

Indiana University School of Medicine.

Publication information

Proc IEEE Int Conf Big Data. 2018 Dec;2018:3513-3521. doi: 10.1109/BigData.2018.8621952. Epub 2019 Jan 24.

DOI: 10.1109/BigData.2018.8621952
PMID: 31061990
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6499624/

Abstract

Machine learning algorithms and traditional data mining processes usually require a large volume of data to train algorithm-specific models, with little or no user feedback during the model building process. Such a "big data"-based automatic learning strategy is sometimes unrealistic for applications where data collection or processing is very expensive or difficult, such as clinical trials. Furthermore, expert knowledge can be very valuable in the model building process in fields such as the biomedical sciences. In this paper, we propose a new visual analytics approach to interactive machine learning and visual data mining. In this approach, multi-dimensional data visualization techniques are employed to facilitate user interaction with the machine learning and mining process. This allows dynamic user feedback in different forms, such as data selection, data labeling, and data correction, to enhance the efficiency of model building. In particular, this approach can significantly reduce the amount of data required to train an accurate model, and can therefore be highly impactful for applications where a large amount of data is hard to obtain. The proposed approach is tested on two application problems: a handwriting recognition (classification) problem and a human cognitive score prediction (regression) problem. Both experiments show that visualization-supported interactive machine learning and data mining can achieve the same accuracy as an automatic process with much smaller training data sets.
