A synthetic dataset of different chart types for advancements in chart identification and visualization.

Authors

Bajić Filip, Habijan Marija, Nenadić Krešimir

Affiliations

University Computing Centre, University of Zagreb, 10000 Zagreb, Croatia.

Faculty of Electrical Engineering, Computer Science and Information Technology Osijek, 31000 Osijek, Croatia.

Publication information

Data Brief. 2024 Feb 21;53:110233. doi: 10.1016/j.dib.2024.110233. eCollection 2024 Apr.

DOI: 10.1016/j.dib.2024.110233
PMID: 38435728
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10907168/

Abstract

We introduce a curated synthetic chart dataset designed to advance algorithms for data visualization and interpretation. The dataset, intended for training and testing, covers a range of chart types, including but not limited to Area, Bar, Box, Donut, Line, Pie, and Scatter. Data collection relies on a fully automatic low-level algorithm focused on the extraction of graphical elements. For efficiency, the algorithm restricts its input images: they may not feature three-dimensional representations, incorporate any 3D effects, or include multiple charts in a single image. The dataset is divided into training and testing subsets, which are further subdivided by resolution and chart type. The dataset has substantial reuse potential. Researchers can use it to train and test deep models for chart classification and interpretation, improving the adaptability of their algorithms. It also establishes a benchmark for evaluating how systems handle diverse chart visualizations, allowing direct comparisons. Because it spans various chart types and resolutions, it provides a standardized platform for assessing and comparing the effectiveness of different systems in understanding and decomposing visualizations [1,2,3].
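
The split, resolution, and chart-type organization described in the abstract could be indexed along the following lines. This is a minimal sketch, not the authors' tooling: the directory layout `<root>/<split>/<resolution>/<chart_type>/*.png`, the folder names, the PNG extension, and the helper names `index_dataset` and `count_by` are all assumptions made for illustration, and the actual archive may be organized differently.

```python
from collections import Counter
from pathlib import Path


def index_dataset(root):
    """Collect (split, resolution, chart_type, path) records.

    Assumes a hypothetical layout:
    <root>/<split>/<resolution>/<chart_type>/<image>.png
    """
    records = []
    for path in sorted(Path(root).glob("*/*/*/*.png")):
        # The first three components below the root name the split,
        # the resolution folder, and the chart type, in that order.
        split, resolution, chart_type = path.relative_to(root).parts[:3]
        records.append((split, resolution, chart_type.lower(), path))
    return records


def count_by(records, split):
    """Count images per chart type within one split (e.g. 'train')."""
    return Counter(chart for s, _, chart, _ in records if s == split)
```

Calling `count_by(index_dataset(root), "train")` on a local copy would give a per-class image count, which is a quick way to check class balance before training a classifier on the data.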


Similar articles

1. A synthetic dataset of different chart types for advancements in chart identification and visualization. Data Brief. 2024 Feb 21;53:110233. doi: 10.1016/j.dib.2024.110233. eCollection 2024 Apr.
2. Data Extraction of Circular-Shaped and Grid-like Chart Images. J Imaging. 2022 May 12;8(5):136. doi: 10.3390/jimaging8050136.
3. NSTU-BDTAKA: An open dataset for Bangladeshi paper currency detection and recognition. Data Brief. 2024 Jul 3;55:110701. doi: 10.1016/j.dib.2024.110701. eCollection 2024 Aug.
4. A comprehensive dataset on two-dimensional noble metals: Theoretical insights into physical properties and metal-support interactions. Data Brief. 2023 Nov 10;51:109801. doi: 10.1016/j.dib.2023.109801. eCollection 2023 Dec.
5. Vehicle image datasets for image classification. Data Brief. 2024 Feb 1;53:110133. doi: 10.1016/j.dib.2024.110133. eCollection 2024 Apr.
6. Glanceable Visualization: Studies of Data Comparison Performance on Smartwatches. IEEE Trans Vis Comput Graph. 2018 Aug 21. doi: 10.1109/TVCG.2018.2865142.
7. A Real-World Approach on the Problem of Chart Recognition Using Classification, Detection and Perspective Correction. Sensors (Basel). 2020 Aug 5;20(16):4370. doi: 10.3390/s20164370.
8. Hierarchically Recognizing Vector Graphics and A New Chart-Based Vector Graphics Dataset. IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):7556-7573. doi: 10.1109/TPAMI.2024.3394298. Epub 2024 Nov 6.
9. Task-Based Effectiveness of Basic Visualizations. IEEE Trans Vis Comput Graph. 2019 Jul;25(7):2505-2512. doi: 10.1109/TVCG.2018.2829750. Epub 2018 May 4.
10. ChartKG: A Knowledge-Graph-Based Representation for Chart Images. IEEE Trans Vis Comput Graph. 2025 Sep;31(9):5854-5868. doi: 10.1109/TVCG.2024.3476508.

References

1. A Multi-Purpose Shallow Convolutional Neural Network for Chart Images. Sensors (Basel). 2022 Oct 11;22(20):7695. doi: 10.3390/s22207695.
2. Data Extraction of Circular-Shaped and Grid-like Chart Images. J Imaging. 2022 May 12;8(5):136. doi: 10.3390/jimaging8050136.
3. Chart Classification Using Siamese CNN. J Imaging. 2021 Oct 21;7(11):220. doi: 10.3390/jimaging7110220.