Latif Ghazanfar, Mohammad Nazeeruddin, Alghazo Jaafar
Department of Computer Science, Prince Mohammad bin Fahd University, Al Khobar, Saudi Arabia.
Department of Computer Sciences and Mathematics, Université du Québec à Chicoutimi, 555 boulevard de l'Université, Québec, Canada.
Data Brief. 2023 Aug 28;50:109524. doi: 10.1016/j.dib.2023.109524. eCollection 2023 Oct.
DeepFruit is a dataset of fully labeled images covering 20 kinds of fruit, developed for research on fruit detection, recognition, and classification. Applications range from fruit recognition to calorie estimation and other innovative uses. The dataset gives researchers a basis for developing automatic systems that detect and recognize fruit in images using deep learning, computer vision, and machine learning algorithms. The main contribution is a very large, fully labeled image dataset that is publicly accessible and free of charge for all researchers. DeepFruit consists of 21,122 fruit images for 8 different fruit set combinations, and each image contains a different combination of four or five fruits. The images were captured on plates of different sizes, shapes, and colors, at varying angles, brightness levels, and distances. This variation can be cleaned up with preprocessing techniques such as noise removal and image centering. Preprocessing, including image rotation and cropping, scale normalization, and other steps, was applied to the dataset to make the images uniform. The dataset is randomly partitioned into an 80% training set (16,899 images) and a 20% testing set (4,223 images). The dataset and its labels are publicly accessible at: https://data.mendeley.com/datasets/5prc54r4rt.
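The random 80/20 partition can be illustrated with a minimal Python sketch. It is not the authors' procedure: the seed, the `random_split` helper, and the `img_00000`-style identifiers are hypothetical, and the abstract does not say how the split was drawn. Because rounding 80% of 21,122 gives 16,898 rather than the reported 16,899, the sketch takes the reported training count as an explicit input instead of deriving it from a fraction.

```python
import random

TOTAL_IMAGES = 21_122   # dataset size reported in the abstract
TRAIN_IMAGES = 16_899   # reported training-set size
TEST_IMAGES = 4_223     # reported testing-set size


def random_split(items, n_train, seed=0):
    """Shuffle a copy of `items` and return (train, test) lists.

    Hypothetical helper for illustration; `seed` makes the split repeatable.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical image identifiers; the real file names are not given here.
image_ids = [f"img_{i:05d}" for i in range(TOTAL_IMAGES)]
train_ids, test_ids = random_split(image_ids, TRAIN_IMAGES, seed=42)

# Share of the dataset that lands in each partition.
train_share = len(train_ids) / TOTAL_IMAGES
test_share = len(test_ids) / TOTAL_IMAGES
print(f"train: {len(train_ids)} ({train_share:.1%}), "
      f"test: {len(test_ids)} ({test_share:.1%})")
```

Fixing the seed keeps the partition reproducible across runs, which matters when results from different models are compared on the same testing set.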