Arunachalam Muthukumar, Gopu T, Uma K
Department of Electronics Communication and Engineering, Kalasalingam Academy of Research and Education, Krishnankoil, Srivilliputhur, Tamil Nadu 626126, India.
Department of Computer Science and Technology, Sasi Institute of Technology & Engineering, West Godavari, Andhra Pradesh 534101, India.
Data Brief. 2025 May 20;61:111660. doi: 10.1016/j.dib.2025.111660. eCollection 2025 Aug.
The identification and classification of medicinal plants are crucial for botanical research, traditional medicine, and AI-driven applications. However, the absence of a standardized, high-quality dataset limits progress in automated species recognition. This study introduces the South Indian Medicinal Plants Dataset (SIMPD) Version 1, a curated dataset of high-resolution images of diverse medicinal plant species native to South India. The dataset pairs the images with detailed taxonomic classifications and metadata to support precise species identification and biodiversity analysis. Images were acquired under real-world conditions, with variations in illumination, pose, and environmental factors, to enhance dataset robustness. SIMPD is designed to support machine learning applications, particularly image-based plant classification, object detection, and segmentation. By providing this dataset for AI-driven research, the work aims to bridge the gap between traditional ethnobotanical knowledge and modern computational methods, fostering advances in medicinal plant classification, conservation, and ecological research.