Agarwal Avinash, de Jesus Colwell Filipe, Correa Galvis Viviana Andrea, Hill Tom R, Boonham Neil, Prashar Ankush
School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, UK.
Institute for Bio- and Geosciences: Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, Jülich, Germany.
Biol Methods Protoc. 2025 Apr 9;10(1):bpaf027. doi: 10.1093/biomethods/bpaf027. eCollection 2025.
Estimating pigment content of leafy vegetables via digital image analysis is a reliable method for high-throughput assessment of their nutritional value. However, the current leaf color analysis models developed using green-leaved plants fail to perform reliably while analyzing images of anthocyanin (Anth)-rich red-leaved varieties due to misleading or "red herring" trends. Hence, the present study explores the potential for machine learning (ML)-based estimation of nutritional pigment content for green and red leafy vegetables simultaneously using digital color features. For this, images of n = 320 samples from six types of leafy vegetables with varying pigment profiles were acquired using a smartphone camera, followed by extract-based estimation of chlorophyll (Chl), carotenoid (Car), and Anth. Subsequently, three ML methods, namely, Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), and Random Forest Regression (RFR), were tested for predicting pigment contents using RGB (Red, Green, Blue), HSV (Hue, Saturation, Value), and L*a*b* (Lightness, Redness-greenness, Yellowness-blueness) datasets individually and in combination. Chl and Car contents were predicted most accurately using the combined colorimetric dataset via SVR (R² = 0.738) and RFR (R² = 0.573), respectively. Conversely, Anth content was predicted most accurately using SVR with HSV data (R² = 0.818). While Chl and Car could be predicted reliably for green-leaved and Anth-rich samples, Anth could be estimated accurately only for Anth-rich samples due to Anth masking by Chl in green-leaved samples. Thus, the present findings demonstrate the scope of implementing ML-based leaf color analysis for assessing the nutritional pigment content of red and green leafy vegetables in tandem.
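As an illustration of the combined colorimetric dataset described above, the following minimal sketch builds a nine-value feature vector (RGB, HSV, and L*a*b*) from a single mean leaf color and computes the coefficient of determination used to score predictions. It is not the authors' pipeline: the function names, the use of one mean RGB value per leaf, and the sRGB/D65 conversion constants are all assumptions made for this example, and the regression step itself (PLSR, SVR, or RFR) is left out.

```python
import colorsys


def srgb_to_linear(c):
    """Undo the sRGB gamma curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def rgb_to_lab(r, g, b):
    """Convert sRGB values in [0, 1] to CIE L*a*b* (assumes a D65 white point)."""
    rl, gl, bl = (srgb_to_linear(v) for v in (r, g, b))
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def color_features(mean_rgb_8bit):
    """Hypothetical helper: combined RGB + HSV + L*a*b* features for one leaf.

    `mean_rgb_8bit` is an assumed input: the mean (R, G, B) of the leaf
    pixels on a 0-255 scale.
    """
    r, g, b = (v / 255 for v in mean_rgb_8bit)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    lightness, a_star, b_star = rgb_to_lab(r, g, b)
    return [r, g, b, h, s, v, lightness, a_star, b_star]


def r_squared(y_true, y_pred):
    """Coefficient of determination between measured and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative colors only, not data from the study.
    print(color_features((60, 140, 50)))   # a greenish leaf
    print(color_features((140, 40, 60)))   # a reddish leaf
```

In this sketch the redness-greenness axis (a*) comes out negative for the greenish color and positive for the reddish one. A regressor would then be fitted on such vectors against the extract-based pigment measurements.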