Hu Wenfeng, Tang Weihao, Li Chuang, Wu Jinjing, Liu Hong, Wang Chao, Luo Xiaochuan, Tang Rongnian
School of Mechanical and Electrical Engineering, Hainan University, Haikou 570228, China.
School of Electrical Engineering and Automation, Tianjin University, Tianjin 300072, China.
Plant Phenomics. 2024 Mar 22;6:0154. doi: 10.34133/plantphenomics.0154. eCollection 2024.
The nutritional status of rubber trees (Hevea brasiliensis) is closely tied to the production of natural rubber. Nitrogen (N) and potassium (K) levels in rubber leaves are 2 crucial indicators of that status. Hyperspectral technology can evaluate the N and K statuses of leaves rapidly. However, training a spectral estimation model on a small, imbalanced dataset produces high bias and uncertain results. The typical remedy, laborious long-term nutrient stress treatments combined with intensive data collection, sacrifices the speed and flexibility that make hyperspectral technology attractive. Therefore, a less intensive and more streamlined method, re-mining information from hyperspectral image data, was assessed. From this new perspective, a semisupervised learning (SSL) method and resampling techniques were employed to generate pseudo-labeled data and to rebalance the classes. Subsequently, a 5-class spectral classification model of the N and K statuses of rubber leaves was established. The SSL model based on random forest classifiers and mean sampling techniques yielded the best classification results on both the imbalanced and the balanced dataset (weighted average precision 67.8/78.6%, macro-averaged precision 61.2/74.4%, and weighted recall 65.7/78.5% for the N status). All data and code are available on GitHub: https://github.com/WeehowTang/SSL-rebalancingtest. Ultimately, we propose an efficient way to monitor N and K levels in rubber leaves rapidly and accurately, especially when annotated samples are scarce and class ratios are imbalanced.
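The abstract reports both weighted and macro-averaged precision, and the two differ most when classes are imbalanced. The following is a minimal sketch of how the two averages are computed, not the authors' code: the function names are illustrative, and the 3-class toy labels are hypothetical stand-ins (the study uses 5 classes and real spectral data).

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: precision} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Convention assumed here: a class that is never predicted gets precision 0.0.
    return {c: (correct[c] / predicted[c] if predicted[c] else 0.0) for c in labels}


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision: every class counts equally."""
    prec = per_class_precision(y_true, y_pred)
    return sum(prec.values()) / len(prec)


def weighted_precision(y_true, y_pred):
    """Mean of per-class precision weighted by each class's true sample count."""
    prec = per_class_precision(y_true, y_pred)
    support = Counter(y_true)
    return sum(prec[c] * support[c] for c in prec) / len(y_true)


# Hypothetical toy labels: one majority class and two minority classes.
y_true = ["normal"] * 6 + ["low"] * 2 + ["high"] * 2
y_pred = ["normal"] * 6 + ["low", "high"] + ["high", "low"]

print(f"macro precision:    {macro_precision(y_true, y_pred):.3f}")     # 0.667
print(f"weighted precision: {weighted_precision(y_true, y_pred):.3f}")  # 0.800
```

In this toy case the majority class is predicted perfectly while each minority class reaches only 0.5, so the weighted average (0.800) sits above the macro average (0.667). The same pattern, weighted above macro, appears in the figures reported for the N status.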