Experimental data manipulations to assess performance of hyperspectral classification models of crop seeds and other objects.

Author information

Nansen Christian, Imtiaz Mohammad S, Mesgaran Mohsen B, Lee Hyoseok

Affiliations

Department of Entomology and Nematology, University of California, Davis, USA.

Department of Entomology and Nematology, UC Davis Briggs Hall, Room 367, Davis, CA, 95616, USA.

Publication information

Plant Methods. 2022 Jun 3;18(1):74. doi: 10.1186/s13007-022-00912-z.

Abstract

BACKGROUND

Optical sensing solutions are being developed and adopted to classify a wide range of biological objects, including crop seeds. Performance assessment of optical classification models remains both a priority and a challenge.

METHODS

As training data, we acquired hyperspectral imaging data from 3646 individual tomato seeds (germination yes/no) from two tomato varieties. We performed three experimental data manipulations: (1) Object assignment error: the effect of individual objects in the training data being assigned to the wrong class. (2) Spectral repeatability: the effect of introducing known ranges (0-10%) of stochastic noise to individual reflectance values. (3) Size of training data set: the effect of reducing the number of observations in the training data. The effects of each manipulation were characterized and quantified based on classifications with two functions [linear discriminant analysis (LDA) and support vector machine (SVM)].
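The three manipulations described above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `flip_labels`, `add_noise`, and `subsample` are hypothetical, the noise is assumed to be a symmetric uniform relative perturbation, and the toy spectra are randomly generated stand-ins for real reflectance data.

```python
import random


def flip_labels(labels, error_rate, rng):
    """Object assignment error: move a fraction of binary labels to the wrong class."""
    n_flip = round(error_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def add_noise(spectra, max_noise, rng):
    """Spectral repeatability: perturb each reflectance value by a random
    relative amount drawn uniformly from [-max_noise, +max_noise].
    (The uniform, symmetric form is an assumption made for this sketch.)"""
    return [
        [r * (1.0 + rng.uniform(-max_noise, max_noise)) for r in spectrum]
        for spectrum in spectra
    ]


def subsample(spectra, labels, keep_fraction, rng):
    """Training set size: keep a random fraction of the observations."""
    n_keep = round(keep_fraction * len(labels))
    keep_idx = sorted(rng.sample(range(len(labels)), n_keep))
    return [spectra[i] for i in keep_idx], [labels[i] for i in keep_idx]


rng = random.Random(0)

# Toy stand-in data: 100 "seeds" with 5 spectral bands each (not the study's data).
spectra = [[rng.uniform(0.1, 0.9) for _ in range(5)] for _ in range(100)]
labels = [rng.randint(0, 1) for _ in range(100)]  # 1 = germinated, 0 = not

noisy_labels = flip_labels(labels, 0.10, rng)    # 10% assignment error
noisy_spectra = add_noise(spectra, 0.10, rng)    # up to 10% reflectance noise
small_spectra, small_labels = subsample(spectra, labels, 0.80, rng)  # 20% reduction
```

Each manipulated data set would then be used to retrain a classifier (for example LDA or SVM), and its accuracy compared with that of the model trained on the unmodified data.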

RESULTS

For both classification functions, accuracy decreased linearly in response to introduction of object assignment error and to experimental reduction of spectral repeatability. We also demonstrated that experimental reduction of training data by 20% had negligible effect on classification accuracy. LDA and SVM classification algorithms were applied to independent validation seed samples. LDA-based classifications predicted seed germination with RMSE = 10.56 (variety 1) and 26.15 (variety 2), and SVM-based classifications predicted seed germination with RMSE = 10.44 (variety 1) and 12.58 (variety 2).
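For readers unfamiliar with the error metric reported above, the following minimal sketch computes a root mean square error (RMSE) between observed and predicted values. It assumes the compared quantities are germination percentages per validation sample, which the abstract does not state explicitly; the numbers are made up for illustration and are not the study's data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical germination percentages for four validation samples.
observed = [90.0, 75.0, 60.0, 40.0]
predicted = [85.0, 80.0, 50.0, 45.0]

error = rmse(observed, predicted)
```

A lower RMSE means the predicted values track the observed values more closely. Under the assumption above, an RMSE of about 10 would correspond to a typical deviation of roughly ten percentage points.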

CONCLUSION

We believe this is the first study in which optical seed classification included both a thorough performance evaluation of two separate classification functions based on experimental data manipulations and the application of classification models to validation seed samples not included in the training data. The proposed experimental data manipulations are discussed in terms of their broader context and general relevance, and they are suggested as methods for in-depth performance assessment of optical classification models.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5346/9164469/e698a1866608/13007_2022_912_Fig1_HTML.jpg
