Classification of soybean chemical characteristics by excitation emission matrix coupled with t-SNE dimensionality reduction.

Affiliations

Institute of Science and Technology, Niigata University, 8050 2-no-cho, Ikarashi, Nishi-ku, Niigata 950-2181, Japan.

ImVisionLabs Inc., 7-3-1, Hongo, Bunkyo-ku, Tokyo 113-8485, Japan.

Publication information

Spectrochim Acta A Mol Biomol Spectrosc. 2024 Dec 5;322:124785. doi: 10.1016/j.saa.2024.124785. Epub 2024 Jul 4.

Abstract

Measuring the chemical composition of soybeans is time-consuming and laborious, and even simple near-infrared sensors generally require calibration curves to be created before use. In this study, a new screening method for soybeans that requires no calibration curves was investigated by combining the excitation emission matrix (EEM) with dimensionality reduction analysis. The EEMs of 34 soybean samples were measured, and representative chemical contents, including crude protein, crude oil, and isoflavone, were determined by chemical analysis. Two dimensionality reduction methods, principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE), were applied to the EEM data to obtain two-dimensional plots, each of which was divided into two regions corresponding to high or low content of each chemical component. To classify samples as high or low in each chemical component, machine learning classification models were constructed on the two-dimensional plots obtained after dimensionality reduction. The classification accuracy was higher with t-SNE than with the combination of PC1 and PC2 from PCA. Furthermore, with t-SNE, the classification accuracy exceeded 90% for all chemical components. These results suggest that t-SNE dimensionality reduction of soybean EEMs has potential for easy and accurate screening of soybeans, especially based on isoflavone content.
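
The classification step described in the abstract (labelling samples as high or low in a component, then fitting a classifier on the two-dimensional plot) can be illustrated with a minimal sketch. The abstract names neither the threshold nor the classifier, so the median split, the k-nearest-neighbour vote, and the leave-one-out evaluation below are all assumptions chosen for illustration. The coordinates and isoflavone values are hypothetical and are not data from the study. The embedding itself (PCA or t-SNE) is assumed to have been computed beforehand.

```python
from statistics import median


def median_split(values):
    """Label a sample 1 ("high") if its content exceeds the median, else 0 ("low").

    Assumption: the study's actual high/low threshold is not stated in the abstract.
    """
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def knn_predict(train_xy, train_labels, query, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(
        range(len(train_xy)),
        key=lambda i: (train_xy[i][0] - query[0]) ** 2
        + (train_xy[i][1] - query[1]) ** 2,
    )
    votes = sum(train_labels[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def leave_one_out_accuracy(xy, labels, k=3):
    """Hold out each sample in turn, predict it from the rest, return the hit rate."""
    hits = 0
    for i in range(len(xy)):
        rest_xy = xy[:i] + xy[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest_xy, rest_labels, xy[i], k) == labels[i]
    return hits / len(xy)


# Hypothetical 2-D embedding coordinates (e.g. from t-SNE or PCA) and
# hypothetical isoflavone contents; these are NOT values from the study.
embedding = [(-2.1, 0.3), (-1.8, -0.4), (-2.4, 0.1), (-1.6, 0.5),
             (1.9, 0.2), (2.2, -0.3), (1.7, 0.6), (2.5, -0.1)]
isoflavone = [1.2, 1.0, 1.4, 1.1, 3.1, 2.9, 3.4, 3.0]

labels = median_split(isoflavone)
accuracy = leave_one_out_accuracy(embedding, labels, k=3)
print(labels, accuracy)
```

In this toy case the two groups sit far apart in the plane, so every held-out sample is classified correctly. With real EEM embeddings the accuracy depends on how well the chosen dimensionality reduction separates the high and low samples, which is the comparison the abstract reports for PCA and t-SNE.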
