A two-step framework integrating lasso and Relaxed Lasso for resolving multidimensional collinearity in Chinese baijiu aging research.

Authors

An Dongyue, Wang Liangyan, He Jiang, Hua Yuejin

Affiliations

MOE Key Laboratory of Biosystems Homeostasis and Protection, Institute of Biophysics, College of Life Sciences, Zhejiang University, Hangzhou, PR China.

Sichuan Institute of Atomic Energy, Irradiation Preservation Key Laboratory of Sichuan Province, Chengdu, 610101, PR China.

Publication information

Heliyon. 2024 Aug 27;10(17):e36871. doi: 10.1016/j.heliyon.2024.e36871. eCollection 2024 Sep 15.

Abstract

The aging process is crucial for Chinese Baijiu production, significantly enhancing the spirit's flavor, aroma, and quality. However, aging involves a complex interplay of numerous compounds, and the long duration it requires leaves few samples available for scientific research. Together, these constraints produce high-dimensional, collinear data that are difficult to analyze, which complicates the understanding of the underlying chemical processes. To address this issue, this article proposes a two-step framework that integrates Relaxed Lasso regression models with Lasso-selected predictors. Baijiu samples subjected to various aging conditions were analyzed using direct GC-MS and HS-GC-MS, and the resulting data were processed with this approach. The results demonstrate significantly superior performance compared to other methods, including PLSR and Gradient Boosting. Analyses were also performed on a previously documented dataset, yielding improved results and underscoring the method's advantage in processing high-dimensional data with multicollinearity. Moreover, the method proved effective in screening potential indicative compounds, highlighting its utility in Baijiu aging research.
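
The two-step idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one common formulation of the relaxed lasso, in which a lasso fit with penalty `lam` selects predictors and a second fit on the selected columns only uses the reduced penalty `phi * lam`, with `phi` between 0 and 1. All function names, the synthetic data, and the objective scaling (1/2n) are assumptions made for illustration, and only the Python standard library is used.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.
    X is a list of rows; columns and y are assumed centred (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X*beta for beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def lambda_max(X, y):
    """Smallest penalty at which the lasso solution is entirely zero."""
    n = len(X)
    return max(abs(sum(X[i][j] * y[i] for i in range(n))) / n
               for j in range(len(X[0])))

def two_step_relaxed_lasso(X, y, lam, phi):
    """Step 1: lasso with penalty lam selects predictors.
    Step 2: refit on the selected columns only, with penalty phi*lam."""
    p = len(X[0])
    selected = [j for j, b in enumerate(lasso_cd(X, y, lam)) if b != 0.0]
    beta = [0.0] * p
    if selected:
        X_sel = [[row[j] for j in selected] for row in X]
        for j, b in zip(selected, lasso_cd(X_sel, y, phi * lam)):
            beta[j] = b
    return selected, beta

def make_demo_data(seed=0, n=40, p=8):
    """Synthetic data (hypothetical): column 1 nearly duplicates column 0."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(p)]
        row[1] = row[0] + 0.05 * rng.gauss(0, 1)  # collinear pair
        X.append(row)
        y.append(2.0 * row[0] + 1.5 * row[2] + 0.1 * rng.gauss(0, 1))
    means = [sum(r[j] for r in X) / n for j in range(p)]
    X = [[r[j] - means[j] for j in range(p)] for r in X]  # centre columns
    y_mean = sum(y) / n
    return X, [v - y_mean for v in y]

if __name__ == "__main__":
    X, y = make_demo_data()
    lam = 0.1 * lambda_max(X, y)
    selected, beta = two_step_relaxed_lasso(X, y, lam, phi=0.5)
    print("selected columns:", selected)
    print("relaxed coefficients:", [round(b, 3) for b in beta])
```

In this sketch, `phi = 1` makes the second step repeat the original lasso penalty on the selected columns, while `phi = 0` removes the penalty, so the second step approaches an ordinary least-squares refit. In practice both `lam` and `phi` would typically be chosen by cross-validation, which is omitted here for brevity.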

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a5e/11399590/190a1f90c80d/gr1.jpg
