Efficient generation of HPLC and FTIR data for quality assessment using time series generation model: a case study on Tibetan medicine Shilajit.

Authors

Ding Rong, He Shiqi, Wu Xuemei, Zhong Liwen, Chen Guopeng, Gu Rui

Affiliations

State Key Laboratory of Southwestern Chinese Medicine Resources, School of Ethnic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China.

School of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China.

Publication information

Front Pharmacol. 2024 Nov 18;15:1503508. doi: 10.3389/fphar.2024.1503508. eCollection 2024.

DOI:10.3389/fphar.2024.1503508

PMID:39624838

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11608951/

Abstract

BACKGROUND

Medicinal plants characteristic of plateau regions are scarce and valuable, which makes it hard to obtain enough experimental samples for quality evaluation. Insufficient sample sizes often lead to ambiguous, questionable quality assessments and to suboptimal performance in pattern recognition. Shilajit, a popular Tibetan medicine, is harvested at altitudes above 2000 m, making it difficult to obtain. Additionally, the complex geographical environment results in low uniformity of Shilajit quality.

METHODS

To address these challenges, this study employed a deep learning model, the time vector quantization variational autoencoder (TimeVQVAE), to generate data matrices from chromatographic and spectral data for different grades of Shilajit, thereby increasing the amount of data. Partial least squares discriminant analysis (PLS-DA) was used to identify three grades of Shilajit samples based on original, generated, and combined data.
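As a rough illustration of the "vector quantization" idea in the model's name, the sketch below maps continuous latent vectors to their nearest entries in a codebook. It is a minimal toy, not the TimeVQVAE implementation used in the study: the function name `quantize`, the codebook values, and the latent vectors are all hypothetical, and the encoder, decoder, and training procedure are omitted.

```python
import numpy as np

def quantize(latents, codebook):
    """Replace each latent vector with its nearest codebook entry.

    latents:  array of shape (n, d), continuous encoder outputs (hypothetical)
    codebook: array of shape (k, d), discrete code vectors (hypothetical)
    """
    # Squared Euclidean distance between every latent and every code: shape (n, k)
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Index of the closest code for each latent vector
    indices = np.argmin(dists, axis=1)
    return indices, codebook[indices]

# Toy values chosen for illustration only
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
latents = np.array([[0.1, -0.2], [0.9, 1.2], [3.5, 0.4]])

indices, quantized = quantize(latents, codebook)
print(indices)    # one code index per latent vector
print(quantized)  # the codebook rows selected by those indices
```

In a VQ-VAE-type model, a decoder would then reconstruct (or, at sampling time, generate) a signal from such discrete code indices.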

RESULTS

Compared with the original high performance liquid chromatography (HPLC) and Fourier transform infrared spectroscopy (FTIR) data, the data generated by TimeVQVAE effectively preserved the chemical profile. In the test set, the average metrics for HPLC, FTIR, and combined data increased by 32.2%, 15.9%, and 23.0%, respectively. On the real test data, the PLS-DA model's classification accuracy initially reached a maximum of 0.7905. After incorporating TimeVQVAE-generated data, the accuracy improved significantly, reaching 0.9442 in the test set. Additionally, the PLS-DA model trained with the fused data showed enhanced stability.

CONCLUSION

This study offers a novel and effective approach for researching medicinal materials with small sample sizes, using data augmentation to overcome the limits that small samples place on model performance.

Figure image from the article: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa00/11608951/d1374aac847f/fphar-15-1503508-g001.jpg
