Unsupervised Feature Selection by a Genetic Algorithm for Mid-Infrared Spectral Data.

Affiliations

Université de Strasbourg (Unistra), Institut National de la Santé et de la Recherche Médicale, IRFAC Inserm U1113, 3 Avenue Molière, 67200 Strasbourg, France.

Université de Reims Champagne-Ardenne, BioSpecT EA 7506, 51 Rue Cognacq-Jay, 51097 Reims, France.

Publication information

Anal Chem. 2022 Nov 22;94(46):16050-16059. doi: 10.1021/acs.analchem.2c03118. Epub 2022 Nov 8.

DOI:10.1021/acs.analchem.2c03118

PMID:36346912

Abstract

Dimensional reduction of highly multidimensional datasets such as those acquired by Fourier transform infrared spectroscopy (FTIR) is a critical step in the data analysis workflow. To achieve this goal, numerous feature selection methods have been developed and applied in a supervised context, i.e., using a priori knowledge about data usually in the form of labels for classification or quantitative values for regression. For this, genetic algorithms have been largely exploited due to their flexibility and global optimization principle. However, few applications in an unsupervised context have been reported in infrared spectroscopy. The aim of this article is to propose a new unsupervised feature selection method based on a genetic algorithm using a validity index computed from KMeans partitions as a fitness function. Evaluated on a simulated dataset and validated and tested on three real-world infrared spectroscopic datasets, our developed algorithm is able to find the spectral descriptors improving clustering accuracy and simplifying the spectral interpretation of results.
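The approach summarized above (a genetic algorithm whose fitness is a validity index computed from KMeans partitions) can be illustrated with a minimal sketch. The abstract does not state which validity index or which genetic operators the authors use, so everything specific here is an assumption: the mean silhouette width as the index, binary tournament selection, uniform crossover, bit-flip mutation, and one-individual elitism. The function names `kmeans`, `silhouette`, and `ga_select`, the parameter defaults, and the synthetic data are hypothetical and are not taken from the article.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's algorithm; returns one integer label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def silhouette(X, labels):
    """Mean silhouette width (assumed validity index); higher is better."""
    ids = np.unique(labels)
    if len(ids) < 2:
        return -1.0
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton cluster: silhouette is taken as 0
        a = D[i, own].sum() / (n_own - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in ids if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return float(scores.mean())

def ga_select(X, k, pop_size=20, n_gen=15, p_mut=0.05, seed=0):
    """Evolve boolean feature masks; returns (best_mask, best_fitness)."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, X.shape[1])) < 0.5  # random initial masks

    def fitness(mask):
        if not mask.any():
            return -1.0  # an empty feature subset cannot be clustered
        Xs = X[:, mask]
        # Fixed KMeans seed so the same mask always gets the same fitness.
        return silhouette(Xs, kmeans(Xs, k, np.random.default_rng(seed)))

    best_mask, best_fit = None, -np.inf
    for _ in range(n_gen):
        fits = np.array([fitness(m) for m in pop])
        i_best = fits.argmax()
        if fits[i_best] > best_fit:
            best_mask, best_fit = pop[i_best].copy(), float(fits[i_best])
        # Binary tournament selection of parents.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = pop[np.where(fits[a] >= fits[b], a, b)]
        # Uniform crossover between each parent and its neighbour.
        mates = np.roll(parents, 1, axis=0)
        children = np.where(rng.random(parents.shape) < 0.5, parents, mates)
        # Bit-flip mutation, then elitism: keep the best mask seen so far.
        children ^= rng.random(children.shape) < p_mut
        children[0] = best_mask
        pop = children
    return best_mask, best_fit

# Synthetic example: 2 informative features followed by 8 pure-noise features.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [6.0, 6.0], [0.0, 6.0]])
informative = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])
X = np.hstack([informative, rng.normal(size=(60, 8))])
mask, score = ga_select(X, k=3)
print(mask.astype(int), round(score, 3))
```

Each chromosome is a boolean mask over the spectral variables, so the selected wavenumbers can be read directly from `mask`. One caveat of this particular sketch: distance-based indices such as the silhouette are not dimension-neutral, so comparing subsets of different sizes may bias the search, and a practical implementation would likely need to account for that.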

Similar articles

Development of a memetic clustering algorithm for optimal spectral histology: application to FTIR images of normal human colon.

Analyst. 2016 May 23;141(11):3296-304. doi: 10.1039/c5an02227d.

Automatic Identification of Paraffin Pixels on FTIR Images Acquired on FFPE Human Samples.

Anal Chem. 2021 Mar 2;93(8):3750-3761. doi: 10.1021/acs.analchem.0c03910. Epub 2021 Feb 16.

Multi-modal image sharpening in fourier transform infrared (FTIR) microscopy.

Analyst. 2021 Aug 7;146(15):4822-4834. doi: 10.1039/d1an00103e. Epub 2021 Jul 1.

Selection of discriminant mid-infrared wavenumbers by combining a naïve Bayesian classifier and a genetic algorithm: Application to the evaluation of lignocellulosic biomass biodegradation.

Math Biosci. 2017 Jul;289:153-161. doi: 10.1016/j.mbs.2017.05.002. Epub 2017 May 13.

Variable selection in near-infrared spectroscopy: benchmarking of feature selection methods on biodiesel data.

Anal Chim Acta. 2011 Apr 29;692(1-2):63-72. doi: 10.1016/j.aca.2011.03.006. Epub 2011 Mar 8.

Balanced Spectral Feature Selection.

IEEE Trans Cybern. 2023 Jul;53(7):4232-4244. doi: 10.1109/TCYB.2022.3160244. Epub 2023 Jun 15.

Selecting optimal features from Fourier transform infrared spectroscopy for discrete-frequency imaging.

Analyst. 2018 Feb 26;143(5):1147-1156. doi: 10.1039/c7an01888f.

A pilot study on fingerprinting Leishmania species from the Old World using Fourier transform infrared spectroscopy.

Anal Bioanal Chem. 2017 Nov;409(29):6907-6923. doi: 10.1007/s00216-017-0655-5. Epub 2017 Oct 28.

Firefly as a novel swarm intelligence variable selection method in spectroscopy.

Anal Chim Acta. 2014 Dec 10;852:20-7. doi: 10.1016/j.aca.2014.09.045. Epub 2014 Sep 28.

Cited by

Automated Machine-Learning-Driven Analysis of Microplastics by TGA-FTIR for Enhanced Identification and Quantification.

Anal Chem. 2025 Apr 29;97(16):8833-8840. doi: 10.1021/acs.analchem.4c06775. Epub 2025 Apr 16.
