A multivariate-based wavenumber selection method for classifying medicines into authentic or counterfeit classes.

Affiliation

Department of Industrial Engineering, Federal University of Rio Grande do Sul, 90035-190 Rio Grande do Sul, Brazil.

Publication information

J Pharm Biomed Anal. 2013 Sep;83:209-14. doi: 10.1016/j.jpba.2013.05.004. Epub 2013 May 24.

Abstract

Attenuated total reflectance (ATR), a sampling technique used in Fourier transform infrared (FTIR) spectroscopy, has been adopted as an analytical tool for detecting fraudulent medicines. A spectrum generated by FTIR-ATR typically comprises hundreds of equally spaced wavenumbers, which may reduce the performance of techniques designed to classify samples as authentic or fraudulent. This paper proposes a novel method for selecting subsets of wavenumbers (variables) that better classify samples into these classes. To that end, principal component analysis (PCA) is integrated with the k-nearest neighbor (KNN) classification technique. PCA is applied to the FTIR-ATR data, and a variable importance index is built from the PCA outputs. An iterative backward variable elimination guided by that index then follows; after each variable is removed, samples are classified as authentic or fraudulent using KNN, and the classification accuracy is measured. The wavenumber subset that best balances high accuracy against a low percentage of retained variables is chosen. When applied to Cialis FTIR-ATR data, the proposed approach retained on average only 1.84% of the original variables and increased average classification accuracy by 2.1%, from 0.9689 to 0.9897; for Viagra data, the method increased average classification accuracy by 1.56%, from 0.9135 to 0.9278, using only 7.72% of the original variables.
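
The procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the importance index, the KNN settings, or the exact rule for trading accuracy against subset size, so everything specific below is an assumption made for illustration: the index is taken as the absolute PCA loadings weighted by explained variance, it is computed once rather than after every removal, accuracy is estimated by leave-one-out, and the chosen subset is the smallest one that reaches the highest accuracy. The function names and the synthetic data are hypothetical and do not come from the paper.

```python
import numpy as np

def pca_importance(X, n_components=2):
    """Assumed index: |loadings| weighted by explained-variance ratio."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the PCA loading vectors.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = (s ** 2) / np.sum(s ** 2)
    k = min(n_components, Vt.shape[0])
    return np.abs(Vt[:k]).T @ var_ratio[:k]

def knn_loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy of a majority-vote KNN (binary labels 0/1, odd k)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample never counts as its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]
    pred = (y[nearest].mean(axis=1) > 0.5).astype(int)
    return float(np.mean(pred == y))

def backward_elimination(X, y, k=3, n_components=2):
    """Drop variables from least to most important, scoring KNN after each drop."""
    order = np.argsort(pca_importance(X, n_components))  # least important first
    kept = list(range(X.shape[1]))
    history = [(list(kept), knn_loo_accuracy(X, y, k))]
    for j in order[:-1]:  # always keep at least one variable
        kept.remove(int(j))
        history.append((list(kept), knn_loo_accuracy(X[:, kept], y, k)))
    return history

def select_subset(history):
    """Assumed rule: highest accuracy, ties broken by fewest retained variables."""
    best_acc = max(acc for _, acc in history)
    candidates = [kept for kept, acc in history if acc == best_acc]
    return min(candidates, key=len), best_acc

# Synthetic stand-in for spectra: 12 "wavenumbers", only two carry class signal.
rng = np.random.default_rng(0)
n = 40
y = np.repeat([0, 1], n // 2)  # 0 = authentic, 1 = fraudulent (arbitrary coding)
X = rng.normal(size=(n, 12))
X[:, 0] += 4.0 * y
X[:, 1] -= 4.0 * y

history = backward_elimination(X, y)
subset, acc = select_subset(history)
print(len(subset), "of", X.shape[1], "variables kept; LOO accuracy", round(acc, 4))
```

Each entry of `history` pairs a retained-variable list with its accuracy, so a different selection rule (for example, a weighted score of accuracy and percentage retained) can be swapped into `select_subset` without changing the elimination loop.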

