School of Physics, Universiti Sains Malaysia, 11800 Penang, Malaysia.
School of Physics, Universiti Sains Malaysia, 11800 Penang, Malaysia; School of Pharmaceutical Sciences, Universiti Sains Malaysia, 11800 Penang, Malaysia.
Spectrochim Acta A Mol Biomol Spectrosc. 2022 Feb 5;266:120440. doi: 10.1016/j.saa.2021.120440. Epub 2021 Sep 27.
A proof-of-concept scheme for identifying medicinal herbs using machine learning classifiers is proposed in the form of an automated computational package. The scheme takes as digital input two-dimensional correlation Fourier transform infrared (FTIR) fingerprint maps derived from the FTIR spectra of raw herbs. The prototype package admits a collection of 11 machine learning classifiers to form a voting pool. A common oversampled dataset containing 5 different herbal classes is used to train the pool of classifiers in a one-versus-others manner. The trained models, dubbed the voting classifiers, are deployed collectively to vote for or against the proposition that a given inference fingerprint belongs to a particular class. From the votes cast by all voting classifiers, a logically designed scoring system selects the most probable identity of the inference fingerprint. The same scoring system can also label an inference fingerprint that does not belong to any of the classes the voting classifiers were trained on as the 'others' type. The proposed classification scheme is stress-tested to evaluate its performance and expected consistency. Our experimental runs show that, by and large, the classification scheme performs satisfactorily, achieving up to 90% accuracy, which demonstrates as a proof of concept that the proposed scheme is a feasible, practical, and convenient tool for herbal classification. The scheme is implemented as a packaged Python code, dubbed the "Collective Voting" (CV) package, which is easy to scale, maintain, and use in practice.
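The one-versus-others voting and the 'others' rejection described in the abstract can be illustrated with a minimal Python sketch. This is not the CV package's actual code: the function names `collective_vote` and `threshold_voter`, the `min_support` threshold, the tie-breaking rule, and the toy voters standing in for trained classifiers are all assumptions made for illustration, since the abstract does not specify the scoring rules.

```python
from typing import Callable, Dict, List, Sequence

# A voter takes a fingerprint (here a plain feature vector) and returns
# True ("belongs to this class") or False ("belongs to the others").
Voter = Callable[[Sequence[float]], bool]


def collective_vote(
    fingerprint: Sequence[float],
    voters_by_class: Dict[str, List[Voter]],
    min_support: float = 0.5,  # assumed threshold, not taken from the paper
) -> str:
    """Return the class with the largest share of supporting votes, or 'others'."""
    support = {}
    for label, voters in voters_by_class.items():
        yes_votes = sum(1 for vote in voters if vote(fingerprint))
        support[label] = yes_votes / len(voters)

    # Ties go to the class listed first (an arbitrary choice for this sketch).
    best_label = max(support, key=support.get)
    # Reject the fingerprint when even the best class lacks enough support.
    if support[best_label] <= min_support:
        return "others"
    return best_label


def threshold_voter(index: int, cutoff: float) -> Voter:
    """Toy stand-in for a trained binary classifier (hypothetical)."""
    return lambda fingerprint: fingerprint[index] > cutoff


# Two toy classes with three voters each, in place of 5 classes x 11 classifiers.
voters = {
    "herb_A": [threshold_voter(0, c) for c in (0.4, 0.5, 0.6)],
    "herb_B": [threshold_voter(1, c) for c in (0.4, 0.5, 0.6)],
}
```

With these toy voters, `collective_vote([0.9, 0.1], voters)` selects `herb_A`, while a fingerprint such as `[0.1, 0.1]` wins no votes in either class and falls into 'others'. In a real pipeline each voter would be a trained binary classifier's prediction function, and the threshold would need tuning against validation data.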