Research Laboratory, U.S.E. Company, Limited, Tokyo 150-0013, Japan.
Anal Chem. 2017 Nov 21;89(22):11999-12005. doi: 10.1021/acs.analchem.7b02389. Epub 2017 Nov 7.
Gas chromatography/olfactometry (GC/O) is used in various fields as a valuable method for identifying odor-active components in complex mixtures. Because human assessors serve as the detectors that report the olfactory perception of the separated odorants, the technique is limited by subjectivity, variability, and the high cost of trained panelists. Here, we present a proof-of-concept model in which odor information is obtained by machine-learning-based prediction from molecular parameters (MPs) of odorant molecules. The odor prediction models were built using a flavor and fragrance database containing 1026 odorants and their corresponding verbal odor descriptors (ODs). Physicochemical parameters of the odorant molecules were calculated with molecular calculation software (DRAGON). Ten representative ODs were selected for model building on the basis of their high frequency of occurrence in the database. Features of the MPs were extracted by either an unsupervised approach (principal component analysis) or a supervised approach (Boruta, BR) and then used as input to calibrate the machine-learning models. Predictions were made with several machine-learning approaches, including support vector machine (SVM), random forest, and extreme learning machine. All models were optimized by parameter tuning, and their prediction accuracies were compared. An SVM model combined with feature extraction by BR-C (confirmed only) gave the best results, with an accuracy of 97.08%. The models were validated by comparing predicted and measured results for the GC/O data of an apple sample. The prediction models can serve as an auxiliary tool in existing GC/O by suggesting possible OD candidates to the panelists, helping them reach more objective and correct judgments. In addition, with further development of the proposed odor prediction technique, a machine-based GC/O that no longer requires a panelist might be expected.
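The descriptor-selection step described in the abstract (choosing the most frequent ODs, then deriving prediction targets from them) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy records, the function names `top_descriptors`, `binary_labels`, and `accuracy`, and the use of one 0/1 target per descriptor are all assumptions made for illustration; the abstract does not state how the targets were encoded or how accuracy was computed.

```python
from collections import Counter

# Hypothetical toy records: (odorant name, list of verbal odor descriptors).
# These entries are invented for illustration and are not taken from the
# database used in the study.
TOY_DATABASE = [
    ("odorant_a", ["fruity", "sweet", "green"]),
    ("odorant_b", ["fruity", "floral"]),
    ("odorant_c", ["green", "herbal"]),
    ("odorant_d", ["sweet", "fruity"]),
    ("odorant_e", ["woody"]),
]


def top_descriptors(database, k):
    """Return the k most frequent descriptors, most frequent first."""
    counts = Counter()
    for _, descriptors in database:
        counts.update(set(descriptors))  # count each descriptor once per odorant
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def binary_labels(database, selected):
    """Map each odorant to a 0/1 vector, one entry per selected descriptor."""
    return {
        name: [1 if od in descriptors else 0 for od in selected]
        for name, descriptors in database
    }


def accuracy(predicted, measured):
    """Fraction of positions where predicted and measured labels agree."""
    if len(predicted) != len(measured):
        raise ValueError("label lists must have the same length")
    if not predicted:
        raise ValueError("label lists must not be empty")
    matches = sum(p == m for p, m in zip(predicted, measured))
    return matches / len(predicted)


# Assumed setup: keep the 3 most frequent descriptors of the toy data
# (the study kept ten from a much larger database).
selected = top_descriptors(TOY_DATABASE, k=3)
labels = binary_labels(TOY_DATABASE, selected)
```

In this sketch each selected descriptor becomes a separate yes/no target, so any classifier (SVM, random forest, or another) could in principle be trained per descriptor on the molecular parameters and scored with `accuracy` against panelist-assigned labels.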