Faculty of Electronics, Wroclaw University of Technology, Wroclaw Ludwika Pasteura 1, 50-367 Wroclaw, Poland.
Department of Chemistry of Drugs, Faculty of Pharmacy, Wroclaw Medical University, Wroclaw Ludwika Pasteura 1, 50-367 Wroclaw, Poland.
Sensors (Basel). 2019 Jul 30;19(15):3349. doi: 10.3390/s19153349.
In this study, we present the concept and implementation of a fully functional system for the recognition of bi-heterocyclic compounds. We investigated the application of machine learning methods to the recognition of compounds from their THz spectra, and we describe the process of selecting optimal parameters for a kernel support vector machine (KSVM) with an additional 'unknown' class. The chemical compounds used in the study contain a target molecule used in pharmacy to combat inflammatory states in living organisms. Once authorised on the pharmaceutical market, ready-made medical products with similar properties are commonly referred to as non-steroidal anti-inflammatory drugs (NSAIDs). It was crucial to determine clearly whether a tested sample is a chemical compound known to researchers or a completely new structure that should be additionally tested using other spectrometric methods. Our approach achieves 100% classification accuracy on the tested chemical compounds, with the 30 samples of the test set classified in several milliseconds. It fits well into the concept of rapid recognition of bi-heterocyclic compounds without the need to analyse the percentage composition of compound components, provided that the sample is classified into a known group. The method minimizes testing costs and significantly reduces the time of analysis.
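The idea of a kernel classifier with an extra 'unknown' class can be illustrated with a minimal, dependency-free sketch. This is not the KSVM described in the abstract: as a simplified stand-in, it scores each known class by the mean RBF-kernel similarity between the sample and that class's training spectra, and rejects the sample as 'unknown' when no class scores above a threshold. The function names (`rbf_kernel`, `fit`, `predict`), the `gamma` and `threshold` values, the class labels, and the toy two-feature "spectra" are all assumptions made for illustration, not data or parameters from the study.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def fit(samples, labels):
    """Group the training vectors by class label (hypothetical helper)."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return by_class


def predict(model, x, gamma=1.0, threshold=0.5):
    """Return the most similar known class, or 'unknown' if none is close.

    The score of a class is the mean kernel similarity between x and the
    training vectors of that class. This is a stand-in for an SVM decision
    value, not an SVM itself.
    """
    scores = {
        label: sum(rbf_kernel(x, s, gamma) for s in members) / len(members)
        for label, members in model.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"


# Toy two-feature "spectra": made-up numbers, not measurements from the study.
train_x = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.0, 0.9]]
train_y = ["compound_A", "compound_A", "compound_B", "compound_B"]
model = fit(train_x, train_y)

near_a = predict(model, [0.05, 0.0])   # close to the compound_A samples
near_b = predict(model, [1.0, 0.95])   # close to the compound_B samples
far_away = predict(model, [5.0, 5.0])  # far from every training sample
print(near_a, near_b, far_away)
```

In this sketch the threshold plays the role of the rejection rule: a sample far from all training spectra has near-zero kernel similarity to every class and is reported as 'unknown', which corresponds to flagging a possibly new structure for further spectrometric testing. In practice `gamma` and `threshold` would have to be tuned on held-out data.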