Schicker Doris, Singh Satnam, Freiherr Jessica, Grasskamp Andreas T
Sensory Analytics and Technologies, Fraunhofer Institute for Process Engineering and Packaging IVV, Giggenhauser Straße 35, 85354, Freising, Germany.
Department of Psychiatry and Psychotherapy, Friedrich-Alexander-Universität Erlangen-Nürnberg, Schwabachanlage 6, 91054, Erlangen, Germany.
J Cheminform. 2023 May 7;15(1):51. doi: 10.1186/s13321-023-00722-y.
We derived and implemented a linear classification algorithm for predicting a molecule's odor, called Olfactory Weighted Sum (OWSum). The approach uses only structural patterns of the molecules as features and combines conditional probabilities with tf-idf values. Beyond predicting molecular odor, OWSum provides insight into properties of the dataset and makes it possible to understand how classifications are reached, by quantitatively assigning structural patterns to odors. This gives chemists an intuitive understanding of the underlying interactions. To deal with ambiguities in the natural language used to describe odor, we introduced descriptor overlap as a metric that quantifies the semantic overlap between descriptors. This makes it possible to group descriptors and to derive higher-level descriptors. Our approach represents a major step forward in our ability to understand and predict molecular features.
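The idea of a weighted sum over structural patterns can be illustrated with a minimal Python sketch. It is not the published OWSum implementation: the weighting used here (the conditional probability of an odor given a pattern, multiplied by the pattern's inverse document frequency), the function names `fit_weights` and `predict`, and the toy dataset `TOY` are all assumptions made for illustration, and the abstract does not specify how the two quantities are actually combined.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: each molecule is (structural patterns, odor descriptors).
TOY = [
    ({"ester", "alkyl"}, {"fruity"}),
    ({"ester", "aromatic_ring"}, {"fruity"}),
    ({"thiol", "alkyl"}, {"sulfurous"}),
    ({"thiol", "aromatic_ring"}, {"sulfurous"}),
]


def fit_weights(molecules):
    """Return weights[pattern][odor] = P(odor | pattern) * idf(pattern).

    Assumption: idf(pattern) = log(N / number of molecules containing it).
    """
    n = len(molecules)
    pattern_count = Counter()          # molecules containing each pattern
    pair_count = defaultdict(Counter)  # co-occurrences of pattern and odor
    for patterns, odors in molecules:
        for p in patterns:
            pattern_count[p] += 1
            for o in odors:
                pair_count[p][o] += 1

    weights = defaultdict(dict)
    for p, df in pattern_count.items():
        idf = math.log(n / df)
        for o, c in pair_count[p].items():
            weights[p][o] = (c / df) * idf
    return weights


def predict(weights, patterns):
    """Sum the per-pattern weights for each odor and return the best odor.

    Returns None if none of the patterns was seen during fitting.
    """
    scores = Counter()
    for p in patterns:
        for o, w in weights.get(p, {}).items():
            scores[o] += w
    return max(scores, key=scores.get) if scores else None


if __name__ == "__main__":
    w = fit_weights(TOY)
    print(predict(w, {"ester", "alkyl"}))
    # The learned weights are directly inspectable per pattern and odor.
    print(dict(w["alkyl"]))
```

Because the score is a plain sum of per-pattern weights, each prediction can be traced back to the patterns that contributed to it, which is the kind of interpretability the abstract describes.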