Smell compounds classification using UMAP to increase knowledge of odors and molecular structures linkages.

Author affiliations

T3S, Inserm UMR S-1124, Université de Paris, Paris, France.

Inserm U1133, CNRS UMR 8251, Université de Paris, Paris, France.

Publication information

PLoS One. 2021 May 28;16(5):e0252486. doi: 10.1371/journal.pone.0252486. eCollection 2021.

Abstract

This study aims to highlight the relationships between the structure of smell compounds and their odors. For this purpose, heterogeneous data sources were screened, and 6038 odorant compounds and their known associated odors (162 odor notes) were compiled, with each molecule represented by a set of 1024 structural fingerprint features. Several dimensionality reduction techniques (PCA, MDS, t-SNE and UMAP) were assessed together with two clustering methods (k-means and agglomerative hierarchical clustering, AHC) on the calculated fingerprints. Combining UMAP with k-means and AHC gave clusters that represented odors well, as well as the best visualization of the proximity of odorants on the basis of their molecular structures. The presence or absence of molecular substructures was then calculated for the odorants in order to link chemical groups to odors. The analysis brings out some associations between odor notes and chemical structures: "woody" and "spicy" notes with allylic and bicyclic structures; "balsamic" notes with unsaturated rings; both "sulfurous" and "citrus" notes with aldehydes, alcohols, carboxylic acids, amines and sulfur compounds; and "oily", "fatty" and "fruity" notes with esters and long carbon chains. Overall, UMAP combined with clustering is a promising method for suggesting hypotheses about odorant structure-odor relationships.
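The final step described in the abstract, linking clusters to odor notes and to the presence or absence of substructures, can be illustrated with a minimal sketch. It assumes that cluster labels have already been obtained upstream (for instance from a UMAP embedding followed by k-means); the records, the substructure names, and the `frequency_by_cluster` helper are hypothetical and invented for illustration only. They are not data or code from the study.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, invented for illustration (NOT data from the study).
# Each record: (cluster label, odor notes, substructures flagged as present).
MOLECULES = [
    (0, {"fruity", "fatty"}, {"ester", "long_chain"}),
    (0, {"fruity"}, {"ester"}),
    (0, {"oily", "fatty"}, {"ester", "long_chain"}),
    (1, {"woody", "spicy"}, {"bicyclic"}),
    (1, {"woody"}, {"bicyclic", "allylic"}),
]


def frequency_by_cluster(records, field):
    """Return {cluster: {item: fraction of that cluster's molecules carrying item}}.

    `field` is the tuple index to tabulate: 1 for odor notes, 2 for substructures.
    """
    counts = defaultdict(Counter)  # per-cluster item counts
    sizes = Counter()              # number of molecules per cluster
    for record in records:
        cluster = record[0]
        sizes[cluster] += 1
        counts[cluster].update(record[field])
    return {
        cluster: {item: n / sizes[cluster] for item, n in counter.items()}
        for cluster, counter in counts.items()
    }


odor_freq = frequency_by_cluster(MOLECULES, field=1)
substructure_freq = frequency_by_cluster(MOLECULES, field=2)

# Print, for each cluster, how often each odor note and substructure occurs.
for cluster in sorted(odor_freq):
    print("cluster", cluster)
    print("  odor notes:   ", sorted(odor_freq[cluster].items()))
    print("  substructures:", sorted(substructure_freq[cluster].items()))
```

Reading the two tables side by side is what suggests a hypothesis: in this toy data, the cluster where every molecule carries an ester is also the one dominated by "fruity" and "fatty" notes. A frequency table of this kind only points to candidate associations; it does not test them.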

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64f2/8162648/10c66bc77b63/pone.0252486.g001.jpg
