The rcdk and cluster R packages applied to drug candidate selection.

Authors

Voicu Adrian, Duteanu Narcis, Voicu Mirela, Vlad Daliborca, Dumitrascu Victor

Affiliations

Department of Medical Informatics and Biostatistics, Victor Babes University of Medicine and Pharmacy, E. Murgu 2, 300041, Timisoara, Romania.

Dep. CAICAM, Politehnica University of Timisoara, Pirvan Boulevard 6, Timisoara, Romania.

Publication information

J Cheminform. 2020 Jan 20;12(1):3. doi: 10.1186/s13321-019-0405-0.

Abstract

The aim of this article is to show how the power of statistics and cheminformatics can be combined in R using two packages: rcdk and cluster. We describe the role of clustering methods in identifying similar structures in a group of 23 molecules according to their fingerprints. The most commonly used method is to group the molecules using a "score" obtained by measuring the average distance between them. This score reflects the similarity or dissimilarity between compounds and helps us identify active or potentially toxic substances through predictive studies. Clustering is the process by which the common characteristics of a particular class of compounds are identified. For clustering applications, we generally measure molecular fingerprint similarity with the Tanimoto coefficient. Based on the molecular fingerprints, we calculated the molecular distances between the methotrexate molecule and the other 23 molecules in the group, and organized them into a matrix. According to the molecular distances and Ward's method, the molecules were grouped into 3 clusters. We can infer structural similarity between the compounds from their locations in the cluster map. Because only 5 molecules were included in the methotrexate cluster, we considered that they might have similar properties and might be further tested as potential drug candidates.
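The workflow summarized in the abstract (fingerprints, Tanimoto similarity, a distance matrix, then Ward's method cut into 3 clusters) can be illustrated with a minimal sketch. This is not the authors' R code and does not call rcdk or cluster; it is a standard-library Python illustration under stated assumptions. The fingerprints are toy sets of "on" bit positions invented for the example, the molecule names (`mol_A` to `mol_G`) and the helpers `tanimoto`, `distance_matrix`, and `ward_clusters` are hypothetical, distance is assumed to be 1 minus the Tanimoto coefficient, and Ward's criterion is applied to squared distances through the Lance-Williams update.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def distance_matrix(fps):
    """Pairwise distances, assumed here to be 1 - Tanimoto similarity."""
    return {(p, q): 1.0 - tanimoto(fps[p], fps[q]) for p in fps for q in fps}


def ward_clusters(names, dist, k):
    """Agglomerate with Ward's criterion (Lance-Williams update) until k clusters remain."""
    clusters = {i: [name] for i, name in enumerate(names)}
    # Ward's update is written in terms of squared distances.
    d2 = {frozenset((i, j)): dist[(names[i], names[j])] ** 2
          for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > max(k, 1):
        pair = min(d2, key=d2.get)          # closest pair of clusters
        i, j = tuple(pair)
        ni, nj = len(clusters[i]), len(clusters[j])
        dij = d2.pop(pair)
        merged = clusters.pop(i) + clusters.pop(j)
        for m, members in clusters.items():
            nm = len(members)
            dim = d2.pop(frozenset((i, m)))
            djm = d2.pop(frozenset((j, m)))
            d2[frozenset((next_id, m))] = (
                (ni + nm) * dim + (nj + nm) * djm - nm * dij
            ) / (ni + nj + nm)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(c) for c in clusters.values())


# Toy fingerprints (invented for illustration, not data from the article).
fingerprints = {
    "mol_A": {1, 2, 3, 4},
    "mol_B": {1, 2, 3, 5},
    "mol_C": {1, 2, 4, 5},
    "mol_D": {10, 11, 12},
    "mol_E": {10, 11, 13},
    "mol_F": {20, 21, 22, 23},
    "mol_G": {20, 21, 22, 24},
}

dist = distance_matrix(fingerprints)
clusters = ward_clusters(list(fingerprints), dist, k=3)

# Molecules that share a cluster with a chosen query molecule (here mol_A).
query = "mol_A"
neighbours = [m for c in clusters if query in c for m in c if m != query]
print(clusters)
print(neighbours)
```

In this toy setting, the molecules that land in the same cluster as the query play the role that the methotrexate cluster members play in the abstract: they are the ones that would be shortlisted for further testing.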
