Clustering files of chemical structures using the Székely-Rizzo generalization of Ward's method.

Author information

Varin Thibault, Bureau Ronan, Mueller Christoph, Willett Peter

Affiliation

Centre d'Etudes et de Recherche sur le Médicament de Normandie, UPRES EA4258, INC3M FR CNRS 3038, Université de Caen, Boulevard Becquerel, 14032 Caen Cedex, France.

Publication information

J Mol Graph Model. 2009 Sep;28(2):187-95. doi: 10.1016/j.jmgm.2009.06.006. Epub 2009 Jul 4.

Abstract

Ward's method is extensively used for clustering chemical structures represented by 2D fingerprints. This paper compares Ward clusterings of 14 datasets (containing between 278 and 4332 molecules) with those obtained using the Székely-Rizzo clustering method, a generalization of Ward's method. The clusters resulting from these two methods were evaluated by the extent to which the various classifications were able to group active molecules together, using a novel criterion of clustering effectiveness. Analysis of a total of 1400 classifications (Ward and Székely-Rizzo clustering methods, 14 different datasets, 5 different fingerprints and 10 different distance coefficients) demonstrated the general superiority of the Székely-Rizzo method. The distance coefficient first described by Soergel performed extremely well in these experiments, and this was also the case when it was used in simulated virtual screening experiments.
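As an illustration of the two quantities the abstract names, the following is a minimal sketch, not the authors' implementation. It assumes the standard definition of the Soergel distance, sum |x_i - y_i| / sum max(x_i, y_i), which for binary fingerprints equals 1 minus the Tanimoto coefficient, and the between-cluster e-distance of Székely and Rizzo with exponent alpha. The function names (`soergel`, `mean_pairwise`, `e_distance`) are hypothetical, and plugging a fingerprint distance such as Soergel into the e-distance is an assumption made here for illustration only.

```python
from itertools import product


def soergel(x, y):
    """Soergel distance between two non-negative vectors of equal length."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(max(a, b) for a, b in zip(x, y))
    # Two all-zero vectors are treated as identical (assumption).
    return num / den if den else 0.0


def mean_pairwise(A, B, dist, alpha):
    """Mean of dist(a, b) ** alpha over all pairs a in A, b in B."""
    total = sum(dist(a, b) ** alpha for a, b in product(A, B))
    return total / (len(A) * len(B))


def e_distance(A, B, dist=soergel, alpha=1.0):
    """Szekely-Rizzo e-distance between clusters A and B.

    With Euclidean distance and alpha = 2 this is proportional to
    Ward's minimum-variance merge criterion.
    """
    n1, n2 = len(A), len(B)
    between = mean_pairwise(A, B, dist, alpha)
    within_a = mean_pairwise(A, A, dist, alpha)
    within_b = mean_pairwise(B, B, dist, alpha)
    return (n1 * n2 / (n1 + n2)) * (2 * between - within_a - within_b)


if __name__ == "__main__":
    fp1 = [1, 1, 0, 1, 0]  # toy binary fingerprints, not real data
    fp2 = [1, 0, 0, 1, 1]
    print(soergel(fp1, fp2))
```

In an agglomerative procedure, the pair of clusters with the smallest e-distance would be merged at each step, in the same way that Ward's method merges the pair with the smallest increase in within-cluster variance.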

