Priestle John P
Center for Proteomic Chemistry, Novartis Institutes for Biomedical Research, 4002 Basel, Switzerland.
J Mol Model. 2009 May;15(5):551-60. doi: 10.1007/s00894-008-0360-6. Epub 2008 Dec 16.
This report describes a computer program for clustering docking poses based on their 3-dimensional (3D) coordinates as well as on their chemical structures. It is chiefly intended for reducing a set of hits coming from high-throughput docking, since the capacity to prepare and biologically test such molecules is generally far more limited than the capacity to generate such hits. The advantage of clustering molecules based on 3D, rather than 2D, criteria is that small variations on a scaffold may bring about different binding modes that would not be predicted by 2D similarity alone. The program does a pose-by-pose/atom-by-atom comparison of a set of docking hits (poses), scoring both spatial and chemical similarity. Using these pairwise similarities, the whole set is clustered based on a user-supplied similarity threshold. An output coordinate file is created that mirrors the input coordinate file but contains two new properties: a cluster number and the similarity to the cluster center. Poses in this output file can easily be sorted by cluster and displayed together for visual inspection with any standard molecular viewing program. The user can then decide which molecule to select for biological testing as the best representative of each group of similar molecules with similar binding modes.
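The workflow described above (pairwise pose comparison, threshold-based clustering, and two new per-pose properties) can be illustrated with a minimal Python sketch. The abstract does not give the scoring function or the clustering algorithm, so everything here is an assumption for illustration: chemical similarity is reduced to matching element symbols, spatial similarity to a linear falloff within a hypothetical `cutoff` distance, and clusters are grown greedily around the pose with the most neighbors above the threshold. The names `directed_score`, `similarity`, and `cluster_poses` are hypothetical and are not taken from the program.

```python
import math
from typing import List, Tuple

Atom = Tuple[str, float, float, float]  # (element, x, y, z); assumed representation
Pose = List[Atom]


def directed_score(a: Pose, b: Pose, cutoff: float = 2.0) -> float:
    """Average, over atoms of `a`, of the best match to a same-element atom of `b`.

    A match scores 1.0 at zero distance and falls linearly to 0.0 at `cutoff`.
    This scoring scheme is an assumption, not the published one.
    """
    total = 0.0
    for elem_a, xa, ya, za in a:
        best = 0.0
        for elem_b, xb, yb, zb in b:
            if elem_a != elem_b:  # crude stand-in for chemical similarity
                continue
            d = math.dist((xa, ya, za), (xb, yb, zb))
            if d < cutoff:
                best = max(best, 1.0 - d / cutoff)
        total += best
    return total / len(a)


def similarity(a: Pose, b: Pose, cutoff: float = 2.0) -> float:
    """Symmetric pose similarity in [0, 1]: the weaker of the two directed scores."""
    return min(directed_score(a, b, cutoff), directed_score(b, a, cutoff))


def cluster_poses(poses: List[Pose], threshold: float,
                  cutoff: float = 2.0) -> List[Tuple[int, float]]:
    """Return (cluster number, similarity to cluster center) for every pose.

    Greedy scheme (assumed): repeatedly pick the unassigned pose with the most
    unassigned neighbors at or above `threshold` as the next cluster center.
    """
    n = len(poses)
    sim = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = similarity(poses[i], poses[j], cutoff)

    unassigned = set(range(n))
    result: List[Tuple[int, float]] = [(0, 0.0)] * n
    cluster_id = 0
    while unassigned:
        # sorted() makes tie-breaking deterministic (lowest index wins).
        center = max(sorted(unassigned),
                     key=lambda i: sum(sim[i][j] >= threshold for j in unassigned))
        cluster_id += 1
        members = [j for j in unassigned if sim[center][j] >= threshold]
        for j in members:
            result[j] = (cluster_id, sim[center][j])
        unassigned -= set(members)
    return result
```

Calling `cluster_poses(poses, threshold)` returns one `(cluster number, similarity to center)` pair per input pose, in input order, which corresponds to the two properties the abstract says are added to the output coordinate file. Sorting the poses by the first element of each pair groups them for visual inspection.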