Laboratorio de Química Computacional y Teórica, Facultad de Química, Universidad de La Habana, 10400 La Habana, Cuba.
Centro de Bioinformática y Simulación Molecular, Facultad de Ingeniería en Bioinformática, Universidad de Talca, 2 Norte 685, Casilla 721, Talca, Chile.
J Chem Inf Model. 2020 Feb 24;60(2):467-472. doi: 10.1021/acs.jcim.9b00558. Epub 2019 Sep 27.
Clustering Molecular Dynamics trajectories is a common analysis that groups similar conformations together. Several algorithms have been designed and optimized to perform this routine task, and among them, Quality Threshold stands out as a very attractive option. This algorithm guarantees that, in the retrieved clusters, no pair of frames differs by more than a specified threshold, and hence, a set of strongly correlated frames is obtained for each cluster. In this work, it is shown that various commonly used software implementations are flawed because they confuse Quality Threshold with another, simpler, well-known clustering algorithm published by Daura et al. (Daura, X.; van Gunsteren, W. F.; Jaun, B.; Mark, A. E.; Gademann, K.; Seebach, D. Peptide Folding: When Simulation Meets Experiment. , (1/2), 236-240). Daura's algorithm does not impose any quality threshold on the frames contained in the retrieved clusters, so unrelated structural configurations can be grouped together. The advantages of using Quality Threshold whenever possible to explore Molecular Dynamics trajectories are exemplified. An in-house implementation of the original Quality Threshold algorithm has been developed to illustrate these points, and its code is freely available for further use by the scientific community.
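The difference between the two algorithms can be illustrated with a minimal sketch. This is not the authors' in-house implementation. It assumes a precomputed, symmetric matrix of pairwise distances between frames (for instance RMSD values, which is an assumption here), and the function names `daura_clusters`, `qt_clusters`, and `diameter` are hypothetical. The Daura-style routine only requires each member to lie within the cutoff of the cluster center, whereas the Quality Threshold-style routine only accepts a frame if it stays within the threshold of every frame already in the cluster.

```python
def diameter(dist, members):
    """Largest pairwise distance among the frames of one cluster."""
    return max((dist[i][j] for i in members for j in members), default=0.0)


def daura_clusters(dist, cutoff):
    """Daura-style clustering: members must be near the center only."""
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        # Center = remaining frame with the most neighbours within the cutoff
        # (ties resolved in favour of the lowest frame index).
        center = max(
            sorted(remaining),
            key=lambda i: sum(dist[i][j] <= cutoff for j in remaining),
        )
        members = sorted(j for j in remaining if dist[center][j] <= cutoff)
        clusters.append(members)
        remaining -= set(members)
    return clusters


def qt_clusters(dist, threshold):
    """Quality Threshold-style clustering: every pair must be within threshold."""
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        best = []
        for seed in sorted(remaining):
            candidate = [seed]
            pool = remaining - {seed}
            while pool:
                # Pick the frame that increases the cluster diameter the least.
                nxt = min(
                    sorted(pool),
                    key=lambda j: max(dist[j][k] for k in candidate),
                )
                # Stop growing once the diameter would exceed the threshold.
                if max(dist[nxt][k] for k in candidate) > threshold:
                    break
                candidate.append(nxt)
                pool.remove(nxt)
            if len(candidate) > len(best):
                best = candidate
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters


# Toy example: three "frames" placed on a line at positions 0, 1 and 2.
positions = [0.0, 1.0, 2.0]
toy_dist = [[abs(a - b) for b in positions] for a in positions]

daura_result = daura_clusters(toy_dist, cutoff=1.0)
qt_result = qt_clusters(toy_dist, threshold=1.0)
print("Daura-style:", daura_result)
print("QT-style:   ", qt_result)
```

In this toy case the Daura-style routine places all three frames in one cluster, although frames 0 and 2 are twice the cutoff apart, while the Quality Threshold-style routine keeps them in separate clusters. The quadratic candidate search is written for readability and would likely be too slow for long trajectories.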