Acheson Kyle, Kirrander Adam
EaStCHEM, School of Chemistry and Centre for Science at Extreme Conditions, University of Edinburgh, David Brewster Road, Edinburgh EH9 3FJ, U.K.
Department of Chemistry, University of Warwick, Coventry CV4 7AL, U.K.
J Chem Theory Comput. 2023 Sep 26;19(18):6126-6138. doi: 10.1021/acs.jctc.3c00776. Epub 2023 Sep 13.
We introduce automatic clustering as a computationally efficient tool for classifying and interpreting trajectories from simulations of photo-excited dynamics. Trajectories are treated as time-series data, with the features for clustering selected by variance mapping of normalized data. The L2-norm and dynamic time warping are proposed as suitable similarity measures for calculating the distance matrices, and these are clustered using the unsupervised density-based DBSCAN algorithm. The silhouette coefficient and the number of trajectories classified as noise are used as quality measures for the clustering. The ability of clustering to provide a rapid overview of large and complex trajectory data sets, and its utility for extracting chemical and physical insight, are demonstrated on trajectories for the photochemical ring-opening reaction of 1,3-cyclohexadiene. The clustering can also be used to generate reduced-dimensionality representations in an unbiased manner.
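The workflow summarized in the abstract (pairwise distances between trajectories, density-based clustering, then quality measures) can be pictured with the pure-Python sketch below. It is not the authors' implementation. The toy one-feature trajectories, the `eps` and `min_samples` values, and all function names are assumptions made for illustration, and the variance-based feature selection step is omitted. In practice a library implementation, such as scikit-learn's `DBSCAN` with `metric="precomputed"`, would likely be preferred.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length 1D time series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(a, b):
    """Classic dynamic time warping cost, using |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def distance_matrix(series, metric):
    """Symmetric pairwise distance matrix for a list of time series."""
    n = len(series)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = metric(series[i], series[j])
    return dist

def dbscan(dist, eps, min_samples):
    """DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    # Each neighbourhood includes the point itself (dist[i][i] == 0).
    neighbours = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels

def silhouette(dist, labels):
    """Mean silhouette coefficient over non-noise points (None if < 2 clusters)."""
    clusters = sorted(set(labels) - {-1})
    if len(clusters) < 2:
        return None
    members = {c: [i for i, lab in enumerate(labels) if lab == c] for c in clusters}
    scores = []
    for i, lab in enumerate(labels):
        if lab == -1:
            continue
        own = [j for j in members[lab] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: silhouette defined as 0
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(sum(dist[i][j] for j in members[c]) / len(members[c])
                for c in clusters if c != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: one normalized feature per trajectory, five time steps.
trajs = [
    [0.0, 0.1, 0.2, 0.3, 0.4],
    [0.0, 0.1, 0.25, 0.3, 0.45],
    [0.05, 0.1, 0.2, 0.35, 0.4],
    [1.0, 0.9, 0.8, 0.7, 0.6],
    [1.0, 0.95, 0.8, 0.65, 0.6],
    [0.95, 0.9, 0.85, 0.7, 0.6],
    [0.0, 1.0, 0.0, 1.0, 0.0],  # oscillating outlier
]

dist = distance_matrix(trajs, l2_distance)       # swap in dtw_distance to compare
labels = dbscan(dist, eps=0.2, min_samples=3)    # assumed, untuned parameters
n_noise = labels.count(-1)                       # quality measure 1: noise count
score = silhouette(dist, labels)                 # quality measure 2: silhouette
print(labels, n_noise, score)
```

Passing `dtw_distance` instead of `l2_distance` to `distance_matrix` changes only the similarity measure, so the two measures named in the abstract can be compared on identical data using the same noise count and silhouette score. A suitable `eps` would generally differ between the two measures.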