Santos Jherome Brylle Woody, Chen Lexin, Miranda-Quintana Ramón Alain
Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, USA.
bioRxiv. 2025 Jun 26:2025.06.20.660828. doi: 10.1101/2025.06.20.660828.
We present DIVIsive N-ary Ensembles (DIVINE), a deterministic, top-down clustering framework designed for molecular dynamics (MD) trajectories. DIVINE constructs a complete clustering hierarchy by recursively splitting clusters based on N-ary similarity principles, avoiding the need for O(N²) pairwise distance matrices. It supports multiple cluster selection criteria, including a weighted variance metric, and deterministic anchor initialization strategies such as NANI (N-ary Natural Initiation), ensuring reproducible and well-balanced partitions. In tests on trajectories up to a 305 μs folding simulation of the villin headpiece (HP35), DIVINE matched or exceeded the clustering quality of bisecting k-means while reducing runtime and eliminating stochastic variability. Because a single pass produces the full hierarchy, different clustering resolutions can be explored without rerunning the algorithm. By combining scalability, interpretability, and determinism, DIVINE offers a robust and practical alternative to conventional MD clustering methods. DIVINE is publicly available as part of the MDANCE package: https://github.com/mqcomplab/MDANCE.
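The top-down procedure described above can be illustrated with a minimal pure-Python sketch. It is not the MDANCE implementation or its API: every function name here is hypothetical, the farthest-point anchor rule is a simple deterministic stand-in for NANI, and "weighted variance" is assumed to mean cluster size times mean squared deviation from the centroid. The sketch shows three properties the abstract names: each split needs only per-cluster centroids (no pairwise distance matrix), the result is deterministic, and one run yields the partition at every resolution.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def centroid(points: List[Point]) -> Point:
    """Column-wise mean of a cluster, computed in one linear pass."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_variance(points: List[Point]) -> float:
    """Assumed selection score: size * mean squared deviation from the centroid.

    Only the centroid is needed, so the cost is linear in the cluster size.
    """
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def assign(points: List[Point], c0: Point, c1: Point):
    """Send each frame to the nearer of two centers (ties go to the first)."""
    left = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
    right = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
    return left, right


def split_in_two(points: List[Point], n_iter: int = 20):
    """Deterministic two-way split (hypothetical stand-in for NANI anchors)."""
    c = centroid(points)
    # Anchor 1: frame farthest from the centroid; anchor 2: farthest from anchor 1.
    a = max(points, key=lambda p: sq_dist(p, c))
    b = max(points, key=lambda p: sq_dist(p, a))
    left, right = assign(points, a, b)
    for _ in range(n_iter):
        new_left, new_right = assign(points, centroid(left), centroid(right))
        if not new_left or not new_right or new_left == left:
            break  # converged, or refuse to produce an empty cluster
        left, right = new_left, new_right
    return left, right


def divisive_hierarchy(points: List[Point], n_clusters: int):
    """Return the partition at every level, from 1 up to n_clusters clusters."""
    clusters = [list(points)]
    levels = [list(clusters)]
    while len(clusters) < n_clusters:
        # Select the cluster with the largest weighted variance to split next.
        scores = [weighted_variance(c) if len(c) > 1 else 0.0 for c in clusters]
        idx = max(range(len(clusters)), key=scores.__getitem__)
        if scores[idx] == 0.0:
            break  # nothing left that can be split
        left, right = split_in_two(clusters[idx])
        clusters = clusters[:idx] + [left, right] + clusters[idx + 1:]
        levels.append(list(clusters))
    return levels
```

With this layout, `levels[k - 1]` holds the partition into `k` clusters, so a coarser or finer resolution can be read off the same run rather than recomputed.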