Tovey Samuel, Zills Fabian, Torres-Herrador Francisco, Lohrmann Christoph, Brückner Marco, Holm Christian
Institute for Computational Physics, Universität Stuttgart, Stuttgart, Germany.
Aeronautics and Aerospace Department, von Karman Institute for Fluid Dynamics, Rhode-St-Genese, Belgium.
J Cheminform. 2023 Feb 11;15(1):19. doi: 10.1186/s13321-023-00687-y.
Particle-Based (PB) simulations, including Molecular Dynamics (MD), provide access to system observables that are not easily available experimentally. In most cases, however, PB data must be processed after a simulation to extract these observables. One of the main challenges in post-processing PB simulations is managing the large amounts of data typically generated without running into memory or computational capacity limitations. In this work, we introduce the post-processing tool MDSuite. This software, developed in Python, combines state-of-the-art computing technologies such as TensorFlow with modern data management tools such as HDF5 and SQL to form a fast, scalable, and accurate PB data processing engine. The package, built around the principles of FAIR data, provides a memory-safe, parallelized, and GPU-accelerated environment for the analysis of particle simulations. The software currently offers 17 calculators for the computation of properties including diffusion coefficients, thermal conductivity, viscosity, radial distribution functions, coordination numbers, and more. Further, the object-oriented framework allows for the rapid implementation of new calculators or file-readers for different simulation software. The Python front-end provides a familiar interface for many users in the scientific community and a gentle learning curve for inexperienced users. Future developments will include the introduction of more analyses associated with ab-initio methods and colloidal/macroscopic particle methods, as well as an extension to experimental data.
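The memory-safe processing described in the abstract can be illustrated with a minimal sketch. It is not MDSuite's API or implementation: the function names (`batched`, `mean_square_displacement`, `toy_trajectory`) and the toy data are hypothetical, and plain Python stands in for the HDF5 and TensorFlow machinery. The sketch only shows the general idea of streaming trajectory frames in fixed-size batches so that the full trajectory never has to be held in memory at once.

```python
from itertools import islice


def batched(frames, batch_size):
    """Yield lists of at most batch_size frames from any iterable."""
    iterator = iter(frames)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def mean_square_displacement(frames, batch_size=2):
    """Return the MSD of each frame relative to the first frame.

    frames: iterable of frames, each a list of (x, y, z) tuples.
    Only one batch of frames plus the reference frame is held in
    memory at any time.
    """
    reference = None
    msd = []
    for batch in batched(frames, batch_size):
        for frame in batch:
            if reference is None:
                reference = frame  # first frame defines the origin
            total = sum(
                (a - b) ** 2
                for pos, ref in zip(frame, reference)
                for a, b in zip(pos, ref)
            )
            msd.append(total / len(frame))  # average over particles
    return msd


def toy_trajectory(n_frames):
    """Hypothetical data: two particles moving at constant velocity."""
    for t in range(n_frames):
        yield [(1.0 * t, 0.0, 0.0), (0.0, 2.0 * t, 0.0)]


msd = mean_square_displacement(toy_trajectory(5), batch_size=2)
```

Because `toy_trajectory` is a generator, frames are produced lazily and consumed batch by batch. In a real workflow the generator would presumably be replaced by reads from an on-disk store, with the batch size chosen to fit the available memory.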