Pawan Panwar, Quanpeng Yang, Ashlie Martini
Department of Mechanical Engineering, University of California Merced, 5200 North Lake Road, Merced, CA, 95343, USA.
J Cheminform. 2023 Jul 28;15(1):69. doi: 10.1186/s13321-023-00737-5.
Molecular descriptors characterize the biological, physical, and chemical properties of molecules and have long been used to understand molecular interactions and to facilitate materials design. Some of the most robust descriptors, called 3-dimensional (3D) descriptors, are derived from geometrical representations of molecules. When calculated from molecular dynamics (MD) simulation trajectories, 3D descriptors can also capture the effects of operating conditions such as temperature or pressure. However, extracting 3D descriptors from MD trajectories is non-trivial, which hinders their wide use by researchers developing advanced quantitative structure-property relationship (QSPR) models using machine learning. Here, we describe PyL3dMD, a suite of open-source Python-based post-processing routines for calculating 3D descriptors from MD simulations. PyL3dMD is compatible with the popular simulation package LAMMPS and enables users to compute more than 2000 3D molecular descriptors from atomic trajectories generated by MD simulations. PyL3dMD is freely available via GitHub and can be easily installed and used as a highly flexible Python package on all major platforms (Windows, Linux, and macOS). In a performance benchmark study, descriptors calculated by PyL3dMD were used to develop a neural network, and the results showed that PyL3dMD is fast and efficient in calculating descriptors for large and complex molecular systems with long simulation durations. PyL3dMD facilitates the calculation of 3D molecular descriptors from MD simulations, making it a valuable tool for cheminformatics studies.
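To make the idea of a trajectory-derived 3D descriptor concrete, the following is a minimal sketch in plain Python of one simple geometrical descriptor, the mass-weighted radius of gyration, computed per frame and then averaged over a trajectory. It does not use or reproduce the PyL3dMD API: the function names `radius_of_gyration` and `time_averaged_descriptor`, the in-memory frame layout, and the toy diatomic data are all assumptions made for illustration. It also assumes the coordinates are already unwrapped across periodic boundaries.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one molecule in one frame.

    coords: list of (x, y, z) tuples, assumed unwrapped (hypothetical layout).
    masses: list of atomic masses in the same atom order.
    """
    total_mass = sum(masses)
    # Center of mass, computed one Cartesian component at a time.
    com = [
        sum(m * r[k] for m, r in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance of the atoms from the center of mass.
    msd = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    ) / total_mass
    return math.sqrt(msd)


def time_averaged_descriptor(frames, masses):
    """Average the per-frame descriptor over all frames of a trajectory."""
    values = [radius_of_gyration(frame, masses) for frame in frames]
    return sum(values) / len(values)


# Toy two-frame trajectory of an equal-mass diatomic molecule whose bond
# stretches from 2.0 to 4.0 length units (illustrative data, not from the paper).
frames = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
]
masses = [1.0, 1.0]

rg_per_frame = [radius_of_gyration(frame, masses) for frame in frames]
rg_mean = time_averaged_descriptor(frames, masses)
```

Because the descriptor is evaluated on every frame, its average and spread reflect the temperature and pressure at which the simulation was run, which is the property of trajectory-based descriptors that the abstract highlights.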