Nearl: extracting dynamic features from molecular dynamics trajectories for machine learning tasks.

Authors

Zhang Yang, Vitalis Andreas

Affiliation

Department of Biochemistry, University of Zurich, Zurich, 8057, Switzerland.

Publication

Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf321.

Abstract

SUMMARY

Despite the rapid growth of machine learning in biomolecular applications, information about protein dynamics is underutilized. Here, we introduce Nearl, an automated pipeline designed to extract dynamic features from large ensembles of molecular dynamics trajectories. Nearl aims to identify intrinsic patterns of molecular motion and to provide informative features for predictive modeling tasks. We implement two classes of dynamic features, termed marching observers and property-density flow, to capture local atomic motions while maintaining a view of the global configuration. Complemented by standard voxelization techniques, Nearl transforms substructures of proteins into three-dimensional (3D) grids, suitable for contemporary 3D convolutional neural networks (3D-CNNs). The pipeline leverages graphics processing unit (GPU) acceleration, adheres to the FAIR principles for research software, and prioritizes flexibility and user-friendliness, allowing customization of input formats and feature extraction.

AVAILABILITY AND IMPLEMENTATION

The source code of Nearl is hosted at https://github.com/miemiemmmm/Nearl and archived at https://doi.org/10.5281/zenodo.15320286. The documentation is hosted on ReadTheDocs at https://nearl.readthedocs.io/en/latest/. All pre-built models are implemented in PyTorch and available on GitHub.
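
To make the voxelization step mentioned in the summary more concrete, the following is a minimal sketch of how atoms in a box around a protein substructure could be mapped onto a 3D grid. It is a generic illustration, not Nearl's implementation or API: the function name `voxelize`, its parameters, and the Gaussian smoothing scheme are all assumptions chosen for clarity, and only the Python standard library is used.

```python
import math


def voxelize(coords, weights, center, box_length=16.0, n_bins=8, sigma=1.5):
    """Spread per-atom weights onto an n_bins^3 grid with Gaussian smoothing.

    Hypothetical helper for illustration only (not part of Nearl).
    coords:  list of (x, y, z) atom positions, e.g. in angstrom
    weights: one scalar property per atom (e.g. mass or partial charge)
    center:  (x, y, z) center of the cubic box
    Returns a nested list grid[i][j][k] of floats.
    """
    spacing = box_length / n_bins
    origin = [c - box_length / 2.0 for c in center]
    cutoff_sq = (3.0 * sigma) ** 2  # ignore contributions beyond 3 sigma
    grid = [[[0.0] * n_bins for _ in range(n_bins)] for _ in range(n_bins)]

    for (x, y, z), w in zip(coords, weights):
        for i in range(n_bins):
            dx = origin[0] + (i + 0.5) * spacing - x  # voxel center minus atom
            for j in range(n_bins):
                dy = origin[1] + (j + 0.5) * spacing - y
                for k in range(n_bins):
                    dz = origin[2] + (k + 0.5) * spacing - z
                    d_sq = dx * dx + dy * dy + dz * dz
                    if d_sq <= cutoff_sq:
                        grid[i][j][k] += w * math.exp(-d_sq / (2.0 * sigma ** 2))
    return grid


# Two made-up atoms weighted by approximate atomic mass (carbon, oxygen).
atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
masses = [12.0, 16.0]
grid = voxelize(atoms, masses, center=(0.0, 0.0, 0.0))
print(len(grid), len(grid[0]), len(grid[0][0]))
```

The resulting nested list has the shape of a single-channel 3D image, which is the kind of input a 3D-CNN expects. Applying such a function to every frame of a trajectory would yield a time series of grids; how Nearl derives its marching-observer and property-density-flow features from the trajectory is described in the paper and documentation, not in this sketch.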

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/646d/12233089/72c732142606/btaf321f1.jpg
