Nearl: extracting dynamic features from molecular dynamics trajectories for machine learning tasks.

Authors

Zhang Yang, Vitalis Andreas

Affiliation

Department of Biochemistry, University of Zurich, Zurich, 8057, Switzerland.

Publication information

Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf321.

DOI:10.1093/bioinformatics/btaf321
PMID:40439145
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12233089/
Abstract

SUMMARY

Despite the rapid growth of machine learning in biomolecular applications, information about protein dynamics is underutilized. Here, we introduce Nearl, an automated pipeline designed to extract dynamic features from large ensembles of molecular dynamics trajectories. Nearl aims to identify intrinsic patterns of molecular motion and to provide informative features for predictive modeling tasks. We implement two classes of dynamic features, termed marching observers and property-density flow, to capture local atomic motions while maintaining a view of the global configuration. Complemented by standard voxelization techniques, Nearl transforms substructures of proteins into three-dimensional (3D) grids, suitable for contemporary 3D convolutional neural networks (3D-CNNs). The pipeline leverages graphics processing unit (GPU) acceleration, adheres to the FAIR principles for research software, and prioritizes flexibility and user-friendliness, allowing customization of input formats and feature extraction.
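
The standard voxelization step mentioned above can be illustrated with a minimal sketch. This is not Nearl's API: the function names `voxelize` and `voxelize_trajectory`, the parameters `box_length` and `n_bins`, and the simple nearest-voxel binning are all assumptions made for illustration, and the marching-observer and property-density-flow features are not reproduced here because the abstract does not define them.

```python
import numpy as np


def voxelize(coords, weights, center, box_length=16.0, n_bins=16):
    """Bin per-atom weights into a cubic grid centered on `center`.

    Hypothetical helper, not part of Nearl. Atoms outside the box are ignored.
    """
    coords = np.asarray(coords, dtype=np.float64)    # shape (n_atoms, 3)
    weights = np.asarray(weights, dtype=np.float64)  # shape (n_atoms,)
    lower = np.asarray(center, dtype=np.float64) - box_length / 2.0
    voxel_size = box_length / n_bins

    # Integer voxel index of every atom along x, y, z.
    idx = np.floor((coords - lower) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < n_bins), axis=1)

    grid = np.zeros((n_bins, n_bins, n_bins), dtype=np.float64)
    # np.add.at accumulates correctly when several atoms share a voxel.
    np.add.at(grid, tuple(idx[inside].T), weights[inside])
    return grid


def voxelize_trajectory(frames, weights, center, **kwargs):
    """Voxelize each frame of a trajectory; returns (n_frames, n, n, n)."""
    return np.stack([voxelize(f, weights, center, **kwargs) for f in frames])


# Toy data: three atoms, the last one lies outside the 16 Å box.
frame = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [100.0, 0.0, 0.0]])
atom_weights = np.array([1.0, 2.0, 5.0])
grid = voxelize(frame, atom_weights, center=(0.0, 0.0, 0.0))
stack = voxelize_trajectory(np.stack([frame, frame + 0.5]), atom_weights,
                            center=(0.0, 0.0, 0.0))
```

To feed such a grid to a 3D-CNN in PyTorch, batch and channel axes would typically be added first (for example `grid[None, None]`, giving shape `(1, 1, 16, 16, 16)`), since `torch.nn.Conv3d` expects input laid out as `(N, C, D, H, W)`.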

AVAILABILITY AND IMPLEMENTATION

The source code of Nearl is hosted at https://github.com/miemiemmmm/Nearl and archived at https://doi.org/10.5281/zenodo.15320286. The documentation is hosted on ReadTheDocs at https://nearl.readthedocs.io/en/latest/. All pre-built models are implemented in PyTorch and available on GitHub.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/646d/12233089/72c732142606/btaf321f1.jpg
