ZMPY3D: accelerating protein structure volume analysis through vectorized 3D Zernike moments and Python-based GPU integration.

Authors

Lai Jhih-Siang, Burley Stephen K, Duarte Jose M

Affiliations

Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, United States.

Research Collaboratory for Structural Bioinformatics Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States.

Publication information

Bioinform Adv. 2024 Jul 25;4(1):vbae111. doi: 10.1093/bioadv/vbae111. eCollection 2024.

DOI:10.1093/bioadv/vbae111
PMID:39100546
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11297494/
Abstract

MOTIVATION

Volumetric 3D object analyses are being applied in research fields such as structural bioinformatics, biophysics, and structural biology, with potential integration of artificial intelligence/machine learning (AI/ML) techniques. One such method, 3D Zernike moments, has proven valuable in analyzing protein structures (e.g., protein fold classification, protein-protein interaction analysis, and molecular dynamics simulations). Their compactness and efficiency make them amenable to large-scale analyses. Established methods for deriving 3D Zernike moments, however, can be inefficient, particularly when higher order terms are required, hindering broader applications. As the volume of experimental and computationally-predicted protein structure information continues to increase, structural biology has become a "big data" science requiring more efficient analysis tools.

RESULTS

This application note presents a Python-based software package, ZMPY3D, that accelerates computation of 3D Zernike moments by vectorizing the mathematical formulae and using graphics processing units (GPUs). The package offers implementations built on the popular GPU-supported libraries CuPy and TensorFlow, together with a NumPy implementation, aiming to improve computational efficiency, adaptability, and flexibility in future algorithm development. The ZMPY3D package can be installed from PyPI, and the source code is available from GitHub. Functionality for computing volumetric protein 3D structural similarity scores and superposition transform matrices has been implemented, creating a powerful computational tool that will allow the research community to combine 3D Zernike moments with existing AI/ML tools and to advance research and education in protein structure bioinformatics.

AVAILABILITY AND IMPLEMENTATION

ZMPY3D, implemented in Python, is available on GitHub (https://github.com/tawssie/ZMPY3D) and PyPI, released under the GPL License.
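
The vectorization idea described under RESULTS can be illustrated with a small sketch. It is not taken from ZMPY3D and does not use its API. It computes geometric moments of a voxel grid, the quantities from which 3D Zernike moments are commonly derived as linear combinations, first with explicit loops and then as a single tensor contraction. The function names, the `xp` parameter, and the grid layout are all hypothetical. Passing CuPy as `xp` is an untested assumption that relies on CuPy mirroring the NumPy calls used here.

```python
import numpy as np


def geometric_moments_loop(voxels, coords, max_order):
    """Reference version: explicit loops over exponent triples and voxels."""
    n = max_order + 1
    moments = np.zeros((n, n, n))
    for p in range(n):
        for q in range(n - p):
            for r in range(n - p - q):
                total = 0.0
                for i, x in enumerate(coords):
                    for j, y in enumerate(coords):
                        for k, z in enumerate(coords):
                            total += voxels[i, j, k] * x**p * y**q * z**r
                moments[p, q, r] = total
    return moments


def geometric_moments_vectorized(voxels, coords, max_order, xp=np):
    """Vectorized version: one contraction replaces all nested loops.

    `xp` is a hypothetical hook for an array module with a NumPy-like
    interface (assumption: CuPy could be passed for GPU execution).
    """
    orders = xp.arange(max_order + 1)
    # powers[p, i] = coords[i] ** p, computed once and reused on every axis.
    powers = coords[None, :] ** orders[:, None]
    # M[p, q, r] = sum_ijk voxels[i, j, k] * x_i**p * y_j**q * z_k**r
    moments = xp.einsum("ijk,pi,qj,rk->pqr", voxels, powers, powers, powers)
    # Keep only terms with total order p + q + r <= max_order.
    p, q, r = xp.meshgrid(orders, orders, orders, indexing="ij")
    return xp.where(p + q + r <= max_order, moments, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = np.linspace(-1.0, 1.0, 6)   # voxel-centre coordinates per axis
    density = rng.random((6, 6, 6))    # stand-in for a voxelized structure
    fast = geometric_moments_vectorized(density, grid, max_order=3)
    slow = geometric_moments_loop(density, grid, max_order=3)
    print("versions agree:", np.allclose(fast, slow))
```

The loop version visits every voxel once per exponent triple in interpreted Python, while the vectorized version hands the whole computation to the array library. Writing formulae as array operations in this way is also what allows the same code to be moved to a GPU array backend.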

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8272/11297494/7a0e1b4c9576/vbae111f1.jpg
