HDBind: encoding of molecular structure with hyperdimensional binary representations.

Author information

Jones Derek, Zhang Xiaohua, Bennion Brian J, Pinge Sumukh, Xu Weihong, Kang Jaeyoung, Khaleghi Behnam, Moshiri Niema, Allen Jonathan E, Rosing Tajana S

Affiliations

Department of Computer Science and Engineering, University of California-San Diego, La Jolla, CA, USA.

Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, USA.

Publication information

Sci Rep. 2024 Nov 23;14(1):29025. doi: 10.1038/s41598-024-80009-w.

DOI: 10.1038/s41598-024-80009-w

PMID: 39578580

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11584749/

Abstract

Traditional methods for identifying "hit" molecules from a large collection of potential drug-like candidates rely on biophysical theory to compute approximations to the Gibbs free energy of the binding interaction between the drug and its protein target. These approaches have a significant limitation in that they require exceptional computing capabilities for even relatively small collections of molecules. Increasingly large and complex state-of-the-art deep learning approaches have gained popularity with the promise to improve the productivity of drug design, notorious for its numerous failures. However, as deep learning models increase in their size and complexity, their acceleration at the hardware level becomes more challenging. Hyperdimensional Computing (HDC) has recently gained attention in the computer hardware community due to its algorithmic simplicity relative to deep learning approaches. The HDC learning paradigm, which represents data with high-dimensional binary vectors, allows the use of low-precision binary vector arithmetic to create models of the data that can be learned without the need for the gradient-based optimization required in many conventional machine learning and deep learning methods. This algorithmic simplicity allows for acceleration in hardware that has been previously demonstrated in a range of application areas (computer vision, bioinformatics, mass spectrometry, remote sensing, edge devices, etc.). To the best of our knowledge, our work is the first to consider HDC for the task of fast and efficient screening of modern drug-like compound libraries. We also propose the first HDC graph-based encoding methods for molecular data, demonstrating consistent and substantial improvement over previous work. We compare our approaches to alternative approaches on the well-studied MoleculeNet dataset and the recently proposed LIT-PCBA dataset derived from high-quality PubChem assays. We demonstrate our methods on multiple target hardware platforms, including Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs), showing at least an order of magnitude improvement in energy efficiency versus even our smallest neural network baseline model with a single hidden layer. Our work thus motivates further investigation into molecular representation learning to develop ultra-efficient pre-screening tools. We make our code publicly available at https://github.com/LLNL/hdbind.
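Illustrative sketch (not from the article): the abstract describes HDC as learning with high-dimensional binary vectors and no gradient-based optimization. The Python below is a minimal, generic example of that idea under stated assumptions. It encodes the characters of a SMILES string as random binary hypervectors, marks position by cyclic rotation, bundles by bitwise majority vote, and classifies by Hamming similarity to per-class prototypes. It is not the graph-based encoding proposed in the paper, and it does not reproduce the code in the LLNL/hdbind repository. All names (`random_hv`, `bundle`, `encode`, `train`, `predict`), the dimensionality `D`, and the toy "chain" versus "ring" labels are hypothetical choices made for this example.

```python
import numpy as np

D = 10_000                      # hypervector dimensionality (illustrative choice)
rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def random_hv():
    """Draw a random dense binary hypervector in {0, 1}^D."""
    return rng.integers(0, 2, size=D, dtype=np.uint8)


TIE_BREAKER = random_hv()       # fixed vector that settles ties in even-sized bundles


def bundle(hvs):
    """Bundle hypervectors by bitwise majority vote."""
    stack = np.stack(hvs).astype(np.int32)
    if stack.shape[0] % 2 == 0:
        stack = np.vstack([stack, TIE_BREAKER])
    return (2 * stack.sum(axis=0) > stack.shape[0]).astype(np.uint8)


def encode(tokens, item_memory):
    """Rotate each token's hypervector by its position, then bundle the results."""
    hvs = []
    for position, token in enumerate(tokens):
        if token not in item_memory:
            item_memory[token] = random_hv()   # unseen tokens get a fresh random vector
        hvs.append(np.roll(item_memory[token], position))
    return bundle(hvs)


def hamming_similarity(a, b):
    """Fraction of bit positions on which two hypervectors agree."""
    return 1.0 - np.count_nonzero(a != b) / D


def train(examples, item_memory):
    """Single-pass training: bundle the encoded examples of each class into a prototype."""
    by_class = {}
    for tokens, label in examples:
        by_class.setdefault(label, []).append(encode(tokens, item_memory))
    return {label: bundle(hvs) for label, hvs in by_class.items()}


def predict(tokens, prototypes, item_memory):
    """Return the label of the prototype most similar to the encoded query."""
    query = encode(tokens, item_memory)
    return max(prototypes, key=lambda label: hamming_similarity(query, prototypes[label]))


# Toy data: SMILES strings labelled only by a structural trait, not by any assay result.
examples = [
    ("CCO", "chain"), ("CCCO", "chain"), ("CCCCO", "chain"),
    ("c1ccccc1", "ring"), ("c1ccccc1C", "ring"), ("c1ccncc1", "ring"),
]
item_memory = {}
prototypes = train(examples, item_memory)

for smiles in ("CCCCCO", "c1ccccc1N"):
    print(smiles, predict(smiles, prototypes, item_memory))
```

Training here is a single pass of bundling, with no gradients or iterative weight updates. When hypervectors are stored as packed bits, the Hamming comparison reduces to an XOR followed by a population count, which is the kind of low-precision binary arithmetic the abstract refers to.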

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/31a2/11584749/bcfef2a2ca25/41598_2024_80009_Figa_HTML.jpg
