Large-scale classification of metagenomic samples: a comparative analysis of classical machine learning techniques vs a novel brain-inspired hyperdimensional computing approach.

Authors

Joshi Jayadev, Cumbo Fabio, Blankenberg Daniel

Affiliations

Center for Computational Life Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, USA.

Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, USA.

Publication information

bioRxiv. 2025 Jul 7:2025.07.06.663394. doi: 10.1101/2025.07.06.663394.

DOI:10.1101/2025.07.06.663394
PMID:40672168
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12265723/
Abstract

Classical machine learning techniques have revolutionized bioinformatics, enabling researchers to extract knowledge from complex biological data. However, these techniques often struggle with high-dimensional data, where a growing number of features degrades performance and model accuracy. To address this problem, we explore hyperdimensional computing (HDC), an emerging brain-inspired computational paradigm that leverages high-dimensional vectors and simple arithmetic operations to represent and manipulate complex patterns, as an alternative approach to supervised machine learning. In this work, we present a comprehensive comparative analysis of HDC against established machine learning techniques across a range of classification tasks. As a representative use case, we focus on classifying heterogeneous metagenomic samples based on their quantitative microbial profiles, using publicly available microbiome datasets. Our results demonstrate that HDC achieves classification accuracy comparable, and in some cases superior, to that of classical methods. Furthermore, our findings highlight the potential of HDC for improved computational efficiency, particularly on large-scale datasets, suggesting the HDC-based classifier as a promising tool for bioinformatics research in areas characterized by high-dimensional data. We also offer a Galaxy-powered toolset for analyzing your own datasets, generating reproducible workflows, and adopting these methods in your own research with ease. Our investigation into the application of an HDC-based supervised machine learning technique for classifying microbial profiles in metagenomic samples yielded promising results, demonstrating the potential of this novel computational paradigm to complement and, in some cases, surpass the performance of well-established machine learning techniques.
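
To make the idea of "high-dimensional vectors and simple arithmetic operations" concrete, the following is a minimal sketch of a generic HDC classifier for abundance profiles. It is not the authors' implementation: the encoding scheme (random bipolar ID vectors bound to quantized level vectors, bundled by summation), the dimensionality `DIM`, the number of levels `LEVELS`, the function names, and the toy data are all assumptions chosen for illustration, and abundances are assumed to be scaled to [0, 1].

```python
import random

DIM = 2000    # hypervector dimensionality (illustrative assumption)
LEVELS = 10   # number of abundance quantization levels (assumption)


def random_hv(rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def level_hvs(rng):
    """Level hypervectors: adjacent levels differ in only a few positions."""
    order = list(range(DIM))
    rng.shuffle(order)
    step = DIM // (2 * (LEVELS - 1))  # positions flipped per level
    levels = [random_hv(rng)]
    for i in range(1, LEVELS):
        hv = list(levels[-1])
        for pos in order[(i - 1) * step:i * step]:
            hv[pos] = -hv[pos]
        levels.append(hv)
    return levels


def encode(profile, ids, levels):
    """Bind each feature ID to its level vector, bundle, then binarize."""
    acc = [0] * DIM
    for feat_id, value in zip(ids, profile):
        lvl = levels[min(int(value * LEVELS), LEVELS - 1)]
        for k in range(DIM):
            acc[k] += feat_id[k] * lvl[k]  # binding = elementwise product
    return [1 if x >= 0 else -1 for x in acc]


def train(samples, labels, ids, levels):
    """Class prototype = elementwise sum of that class's encoded samples."""
    protos = {}
    for profile, label in zip(samples, labels):
        hv = encode(profile, ids, levels)
        proto = protos.setdefault(label, [0] * DIM)
        for k in range(DIM):
            proto[k] += hv[k]
    return protos


def predict(profile, protos, ids, levels):
    """Return the label whose prototype is most cosine-similar to the query."""
    hv = encode(profile, ids, levels)

    def cosine(proto):
        dot = sum(a * b for a, b in zip(hv, proto))
        norm = sum(b * b for b in proto) ** 0.5
        return dot / norm if norm else 0.0

    return max(protos, key=lambda label: cosine(protos[label]))


# Toy microbial profiles: five hypothetical taxa, two classes.
rng = random.Random(0)
ids = [random_hv(rng) for _ in range(5)]
levels = level_hvs(rng)
train_x = [
    [0.9, 0.8, 0.1, 0.1, 0.2],
    [0.8, 0.9, 0.2, 0.1, 0.1],
    [0.1, 0.2, 0.9, 0.8, 0.1],
    [0.2, 0.1, 0.8, 0.9, 0.2],
]
train_y = ["A", "A", "B", "B"]
protos = train(train_x, train_y, ids, levels)
pred = predict([0.85, 0.9, 0.1, 0.15, 0.1], protos, ids, levels)
print(pred)
```

Training here amounts to one pass of additions per sample and prediction to one similarity comparison per class, which is the kind of simple arithmetic the abstract refers to. Whether this translates into the efficiency gains reported in the paper depends on the actual encoding and implementation used there.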

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5bfa/12265723/fcc445472d30/nihpp-2025.07.06.663394v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5bfa/12265723/3b7882b38771/nihpp-2025.07.06.663394v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5bfa/12265723/b2df47a04e6c/nihpp-2025.07.06.663394v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5bfa/12265723/c83c0ed30a5f/nihpp-2025.07.06.663394v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5bfa/12265723/3908eff8a3d7/nihpp-2025.07.06.663394v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5bfa/12265723/b49ec032cc52/nihpp-2025.07.06.663394v1-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5bfa/12265723/69d19fa25b1c/nihpp-2025.07.06.663394v1-f0007.jpg
