
Accelerating joint species distribution modelling with Hmsc-HPC by GPU porting.

Author affiliations

Department of Biological and Environmental Science, Faculty of Mathematics and Science, University of Jyväskylä, Jyväskylä, Finland.

Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland.

Publication information

PLoS Comput Biol. 2024 Sep 3;20(9):e1011914. doi: 10.1371/journal.pcbi.1011914. eCollection 2024 Sep.

DOI: 10.1371/journal.pcbi.1011914
PMID: 39226337
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11398642/
Abstract

Joint species distribution modelling (JSDM) is a widely used statistical method that analyzes combined patterns of all species in a community, linking empirical data to ecological theory and enhancing community-wide prediction tasks. However, fitting JSDMs to large datasets is often computationally demanding and time-consuming. Recent studies have introduced new statistical and machine learning techniques to provide more scalable fitting algorithms, but extending these to complex JSDM structures that account for spatial dependencies or multi-level sampling designs remains challenging. In this study, we aim to enhance JSDM scalability by leveraging high-performance computing (HPC) resources for an existing fitting method. Our work focuses on the Hmsc R-package, a widely used JSDM framework that supports the integration of various dataset types into a single comprehensive model. We developed a GPU-compatible implementation of its model-fitting algorithm using Python and the TensorFlow library. Despite these changes, our enhanced framework retains the original user interface of the Hmsc R-package. We evaluated the performance of the proposed implementation across various model configurations and dataset sizes. Our results show a significant increase in model fitting speed for most models compared to the baseline Hmsc R-package. For the largest datasets, we achieved speed-ups of over 1000 times, demonstrating the substantial potential of GPU porting for previously CPU-bound JSDM software. This advancement opens promising opportunities for better utilizing the rapidly accumulating new biodiversity data resources for inference and prediction.
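
The abstract does not spell out why the fitting algorithm benefits from a GPU. One plausible reason is that per-species parameter updates can be written as batched linear algebra instead of a loop over species. The following is a minimal sketch of that idea, not the Hmsc-HPC code: it assumes a toy Gaussian model with independent species, the function name `sample_beta_batched` and the parameters `sigma2` and `tau2` are hypothetical, and NumPy is swapped in for TensorFlow so the example runs without a GPU stack.

```python
import numpy as np

def sample_beta_batched(X, Y, sigma2, tau2, rng):
    """Draw regression coefficients for every species in one batched step.

    Toy model (an assumption for illustration only):
        Y[:, j] ~ N(X @ beta_j, sigma2[j] * I),  prior beta_j ~ N(0, tau2 * I).
    Returns an array of shape (n_cov, n_species).
    """
    n_cov = X.shape[1]
    n_species = Y.shape[1]
    XtX = X.T @ X  # (n_cov, n_cov), shared by all species
    XtY = X.T @ Y  # (n_cov, n_species)
    # One posterior precision matrix per species: (n_species, n_cov, n_cov).
    prec = XtX[None, :, :] / sigma2[:, None, None] + np.eye(n_cov)[None] / tau2
    # Batched Cholesky factorisation, prec = L @ L.T for each species.
    L = np.linalg.cholesky(prec)
    # Batched solve for the posterior means: prec @ mean = X.T @ y_j / sigma2[j].
    rhs = (XtY / sigma2).T[:, :, None]  # (n_species, n_cov, 1)
    mean = np.linalg.solve(prec, rhs)
    # Solving L.T @ e = z yields e with covariance inv(prec).
    z = rng.standard_normal((n_species, n_cov, 1))
    noise = np.linalg.solve(np.swapaxes(L, 1, 2), z)
    return (mean + noise)[:, :, 0].T

# Usage on simulated data: 2000 sites, 3 covariates, 4 species.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 3))
true_beta = rng.standard_normal((3, 4))
Y = X @ true_beta + 0.1 * rng.standard_normal((2000, 4))
beta_draw = sample_beta_batched(X, Y, np.full(4, 0.01), 100.0, rng)
```

Every operation acts on a stacked array covering all species at once, which is the kind of workload an array library can dispatch to a GPU. The models described in the abstract (spatial dependencies, multi-level sampling designs) involve considerably more structure than this sketch.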


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/076f/11398642/9241649fd7d4/pcbi.1011914.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/076f/11398642/27a0dbbdf6a1/pcbi.1011914.g001.jpg
