Spatial Multivariate Trees for Big Data Bayesian Regression.

Author information

Peruzzi Michele, Dunson David B

Affiliation

Department of Statistical Science, Duke University, Durham, NC 27708-0251, USA.

Publication information

J Mach Learn Res. 2022;23.

Abstract

High resolution geospatial data are challenging because standard geostatistical models based on Gaussian processes are known to not scale to large data sizes. While progress has been made towards methods that can be computed more efficiently, considerably less attention has been devoted to methods for large scale data that allow the description of complex relationships between several outcomes recorded at high resolutions by different sensors. Our Bayesian multivariate regression models based on spatial multivariate trees (SpamTrees) achieve scalability via conditional independence assumptions on latent random effects following a treed directed acyclic graph. Information-theoretic arguments and considerations on computational efficiency guide the construction of the tree and the related efficient sampling algorithms in imbalanced multivariate settings. In addition to simulated data examples, we illustrate SpamTrees using a large climate data set which combines satellite data with land-based station data. Software and source code are available on CRAN at https://CRAN.R-project.org/package=spamtree.
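As a rough illustration of the conditional independence idea mentioned in the abstract, the following sketch writes out the generic factorization of a Gaussian latent effect over a directed acyclic graph. The notation is assumed here for illustration and is not taken from the paper: the latent effects w are split into blocks w_1, ..., w_M attached to the nodes of the graph, pa(j) denotes the parent set of node j, and C denotes the covariance implied by the underlying Gaussian process.

```latex
% Assumed notation (not from the source): w_j are blocks of latent effects,
% pa(j) is the parent set of node j in the DAG, C_{a,b} = Cov(w_a, w_b).
p(w) = \prod_{j=1}^{M} p\bigl(w_j \mid w_{\mathrm{pa}(j)}\bigr),
\qquad
w_j \mid w_{\mathrm{pa}(j)} \sim \mathcal{N}\bigl(H_j\, w_{\mathrm{pa}(j)},\; R_j\bigr),

H_j = C_{j,\mathrm{pa}(j)}\, C_{\mathrm{pa}(j),\mathrm{pa}(j)}^{-1},
\qquad
R_j = C_{j,j} - H_j\, C_{\mathrm{pa}(j),j}.
```

Under this kind of factorization, each factor requires inverting only the covariance of one parent set rather than the full n by n covariance matrix, which is why small parent sets can make such models scale to large data sizes.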
