A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data.

Authors

Liang Faming, Kim Jinsu, Song Qifan

Affiliations

Professor, Department of Biostatistics, University of Florida, Gainesville, FL 32611.

Enterprise Solution Specialist, Big Data Analysis Consulting Team, LG CNS, Seoul, Republic of Korea.

Publication information

Technometrics. 2016;58(3):304-318. doi: 10.1080/00401706.2016.1142905. Epub 2016 Jul 8.

DOI:10.1080/00401706.2016.1142905
PMID:29033469
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5637557/
Abstract

Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their computer-intensive nature, which typically requires a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this paper, we propose the so-called bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for how to tame powerful MCMC methods to be used for big data analysis; that is, to replace the full data log-likelihood by a Monte Carlo average of the log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-combine method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation run. The BMH algorithm is very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. This is illustrated in the paper by the tempering BMH algorithm, which can be viewed as a combination of parallel tempering and the BMH algorithm. BMH can also be used for model selection and optimization by combining with reversible jump MCMC and simulated annealing, respectively.
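
The core idea in the abstract, replacing the full-data log-likelihood with an average of log-likelihoods computed on bootstrap samples, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The Gaussian mean model with unit variance, the flat prior, the random-walk proposal, the choice to redraw the subsamples at every iteration, and all names and parameter values (`bmh_sketch`, `k`, `m`, `step`) are assumptions made for illustration. The sketch also does not rescale the averaged log-likelihood, so the sampled distribution is wider than the full-data posterior; the abstract does not say how the paper handles this.

```python
import numpy as np

def gaussian_loglik(theta, x):
    """Log-likelihood of x under N(theta, 1), up to an additive constant."""
    return -0.5 * np.sum((x - theta) ** 2)

def bootstrap_avg_loglik(theta, subsamples):
    """Monte Carlo average of the log-likelihoods over the bootstrap subsamples.

    Each term is independent of the others, so in a real setting each could be
    evaluated on a separate worker; here they are evaluated serially.
    """
    return np.mean([gaussian_loglik(theta, s) for s in subsamples])

def bmh_sketch(data, n_iter=2000, k=20, m=200, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings on a bootstrap-averaged log-likelihood.

    Assumptions (not from the source): flat prior, k subsamples of size m drawn
    with replacement and redrawn at every iteration.
    """
    rng = np.random.default_rng(seed)
    theta = 0.0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        # Draw k bootstrap subsamples of size m; only k * m points are touched,
        # never the full dataset.
        idx = rng.integers(0, data.size, size=(k, m))
        subsamples = data[idx]
        proposal = theta + step * rng.normal()
        # Symmetric proposal and flat prior: the acceptance ratio depends only
        # on the difference of the averaged log-likelihoods.
        log_ratio = (bootstrap_avg_loglik(proposal, subsamples)
                     - bootstrap_avg_loglik(theta, subsamples))
        if np.log(rng.uniform()) < log_ratio:
            theta = proposal
        chain[t] = theta
    return chain

# Synthetic data with true mean 3.0 (hypothetical example data).
data = np.random.default_rng(1).normal(loc=3.0, scale=1.0, size=100_000)
chain = bmh_sketch(data)
posterior_mean = chain[500:].mean()  # discard the first 500 draws as burn-in
```

Each iteration evaluates the likelihood on only `k * m` = 4,000 of the 100,000 observations, and the `k` terms inside `bootstrap_avg_loglik` are the part that would be distributed across workers in a parallel setting.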
