Physics-Inspired Single-Particle Tracking Accelerated with Parallelism.

Author information

Xu Lance W Q, Pressé Steve

Affiliations

Center for Biological Physics, Arizona State University, Tempe, AZ, USA.

Department of Physics, Arizona State University, Tempe, AZ, USA.

Publication information

bioRxiv. 2025 Jun 3:2025.05.30.657103. doi: 10.1101/2025.05.30.657103.

Abstract

Data modeling tools face trade-offs between accuracy, computational efficiency, data efficiency, and model flexibility. Physics-inspired, rigorous likelihood-based approaches, while offering high accuracy and data efficiency, remain limited in practice due to high computational cost, particularly when applied to larger-scale problems. This general limitation is further compounded by reliance on traditionally single-threaded iterative sampling or optimization procedures, which are difficult to scale. Although prior efforts have attempted to parallelize expensive likelihood-based approaches by partitioning data or running multiple sampling replicas in parallel, such strategies fail for algorithms requiring efficient communication between processes. Here, we introduce a fundamentally different strategy: we exploit the parallelism inherent in both likelihood evaluation and posterior sampling, operating on a single shared dataset. Our framework supports frequent yet lightweight inter-thread and inter-processor communication, making it well-suited for modern parallel architectures. Using diffraction-limited single-particle fluorescence tracking as a case study, this approach achieves up to a 50-fold speedup on a single mid-range GPU compared to conventional single-threaded CPU implementations, demonstrating a scalable and efficient solution for high-performance likelihood-based inference.
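To make concrete the idea of parallelism inside a single likelihood evaluation, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian point spread function, Poisson photon noise, and one emitter per frame; the function names and the parameters `brightness`, `background`, and `sigma` are hypothetical. Each frame contributes an independent term to the log-likelihood, so the terms can be computed concurrently over one shared dataset and then summed in a single reduction.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def expected_counts(x, y, px, py, brightness, background, sigma):
    """Expected photons at pixel (px, py) for an emitter at (x, y).

    Assumes a Gaussian PSF on top of a flat background (illustrative only).
    """
    r2 = (px - x) ** 2 + (py - y) ** 2
    return background + brightness * math.exp(-r2 / (2.0 * sigma ** 2))


def frame_log_likelihood(frame, position,
                         brightness=50.0, background=2.0, sigma=1.3):
    """Poisson log-likelihood of one frame given the emitter position."""
    x, y = position
    total = 0.0
    for py, row in enumerate(frame):
        for px, n in enumerate(row):
            mu = expected_counts(x, y, px, py, brightness, background, sigma)
            # log Poisson(n | mu) = n*log(mu) - mu - log(n!)
            total += n * math.log(mu) - mu - math.lgamma(n + 1)
    return total


def log_likelihood_serial(frames, track):
    """Reference version: evaluate the per-frame terms one after another."""
    return sum(frame_log_likelihood(f, p) for f, p in zip(frames, track))


def log_likelihood_parallel(frames, track, workers=4):
    """Evaluate per-frame terms concurrently on shared data, then reduce.

    All workers read the same `frames` and `track` objects; nothing is
    partitioned into separate datasets. The only communication is the
    final sum over per-frame terms.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        terms = pool.map(frame_log_likelihood, frames, track)
        return sum(terms)


if __name__ == "__main__":
    # Tiny synthetic dataset: 6 frames of 4x4 pixels with made-up counts.
    frames = [[[(px + py + t) % 5 for px in range(4)] for py in range(4)]
              for t in range(6)]
    track = [(1.0 + 0.1 * t, 2.0) for t in range(6)]
    print(log_likelihood_serial(frames, track))
    print(log_likelihood_parallel(frames, track))
```

Python threads are used here only to show the structure of the computation; because of the interpreter lock they are not expected to give a real speedup. On a GPU, the same pattern would assign each pixel or frame term to its own thread and combine the results with a parallel reduction.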

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd96/12157436/492085ce19f7/nihpp-2025.05.30.657103v1-f0001.jpg
