Distance Histogram Computation Based on Spatiotemporal Uniformity in Scientific Data.

Authors

Kumar Anand, Grupcev Vladimir, Yuan Yongke, Tu Yi-Cheng, Shen Gang

Affiliations

Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave., ENB 118, Tampa, FL 33620, USA.

School of Economics and Management, Beijing University of Technology, 100 Pingleyuan, Chaoyang District, Beijing 100124, China.

Publication information

Adv Database Technol. 2012. doi: 10.1145/2247596.2247631.

DOI:10.1145/2247596.2247631
PMID:24378961
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3873006/
Abstract

The large volume of data generated by scientific applications poses challenges for storage and efficient query processing. Many queries against scientific data are analytical in nature and require super-linear computation time with straightforward methods. The spatial distance histogram (SDH) is one of the basic queries used to analyze molecular simulation (MS) data, and it takes quadratic time to compute with a brute-force approach. Often, an SDH query is executed continuously to analyze the simulation system over a period of time, which adds to the total time required to compute the SDH. In this paper, we propose an approximate algorithm that computes the SDH efficiently over consecutive time periods. In our approach, data is organized into a Quad-tree based data structure. The spatial locality of the particles (at a given time) in each node of the tree is acquired to determine the particle distribution. Similarly, the temporal locality of particles (between consecutive time periods) in each node is also acquired. The spatial distribution and temporal locality are then used to compute the approximate SDH at every time instant. Performance is boosted by storing and updating the spatial distribution information over time. The efficiency and accuracy of the proposed algorithm are supported by mathematical analysis and by the results of extensive experiments using biological data generated from real MS studies.
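
To make the quadratic baseline mentioned in the abstract concrete, the following is a minimal sketch of a brute-force SDH computation. It is not the authors' approximate algorithm: it only illustrates the straightforward method that visits every pair of particles. The function name `brute_force_sdh`, the parameters `bucket_width` and `num_buckets`, and the sample points are hypothetical and chosen for illustration.

```python
import math
from itertools import combinations


def brute_force_sdh(points, bucket_width, num_buckets):
    """Count all pairwise distances into fixed-width buckets.

    Illustrative baseline only: it visits all N*(N-1)/2 pairs, which is
    the quadratic cost the abstract refers to.
    """
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        # Assumption for this sketch: distances past the last bucket
        # edge are clamped into the last bucket.
        idx = min(int(d // bucket_width), num_buckets - 1)
        hist[idx] += 1
    return hist


# Hypothetical 2D particle positions.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
hist = brute_force_sdh(points, bucket_width=2.0, num_buckets=3)
print(hist)
```

With N particles the loop runs N(N-1)/2 times, so repeating it at every time step of a simulation multiplies an already quadratic cost. That is the motivation the abstract gives for an approximate, tree-based approach.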


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/244eb2357d04/nihms388543f11.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/5dae7193b560/nihms388543f12.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/226cb8c2e624/nihms388543f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/4def10950043/nihms388543f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/bb43276f068d/nihms388543f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/f809acce4497/nihms388543f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/54197e67cbc4/nihms388543f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/212b0419a5e8/nihms388543f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/2bcf0b0886cf/nihms388543f7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/a238a9980e3a/nihms388543f8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/7bc2c7de1e92/nihms388543f9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9461/3873006/25e95b0c9685/nihms388543f10.jpg

Similar articles

1
Distance Histogram Computation Based on Spatiotemporal Uniformity in Scientific Data.
Adv Database Technol. 2012. doi: 10.1145/2247596.2247631.
2
Computing Spatial Distance Histograms for Large Scientific Datasets On-the-Fly.
IEEE Trans Knowl Data Eng. 2014 Oct;26(10):2410-2424. doi: 10.1109/TKDE.2014.2298015.
3
Efficient SDH Computation In Molecular Simulations Data.
ACM BCB. 2012 Oct;2012:527-529. doi: 10.1145/2382936.2383010.
4
Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees.
IEEE Trans Knowl Data Eng. 2012 Sep 1;25(9):1982-1996. doi: 10.1109/TKDE.2012.149.
5
Performance analysis of a dual-tree algorithm for computing spatial distance histograms.
VLDB J. 2011 Aug 1;20(4):471-494. doi: 10.1007/s00778-010-0205-7.
6
Towards Building a High Performance Spatial Query System for Large Scale Medical Imaging Data.
Proc ACM SIGSPATIAL Int Conf Adv Inf. 2012 Nov 6;2012:309-318. doi: 10.1145/2424321.2424361.
7
DCMS: A data analytics and management system for molecular simulation.
J Big Data. 2015;2(1):9. doi: 10.1186/s40537-014-0009-5. Epub 2014 Nov 26.
8
An efficient Earth Mover's Distance algorithm for robust histogram comparison.
IEEE Trans Pattern Anal Mach Intell. 2007 May;29(5):840-53. doi: 10.1109/TPAMI.2007.1058.
9
An environmental monitoring system for managing spatiotemporal sensor data over sensor networks.
Sensors (Basel). 2012;12(4):3997-4015. doi: 10.3390/s120403997. Epub 2012 Mar 27.
10
Fast and exact fixed-radius neighbor search based on sorting.
PeerJ Comput Sci. 2024 Mar 29;10:e1929. doi: 10.7717/peerj-cs.1929. eCollection 2024.

Cited by

1
Concurrent query processing in a GPU-based database system.
PLoS One. 2019 Apr 16;14(4):e0214720. doi: 10.1371/journal.pone.0214720. eCollection 2019.
2
Efficient SDH Computation In Molecular Simulations Data.
ACM BCB. 2012 Oct;2012:527-529. doi: 10.1145/2382936.2383010.
3
Computing Spatial Distance Histograms for Large Scientific Datasets On-the-Fly.
IEEE Trans Knowl Data Eng. 2014 Oct;26(10):2410-2424. doi: 10.1109/TKDE.2014.2298015.
4
Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees.
IEEE Trans Knowl Data Eng. 2012 Sep 1;25(9):1982-1996. doi: 10.1109/TKDE.2012.149.
