• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

用于计算空间距离直方图的双树算法性能分析

Performance analysis of a dual-tree algorithm for computing spatial distance histograms.

作者信息

Chen Shaoping, Tu Yi-Cheng, Xia Yuni

机构信息

Department of Mathematics, Wuhan University of Technology, 122 Luosi Road, 430070 Wuhan, Hubei, People's Republic of China

出版信息

VLDB J. 2011 Aug 1;20(4):471-494. doi: 10.1007/s00778-010-0205-7.

DOI:10.1007/s00778-010-0205-7
PMID:21804753
原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC3145372/
Abstract

Many scientific and engineering fields produce large volume of spatiotemporal data. The storage, retrieval, and analysis of such data impose great challenges to database systems design. Analysis of scientific spatiotemporal data often involves computing functions of all point-to-point interactions. One such analytics, the Spatial Distance Histogram (SDH), is of vital importance to scientific discovery. Recently, algorithms for efficient SDH processing in large-scale scientific databases have been proposed. These algorithms adopt a recursive tree-traversing strategy to process point-to-point distances in the visited tree nodes in batches, thus require less time when compared to the brute-force approach where all pairwise distances have to be computed. Despite the promising experimental results, the complexity of such algorithms has not been thoroughly studied. In this paper, we present an analysis of such algorithms based on a geometric modeling approach. The main technique is to transform the analysis of point counts into a problem of quantifying the area of regions where pairwise distances can be processed in batches by the algorithm. From the analysis, we conclude that the number of pairwise distances that are left to be processed decreases exponentially with more levels of the tree visited. This leads to the proof of a time complexity lower than the quadratic time needed for a brute-force algorithm and builds the foundation for a constant-time approximate algorithm. Our model is also general in that it works for a wide range of point spatial distributions, histogram types, and space-partitioning options in building the tree.

摘要

许多科学和工程领域都会产生大量的时空数据。对此类数据的存储、检索和分析给数据库系统设计带来了巨大挑战。对科学时空数据的分析通常涉及计算所有点对点交互的函数。其中一种分析方法,即空间距离直方图(SDH),对科学发现至关重要。最近,已经提出了在大规模科学数据库中高效处理SDH的算法。这些算法采用递归树遍历策略,批量处理访问的树节点中的点对点距离,因此与必须计算所有成对距离的暴力方法相比,所需时间更少。尽管实验结果很有前景,但此类算法的复杂性尚未得到深入研究。在本文中,我们基于几何建模方法对此类算法进行了分析。主要技术是将点数分析转化为一个问题,即量化算法可以批量处理成对距离的区域面积。通过分析,我们得出结论,随着访问树的层数增加,待处理的成对距离数量呈指数下降。这导致了一个时间复杂度低于暴力算法所需二次时间的证明,并为常数时间近似算法奠定了基础。我们的模型也具有通用性,因为它适用于构建树时的广泛点空间分布、直方图类型和空间划分选项。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/ce79b9d72a0f/nihms-264574-f0021.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/f69f75c9b340/nihms-264574-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/f2396c07afb9/nihms-264574-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/d34f973e82d0/nihms-264574-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/586fd10200e3/nihms-264574-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/278115d294f6/nihms-264574-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/755a5718a812/nihms-264574-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/16a84ff6b36c/nihms-264574-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/5c25f3f90af6/nihms-264574-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/2d9db596df7d/nihms-264574-f0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/a573b19ced15/nihms-264574-f0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/c325724016aa/nihms-264574-f0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/c7160c9e86fd/nihms-264574-f0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/b4763988ead1/nihms-264574-f0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/65aacddc1de9/nihms-264574-f0014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/cb21d4696faa/nihms-264574-f0015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/037bd7c64624/nihms-264574-f0016.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/e70b34290b95/nihms-264574-f0017.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/491fa15bfd25/nihms-264574-f0018.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/566925cf89c0/nihms-264574-f0019.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/58e4cad4327b/nihms-264574-f0020.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/ce79b9d72a0f/nihms-264574-f0021.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/f69f75c9b340/nihms-264574-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/f2396c07afb9/nihms-264574-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/d34f973e82d0/nihms-264574-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/586fd10200e3/nihms-264574-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/278115d294f6/nihms-264574-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/755a5718a812/nihms-264574-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/16a84ff6b36c/nihms-264574-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/5c25f3f90af6/nihms-264574-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/2d9db596df7d/nihms-264574-f0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/a573b19ced15/nihms-264574-f0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/c325724016aa/nihms-264574-f0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/c7160c9e86fd/nihms-264574-f0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/b4763988ead1/nihms-264574-f0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/65aacddc1de9/nihms-264574-f0014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/cb21d4696faa/nihms-264574-f0015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/037bd7c64624/nihms-264574-f0016.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/e70b34290b95/nihms-264574-f0017.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/491fa15bfd25/nihms-264574-f0018.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/566925cf89c0/nihms-264574-f0019.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/58e4cad4327b/nihms-264574-f0020.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec32/3145372/ce79b9d72a0f/nihms-264574-f0021.jpg

相似文献

1
Performance analysis of a dual-tree algorithm for computing spatial distance histograms.用于计算空间距离直方图的双树算法性能分析
VLDB J. 2011 Aug 1;20(4):471-494. doi: 10.1007/s00778-010-0205-7.
2
Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees.具有精度保证的空间距离直方图计算近似算法。
IEEE Trans Knowl Data Eng. 2012 Sep 1;25(9):1982-1996. doi: 10.1109/TKDE.2012.149.
3
Computing Spatial Distance Histograms for Large Scientific Datasets On-the-Fly.动态计算大型科学数据集的空间距离直方图
IEEE Trans Knowl Data Eng. 2014 Oct;26(10):2410-2424. doi: 10.1109/TKDE.2014.2298015.
4
Distance Histogram Computation Based on Spatiotemporal Uniformity in Scientific Data.基于科学数据时空均匀性的距离直方图计算
Adv Database Technol. 2012. doi: 10.1145/2247596.2247631.
5
Efficient SDH Computation In Molecular Simulations Data.分子模拟数据中的高效SDH计算
ACM BCB. 2012 Oct;2012:527-529. doi: 10.1145/2382936.2383010.
6
Efficient local statistical analysis via integral histograms with discrete wavelet transform.通过积分直方图和离散小波变换实现高效的局部统计分析。
IEEE Trans Vis Comput Graph. 2013 Dec;19(12):2693-702. doi: 10.1109/TVCG.2013.152.
7
Macromolecular crowding: chemistry and physics meet biology (Ascona, Switzerland, 10-14 June 2012).大分子拥挤现象:化学与物理邂逅生物学(瑞士阿斯科纳,2012年6月10日至14日)
Phys Biol. 2013 Aug;10(4):040301. doi: 10.1088/1478-3975/10/4/040301. Epub 2013 Aug 2.
8
An efficient Earth Mover's Distance algorithm for robust histogram comparison.一种用于稳健直方图比较的高效推土机距离算法。
IEEE Trans Pattern Anal Mach Intell. 2007 May;29(5):840-53. doi: 10.1109/TPAMI.2007.1058.
9
Error Tree: A Tree Structure for Hamming and Edit Distances and Wildcards Matching.错误树:用于汉明距离、编辑距离和通配符匹配的树结构。
J Comput Biol. 2015 Dec;22(12):1118-28. doi: 10.1089/cmb.2015.0132. Epub 2015 Sep 24.
10
Protein β-sheet prediction using an efficient dynamic programming algorithm.使用高效动态规划算法进行蛋白质β折叠预测。
Comput Biol Chem. 2017 Oct;70:142-155. doi: 10.1016/j.compbiolchem.2017.08.011. Epub 2017 Aug 24.

引用本文的文献

1
Performance Modeling in CUDA Streams - A Means for High-Throughput Data Processing.CUDA流中的性能建模——一种实现高吞吐量数据处理的方法。
Proc IEEE Int Conf Big Data. 2014 Oct;2014:301-310. doi: 10.1109/BigData.2014.7004245.
2
Efficient SDH Computation In Molecular Simulations Data.分子模拟数据中的高效SDH计算
ACM BCB. 2012 Oct;2012:527-529. doi: 10.1145/2382936.2383010.
3
DCMS: A data analytics and management system for molecular simulation.DCMS:一种用于分子模拟的数据分析与管理系统。
J Big Data. 2015;2(1):9. doi: 10.1186/s40537-014-0009-5. Epub 2014 Nov 26.
4
Computing Spatial Distance Histograms for Large Scientific Datasets On-the-Fly.动态计算大型科学数据集的空间距离直方图
IEEE Trans Knowl Data Eng. 2014 Oct;26(10):2410-2424. doi: 10.1109/TKDE.2014.2298015.
5
Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees.具有精度保证的空间距离直方图计算近似算法。
IEEE Trans Knowl Data Eng. 2012 Sep 1;25(9):1982-1996. doi: 10.1109/TKDE.2012.149.
6
Distance Histogram Computation Based on Spatiotemporal Uniformity in Scientific Data.基于科学数据时空均匀性的距离直方图计算
Adv Database Technol. 2012. doi: 10.1145/2247596.2247631.

本文引用的文献

1
GROMACS 4:  Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation.GROMACS 4:高效、负载均衡和可扩展的分子模拟算法。
J Chem Theory Comput. 2008 Mar;4(3):435-47. doi: 10.1021/ct700301q.
2
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA.使用RNA关系数据库管理系统进行碱基相互作用预测的协变进化事件分析
Sci Stat Database Manag. 2009;5566:200-216. doi: 10.1007/978-3-642-02279-1_15.
3
Simulations of the formation, evolution and clustering of galaxies and quasars.星系和类星体的形成、演化及聚类模拟。
Nature. 2005 Jun 2;435(7042):629-36. doi: 10.1038/nature03597.
4
The role of declarative querying in bioinformatics.声明式查询在生物信息学中的作用。
OMICS. 2003 Spring;7(1):89-91. doi: 10.1089/153623103322006670.
5
Pathways database system: an integrated system for biological pathways.通路数据库系统:一个用于生物通路的集成系统。
Bioinformatics. 2003 May 22;19(8):930-7. doi: 10.1093/bioinformatics/btg113.
6
DSMM: a Database of Simulated Molecular Motions.DSMM:模拟分子运动数据库。
Nucleic Acids Res. 2003 Jan 1;31(1):456-7. doi: 10.1093/nar/gkg113.