Online Analytical Processing for Business Intelligence in Big Data.

Affiliations

Department of CSE, Institute of Technology, Nirma University, Ahmedabad, India.

Raksha Shakti University, Gandhinagar, India.

Publication information

Big Data. 2020 Dec;8(6):501-518. doi: 10.1089/big.2020.0045.

Abstract

Online analytical processing (OLAP) has been widely used in business intelligence for decades to serve multidimensional queries. In the current era of the internet, data generation rates have been rising exponentially. Internet of Things sensors and social media platforms are among the major contributors to this data boom. Storage and speed are the crucial parameters, and the most pressing issues, in efficient data handling. The key idea here is to address these two challenges of big data computing in OLAP. In this article, the authors propose and implement OLAP on Hadoop by Indexing (OOHI). OOHI offers a simplified multidimensional model that stores dimensions in the schema server and measures on the Hadoop cluster. The overall setup is divided into five modules: the data storage module (DSM), the dimension encoding module (DEM), the cube segmentation module, the segment selection module (SSM), and the block selection and process (BSAP) module. DSM applies serialization and deserialization to store and retrieve data with efficient space utilization. DEM adopts integer encoding of the dimension hierarchy to avoid the sparsity problem in multidimensional big data. SSM plays an important role in reducing the search space by narrowing the cube's chunks down to those the query touches. A MapReduce-based indexing approach and a series of seek operations in the BSAP module are integrated to achieve parallelism and fault tolerance. Real-time oceanography and supermarket data sets are used to demonstrate that the OOHI model is data independent. Various test cases are designed to cover the scope of each dimension and the volume of the data set. Comparative results and performance analysis indicate that OOHI outperforms Hadoop-based OLAP in data storage and in dice, slice, and roll-up operations.
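
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind integer-encoding a dimension hierarchy and storing only non-empty cells, followed by slice and roll-up over the encoded keys. It is not the OOHI implementation. The data, the city-to-state hierarchy, and the names `build_encoder`, `slice_cube`, and `roll_up` are all hypothetical, and plain in-memory dictionaries stand in for the schema server and the Hadoop cluster.

```python
from collections import defaultdict


def build_encoder(members):
    """Assign each distinct member a dense integer code in first-seen order."""
    codes = {}
    for member in members:
        codes.setdefault(member, len(codes))
    return codes


# Hypothetical toy fact rows: (city, product, sales).
rows = [
    ("Ahmedabad", "rice", 10.0),
    ("Gandhinagar", "rice", 5.0),
    ("Ahmedabad", "tea", 2.5),
    ("Mumbai", "tea", 4.0),
]
# Hypothetical two-level hierarchy for the location dimension.
city_to_state = {
    "Ahmedabad": "Gujarat",
    "Gandhinagar": "Gujarat",
    "Mumbai": "Maharashtra",
}

# Dimension side (assumed to live on a schema server): name -> integer code.
city_codes = build_encoder(r[0] for r in rows)
product_codes = build_encoder(r[1] for r in rows)
state_codes = build_encoder(city_to_state[c] for c in city_codes)
# Child code -> parent code, so roll-up never touches the string names.
city_parent = {city_codes[c]: state_codes[city_to_state[c]] for c in city_codes}

# Measure side (assumed to live on the cluster): only non-empty cells are
# stored, keyed by integer coordinates, which sidesteps a sparse dense array.
cells = defaultdict(float)
for city, product, sales in rows:
    cells[(city_codes[city], product_codes[product])] += sales


def slice_cube(cells, axis, code):
    """Keep the cells whose coordinate on `axis` equals `code`."""
    return {key: value for key, value in cells.items() if key[axis] == code}


def roll_up(cells, axis, parent):
    """Aggregate one axis to its parent level by summing the measure."""
    out = defaultdict(float)
    for key, value in cells.items():
        new_key = key[:axis] + (parent[key[axis]],) + key[axis + 1:]
        out[new_key] += value
    return dict(out)


tea_slice = slice_cube(cells, 1, product_codes["tea"])
by_state = roll_up(cells, 0, city_parent)
```

Here `tea_slice` holds only the cells for the product "tea", and `by_state` replaces each city code with its state code and sums the sales that collide on the same key. Keeping the name-to-code mapping separate from the measure store is one way to read the abstract's split between schema server and cluster; the actual OOHI encoding may differ.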
