H V Sreepathy, Rao B Dinesh, J Mohan Kumar, Rao B Deepak
Manipal School of Information Sciences, MAHE, Manipal, 576104, India.
MethodsX. 2023 Jun 22;11:102262. doi: 10.1016/j.mex.2023.102262. eCollection 2023 Dec.
Floods are the most common natural disaster in several countries around the world, and they have a major impact on people's lives and livelihoods. Developing effective flood forecasting and prediction models can mitigate the impact of flood disasters on human lives. Most flood prediction models, however, are designed without taking all flood-causing factors into account. Some of these flood-causing variables are difficult to collect and handle because they are heterogeneous in nature. This paper presents a new big data architecture called Data Lake, which can ingest and store all important heterogeneous flood-causing data sources in their raw format for machine learning model creation. Inferential statistical approaches are used to determine the statistical relevance of important flood-causing factors to the flood prediction outcome. The intended outcome of this research is flood warning systems that can alert the public and government officials so that they can make decisions in the event of a severe flood, reducing socioeconomic loss.
•Flood-causing factors come from heterogeneous sources, and there is no big data architecture for handling this variety of data sources.
•The method provides a data-lake-based architectural solution for collecting and analysing heterogeneous flood-causing factors.
•An inferential statistical approach is used to determine the importance of different flood-causing factors in the design of efficient flood prediction models.
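As a rough illustration of what "ingest and store in raw format" can mean in practice, the following minimal Python sketch copies a source file into a raw zone unchanged and records it in a catalog. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the function name `ingest_raw`, the `raw/<source>/<date>/` directory layout, and the `catalog.jsonl` file are all hypothetical.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def ingest_raw(lake_root: Path, source: str, src_file: Path) -> dict:
    """Copy one file into the lake's raw zone unchanged and catalog it."""
    now = datetime.now(timezone.utc)
    # Hypothetical layout: raw/<source>/<YYYY-MM-DD>/<original file name>
    target_dir = lake_root / "raw" / source / now.strftime("%Y-%m-%d")
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / src_file.name
    shutil.copy2(src_file, target)  # byte-for-byte copy, no parsing, no schema

    record = {
        "source": source,
        "path": str(target.relative_to(lake_root)),
        "format": src_file.suffix.lstrip(".").lower() or "unknown",
        "bytes": target.stat().st_size,
        "sha256": hashlib.sha256(target.read_bytes()).hexdigest(),
        "ingested_at": now.isoformat(),
    }
    # Append-only catalog (JSON Lines) so later steps can locate raw files.
    with (lake_root / "catalog.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Because the file is stored without parsing, the same function would accept a CSV of rain gauge readings, a satellite image, or a JSON sensor feed; interpreting each format is deferred to whatever later reads it for model creation.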
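The abstract does not name the inferential tests used, so the sketch below substitutes one common option purely as an example: a two-sided permutation test on whether a candidate factor (here, daily rainfall) differs between flood and non-flood days. The function name `permutation_p_value` is hypothetical and the rainfall values are made up for illustration; they are not data from the paper.

```python
import random
from statistics import mean


def permutation_p_value(flood, no_flood, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(flood) - mean(no_flood))
    pooled = list(flood) + list(no_flood)
    k = len(flood)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:k]) - mean(pooled[k:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Made-up daily rainfall values (mm), for illustration only.
rain_flood_days = [88.0, 102.5, 95.1, 110.3, 79.8, 120.4]
rain_dry_days = [12.2, 30.1, 8.4, 25.0, 18.7, 40.3, 5.5, 22.9]
p = permutation_p_value(rain_flood_days, rain_dry_days)
print(f"p-value: {p:.4f}")
```

A small p-value would suggest that the factor is statistically associated with flood occurrence and is worth including as a model input. Running the same test over several candidate factors would call for a multiple-comparison correction.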