Improved Transferability of Data-Driven Damage Models Through Sample Selection Bias Correction.

Authors

Wagenaar Dennis, Hermawan Tiaravanni, van den Homberg Marc J C, Aerts Jeroen C J H, Kreibich Heidi, de Moel Hans, Bouwer Laurens M

Affiliations

Deltares, Delft, The Netherlands.

Institute for Environmental Studies, VU University Amsterdam, The Netherlands.

Publication information

Risk Anal. 2021 Jan;41(1):37-55. doi: 10.1111/risa.13575. Epub 2020 Aug 24.

Abstract

Damage models for natural hazards are used for decision making on reducing and transferring risk. The damage estimates from these models depend on many variables and on their complex, sometimes nonlinear, relationships with the damage. In recent years, data-driven modeling techniques have been used to capture those relationships. The data available to build such models are often limited, so in practice it is usually necessary to transfer models to a different context. In this article, we show that, as a result, the samples used to build the model are often not fully representative of the situation to which the model is applied, which leads to a "sample selection bias." We enhance data-driven damage models by applying methods not previously used in damage modeling to correct for this bias before the machine learning (ML) models are trained. We demonstrate this with case studies on flooding in Europe and typhoon wind damage in the Philippines. Two sample selection bias correction methods from the ML literature are applied, and one of them is also adapted to our problem. These three methods are combined with stochastic generation of synthetic damage data. We demonstrate that, for both case studies, the sample selection bias correction techniques reduce model errors; for the mean bias error in particular, the reduction can be larger than 30%. The novel combination with stochastic data generation seems to enhance these techniques. This shows that sample selection bias correction methods are beneficial for damage model transfer.
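The abstract does not name the two correction methods it applies, so the following is only a minimal sketch of one generic idea from the ML literature: reweighting source samples by an estimated ratio of target to source density before training. It is not the authors' method. The histogram-based ratio estimate, the function names, the bin edges, and all depth and damage values are assumptions made up for illustration.

```python
from collections import Counter


def bin_index(x, edges):
    """Index of the bin [edges[i], edges[i+1]) that holds x; values outside the range go to the outer bins."""
    for i in range(len(edges) - 2):
        if x < edges[i + 1]:
            return i
    return len(edges) - 2


def importance_weights(source_x, target_x, edges):
    """Weight each source sample by (target share of its bin) / (source share of its bin)."""
    src_counts = Counter(bin_index(x, edges) for x in source_x)
    tgt_counts = Counter(bin_index(x, edges) for x in target_x)
    n_src, n_tgt = len(source_x), len(target_x)
    weights = []
    for x in source_x:
        b = bin_index(x, edges)
        p_src = src_counts[b] / n_src  # never zero: x itself falls in bin b
        p_tgt = tgt_counts[b] / n_tgt  # zero if the target has no samples in bin b
        weights.append(p_tgt / p_src)
    return weights


def weighted_mean(values, weights):
    """Mean of values under the given sample weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: water depth (m) and relative damage in the source region,
# and water depths only in the target region where the model will be applied.
source_depth = [0.2, 0.3, 0.4, 0.5, 1.5, 2.5]
source_damage = [0.1, 0.1, 0.1, 0.1, 0.4, 0.7]
target_depth = [0.3, 1.2, 1.6, 2.2, 2.7, 2.9]
edges = [0.0, 1.0, 2.0, 3.0]  # assumed bin edges

weights = importance_weights(source_depth, target_depth, edges)
naive = weighted_mean(source_damage, [1.0] * len(source_damage))
corrected = weighted_mean(source_damage, weights)
```

In this toy setup the source region is dominated by shallow flooding while the target region is mostly deep, so the reweighted mean damage is higher than the unweighted one. In a real workflow the same weights would typically be passed as sample weights when fitting the damage model, and a multivariate density ratio estimator would replace the one-dimensional histogram.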

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f82f/7891600/c1132374a967/RISA-41-37-g001.jpg
