Exploring effective ways to increase reliable positive samples for machine learning-based urban waterlogging susceptibility assessments.

Affiliations

Guangdong Province Key Laboratory for Land Use and Consolidation, South China Agricultural University, Guangzhou 510642, China; College of Natural Resources and Environment, Joint Institute for Environment & Education, South China Agricultural University, Guangzhou 510642, China.

College of Natural Resources and Environment, Joint Institute for Environment & Education, South China Agricultural University, Guangzhou 510642, China.

Publication information

J Environ Manage. 2023 Oct 15;344:118682. doi: 10.1016/j.jenvman.2023.118682. Epub 2023 Aug 9.

DOI:10.1016/j.jenvman.2023.118682
PMID:37567005
Abstract

Machine learning (ML)-based urban waterlogging susceptibility studies suffer from class imbalance, as fewer positive samples are generally available than potential negative samples. Few studies have considered optimizing the results by improving the quality of training samples. To address this issue, we explored effective approaches to reliably increase the number of positive samples for such studies. The Synthetic Minority Over-Sampling Technique (SMOTE) and the Optimized Seed Spread Algorithm (OSSA) were employed to increase the number of positive samples. They represent, respectively, oversampling approaches (synthesizing new samples based on the feature space) and physical approaches (simulating the potential inundated area based on the mechanisms of water flow). Waterlogging in Shenzhen was selected as a case study, using eight spatial variables. An elaborate experiment was conducted to compare the quality of the added samples based on the classifiers' performance and the accuracy of the waterlogging susceptibility maps (WSMs). The results indicated that (1) classifiers trained with SMOTE-generated samples performed worse than those trained on the original samples, while the use of OSSA improved the trained classifiers, and (2) the accuracy of the WSMs was not improved with SMOTE but increased markedly with OSSA. These results may be driven by the diversity of information and features of the added samples. This study indicates that SMOTE fails to synthesize reliable samples when applied to waterlogging analysis in Shenzhen, whereas an effective solution for generating reliable positive samples is to use OSSA, which simulates the potential submerged regions based on the mechanisms of disaster occurrence and spread.
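To make "synthesizing new samples based on the feature space" concrete, the following is a minimal sketch of the general SMOTE idea in NumPy. It is not the authors' implementation: the function name `smote_sketch`, the neighbor count `k`, and the random stand-in data are all assumptions for illustration, and only the eight-column shape echoes the eight spatial variables mentioned in the abstract. OSSA is not sketched here because the abstract does not describe its steps.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbors.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest neighbors of every sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbors for each new point.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position along the connecting segment.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Stand-in data: 20 positive (waterlogged) samples with 8 features each.
# The values are random placeholders, not data from the study.
X_pos = np.random.default_rng(42).random((20, 8))
synthetic = smote_sketch(X_pos, n_new=30)
print(synthetic.shape)
```

Because every synthetic point is an interpolation between two existing positives, it cannot leave the feature range those positives already span. That is one plausible reading of the abstract's remark that the results may depend on the diversity of information in the added samples, though the abstract itself does not spell out this mechanism.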


Similar articles

1
Exploring effective ways to increase reliable positive samples for machine learning-based urban waterlogging susceptibility assessments.
J Environ Manage. 2023 Oct 15;344:118682. doi: 10.1016/j.jenvman.2023.118682. Epub 2023 Aug 9.
2
A Synthetic Minority Oversampling Technique Based on Gaussian Mixture Model Filtering for Imbalanced Data Classification.
IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):3740-3753. doi: 10.1109/TNNLS.2022.3197156. Epub 2024 Feb 29.
3
Optimizing the Predictive Ability of Machine Learning Methods for Landslide Susceptibility Mapping Using SMOTE for Lishui City in Zhejiang Province, China.
Int J Environ Res Public Health. 2019 Jan 28;16(3):368. doi: 10.3390/ijerph16030368.
4
A novel method for detecting credit card fraud problems.
PLoS One. 2024 Mar 6;19(3):e0294537. doi: 10.1371/journal.pone.0294537. eCollection 2024.
5
Enhancing and improving the performance of imbalanced class data using novel GBO and SSG: A comparative analysis.
Neural Netw. 2024 May;173:106157. doi: 10.1016/j.neunet.2024.106157. Epub 2024 Feb 2.
6
Feature group partitioning: an approach for depression severity prediction with class balancing using machine learning algorithms.
BMC Med Res Methodol. 2024 Jun 3;24(1):123. doi: 10.1186/s12874-024-02249-8.
7
A self-inspected adaptive SMOTE algorithm (SASMOTE) for highly imbalanced data classification in healthcare.
BioData Min. 2023 Apr 25;16(1):15. doi: 10.1186/s13040-023-00330-4.
8
Effective treatment of imbalanced datasets in health care using modified SMOTE coupled with stacked deep learning algorithms.
Appl Nanosci. 2023;13(3):1829-1840. doi: 10.1007/s13204-021-02063-4. Epub 2022 Feb 3.
9
Comparing Sampling Strategies for Tackling Imbalanced Data in Human Activity Recognition.
Sensors (Basel). 2022 Feb 11;22(4):1373. doi: 10.3390/s22041373.
10
SMOTE for high-dimensional class-imbalanced data.
BMC Bioinformatics. 2013 Mar 22;14:106. doi: 10.1186/1471-2105-14-106.

Cited by

1
Rainstorm Disaster Risk Assessment and Influence Factors Analysis in the Yangtze River Delta, China.
Int J Environ Res Public Health. 2022 Aug 2;19(15):9497. doi: 10.3390/ijerph19159497.