Addressing data imbalance in collision risk prediction with active generative oversampling.

Authors

Li Li, Zhang Xiaoliang

Affiliation

Information Engineering School, Jiaozuo Normal College, Jiaozuo, 454000, China.

Publication

Sci Rep. 2025 Mar 17;15(1):9133. doi: 10.1038/s41598-025-93851-3.

DOI: 10.1038/s41598-025-93851-3
PMID: 40097620
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11914271/
Abstract

Data imbalance is a critical factor affecting the predictive accuracy in collision risk assessment. This study proposes an advanced active generative oversampling method based on Query by Committee (QBC) and Auxiliary Classifier Generative Adversarial Network (ACGAN), integrated with the Wasserstein Generative Adversarial Network (WGAN) framework. Our method selectively enriches minority class samples through QBC and diversity metrics to enhance the diversity of sample generation, thereby improving the performance of fault classification algorithms. By equating the labels of selected samples to those of real samples, we increase the accuracy of the discriminator, forcing the generator to produce more diverse outputs, which is expected to improve classification results. We also propose a method for dynamically adjusting the training epochs of the generator and discriminator based on loss differences to achieve balance in model training. Empirical analysis on four publicly available imbalanced datasets shows that our method outperforms existing methods in terms of precision, recall, F-measure, and G-mean. Specifically, our method's results are above 0.92 on all evaluation indicators, with an average improvement of 23-28.3% compared to the worst-performing ENN method. This indicates that our method has a significant advantage in handling data imbalance, being able to more accurately identify collision samples and reduce the misclassification rate of non-collision samples.
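
The four evaluation indicators named in the abstract can be computed from a binary confusion matrix. The sketch below is illustrative only: it assumes the collision class is the positive (minority) class and uses the common definition of G-mean as the geometric mean of sensitivity and specificity, which the abstract does not spell out. The function name `imbalance_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def imbalance_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure and G-mean for the positive (minority) class.

    tp, fp, fn, tn are confusion-matrix counts. The G-mean here is
    sqrt(sensitivity * specificity), an assumed (standard) definition.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    g_mean = math.sqrt(recall * specificity)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }


# Hypothetical counts for a 1000-sample test set with 50 collision samples.
m = imbalance_metrics(tp=45, fp=5, fn=5, tn=945)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Unlike plain accuracy, G-mean falls to zero when a classifier ignores the minority class entirely, which is why it is often reported alongside F-measure on imbalanced data.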

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/1cb0955fd1fb/41598_2025_93851_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/9b729e8c1718/41598_2025_93851_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/42a9bebed673/41598_2025_93851_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/708c0a503f66/41598_2025_93851_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/e92e79991bea/41598_2025_93851_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/cf5dfb3fb436/41598_2025_93851_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/bb800d369e22/41598_2025_93851_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/9f14aee86b28/41598_2025_93851_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08dd/11914271/ef973cd7b54a/41598_2025_93851_Fig9_HTML.jpg

Similar articles

1
Addressing data imbalance in collision risk prediction with active generative oversampling.
Sci Rep. 2025 Mar 17;15(1):9133. doi: 10.1038/s41598-025-93851-3.
2
A new imbalanced data oversampling method based on Bootstrap method and Wasserstein Generative Adversarial Network.
Math Biosci Eng. 2024 Feb 26;21(3):4309-4327. doi: 10.3934/mbe.2024190.
3
The Novel Sensor Network Structure for Classification Processing Based on the Machine Learning Method of the ACGAN.
Sensors (Basel). 2019 Jul 17;19(14):3145. doi: 10.3390/s19143145.
4
Sample Augmentation Using Enhanced Auxiliary Classifier Generative Adversarial Network by Transformer for Railway Freight Train Wheelset Bearing Fault Diagnosis.
Entropy (Basel). 2024 Dec 20;26(12):1113. doi: 10.3390/e26121113.
5
Cancer diagnosis using generative adversarial networks based on deep learning from imbalanced data.
Comput Biol Med. 2021 Aug;135:104540. doi: 10.1016/j.compbiomed.2021.104540. Epub 2021 Jun 12.
6
A medical image classification method based on self-regularized adversarial learning.
Med Phys. 2024 Nov;51(11):8232-8246. doi: 10.1002/mp.17320. Epub 2024 Jul 30.
7
Generative AI with WGAN-GP for boosting seizure detection accuracy.
Front Artif Intell. 2024 Oct 2;7:1437315. doi: 10.3389/frai.2024.1437315. eCollection 2024.
8
An Imbalanced Generative Adversarial Network-Based Approach for Network Intrusion Detection in an Imbalanced Dataset.
Sensors (Basel). 2023 Jan 3;23(1):550. doi: 10.3390/s23010550.
9
ACGAN for Addressing the Security Challenges in IoT-Based Healthcare System.
Sensors (Basel). 2024 Oct 13;24(20):6601. doi: 10.3390/s24206601.
10
VAE-WACGAN: An Improved Data Augmentation Method Based on VAEGAN for Intrusion Detection.
Sensors (Basel). 2024 Sep 18;24(18):6035. doi: 10.3390/s24186035.
