Prioritizing test cases for deep learning-based video classifiers.

Authors

Li Yinghua, Dang Xueqi, Ma Lei, Klein Jacques, Bissyandé Tegawendé F

Affiliations

SnT Centre, University of Luxembourg, Esch-sur-Alzette, Luxembourg.

The University of Tokyo, Tokyo, Japan.

Publication information

Empir Softw Eng. 2024;29(5):111. doi: 10.1007/s10664-024-10520-1. Epub 2024 Jul 22.

DOI:10.1007/s10664-024-10520-1
PMID:39247128
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11377581/
Abstract

The widespread adoption of video-based applications across various fields highlights their importance in modern software systems. However, in comparison to images or text, labeling video test cases for the purpose of assessing system accuracy can lead to increased expenses due to their temporal structure and larger volume. Test prioritization has emerged as a promising approach to mitigate the labeling cost: it prioritizes potentially misclassified test inputs so that such inputs can be identified earlier with limited time and manual labeling effort. However, applying existing prioritization techniques to video test cases faces a limitation: they do not account for the unique temporal information present in video data. Unlike static image datasets that only contain spatial information, video inputs consist of multiple frames that capture the dynamic changes of objects over time. In this paper, we propose VRank, the first test prioritization approach designed specifically for video test inputs. The fundamental idea behind VRank is that video-type tests with a higher probability of being misclassified by the evaluated DNN classifier are considered more likely to reveal faults and will be prioritized higher. To this end, we train a ranking model with the aim of predicting the probability of a given test input being misclassified by a DNN classifier. This prediction relies on four types of generated features: temporal features (TF), video embedding features (EF), prediction features (PF), and uncertainty features (UF). We rank all test inputs in the target test set based on their misclassification probabilities. Videos with a higher likelihood of being misclassified will be prioritized higher. We conducted an empirical evaluation to assess the performance of VRank, involving 120 subjects with both natural and noisy datasets. The experimental results reveal that VRank outperforms all compared test prioritization methods, with an average improvement of 5.76% to 46.51% on natural datasets and 4.26% to 53.56% on noisy datasets.
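
The final ranking step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `gini_uncertainty` and `prioritize`, the choice of Gini impurity as an uncertainty feature, and the sample softmax outputs are all assumptions. In VRank the score would come from a trained ranking model that combines temporal, embedding, prediction, and uncertainty features; here a single uncertainty value stands in for that model's output.

```python
from typing import List, Sequence


def gini_uncertainty(probs: Sequence[float]) -> float:
    """Gini impurity of a softmax vector: 0 for a one-hot prediction,
    larger when the probability mass is spread across classes."""
    return 1.0 - sum(p * p for p in probs)


def prioritize(scores: Sequence[float]) -> List[int]:
    """Return test indices ordered from highest to lowest score,
    i.e. from most to least likely to be misclassified."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical softmax outputs of a video classifier for three test videos.
softmax_outputs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # highly uncertain prediction
    [0.70, 0.20, 0.10],  # moderately uncertain prediction
]

# Stand-in for the trained ranking model (assumption): the score is a
# single uncertainty feature rather than a learned probability.
scores = [gini_uncertainty(p) for p in softmax_outputs]
ranking = prioritize(scores)
print(ranking)
```

Labeling the videos in the order given by `ranking` means the inputs the classifier is least sure about are inspected first, which is the cost-saving effect that test prioritization aims for.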


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0434/11377581/91c0697b6cd1/10664_2024_10520_Fig1_HTML.jpg
