

Constraints on Optimising Encoder-Only Transformers for Modelling Sign Language with Human Pose Estimation Keypoint Data

Authors

Woods Luke T, Rana Zeeshan A

Affiliations

Digital Aviation Research and Technology Centre (DARTeC), Cranfield University, Cranfield, Bedfordshire MK43 0AL, UK.

Leidos Industrial Engineers Limited, Unit 3, Bedford Link Logistics Park, Bell Farm Way, Kempston, Bedfordshire MK43 9SS, UK.

Publication

J Imaging. 2023 Nov 2;9(11):238. doi: 10.3390/jimaging9110238.

DOI: 10.3390/jimaging9110238
PMID: 37998085
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10672608/
Abstract

Supervised deep learning models can be optimised by applying regularisation techniques to reduce overfitting, which can prove difficult when fine tuning the associated hyperparameters. Not all hyperparameters are equal, and understanding the effect each hyperparameter and regularisation technique has on the performance of a given model is of paramount importance in research. We present the first comprehensive, large-scale ablation study for an encoder-only transformer to model sign language using the improved Word-level American Sign Language dataset (WLASL-alt) and human pose estimation keypoint data, with a view to put constraints on the potential to optimise the task. We measure the impact a range of model parameter regularisation and data augmentation techniques have on sign classification accuracy. We demonstrate that within the quoted uncertainties, other than ℓ2 parameter regularisation, none of the regularisation techniques we employ have an appreciable positive impact on performance, which we find to be in contradiction to results reported by other similar, albeit smaller scale, studies. We also demonstrate that the model architecture is bounded by the small dataset size for this task over finding an appropriate set of model parameter regularisation and common or basic dataset augmentation techniques. Furthermore, using the base model configuration, we report a new maximum top-1 classification accuracy of 84% on 100 signs, thereby improving on the previous benchmark result for this model architecture and dataset.
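The study's central finding is that, of all the techniques ablated, only ℓ2 parameter regularisation appreciably improved performance. As a minimal, hypothetical sketch (plain NumPy on a toy linear model, not the authors' transformer or code; all names and data below are illustrative), this is how an ℓ2 penalty term enters the gradient and shrinks the parameter norm:

```python
import numpy as np

# Toy stand-in for pooled keypoint features and a linear head.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))
true_w = rng.normal(size=10)
y = X @ true_w + rng.normal(scale=0.1, size=64)

def fit(lam, steps=500, lr=0.01):
    """Gradient descent on MSE + lam * ||w||^2 (the l2 penalty)."""
    w = np.zeros(10)
    for _ in range(steps):
        # The "+ 2 * lam * w" term is the gradient of the l2 penalty;
        # it pulls every parameter toward zero each step (weight decay).
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * lam * w
        w -= lr * grad
    return w

w_plain = fit(lam=0.0)  # unregularised fit
w_l2 = fit(lam=1.0)     # l2-regularised fit

# The penalty trades a little data fit for smaller parameters.
print(np.linalg.norm(w_l2) < np.linalg.norm(w_plain))  # → True
```

In transformer training frameworks the same effect is usually obtained through an optimiser's weight-decay setting rather than an explicit loss term; the mechanism (shrinking parameters toward zero to curb overfitting) is identical.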


Figures (g001–g017, via PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/c06d8839daee/jimaging-09-00238-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/f2579f10e744/jimaging-09-00238-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/52f531f4f2a6/jimaging-09-00238-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/fad31a03a124/jimaging-09-00238-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/0f9f51139c56/jimaging-09-00238-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/a38fd07581c8/jimaging-09-00238-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/c9e2eccafc53/jimaging-09-00238-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/5edfbf9c573f/jimaging-09-00238-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/e61224fc4f2c/jimaging-09-00238-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/6ebdc9a5c9d6/jimaging-09-00238-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/8da2b32c9777/jimaging-09-00238-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/3804d2ef5978/jimaging-09-00238-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/5b473560c943/jimaging-09-00238-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/0264ee9f1dfa/jimaging-09-00238-g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/a3b3736fab28/jimaging-09-00238-g015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/8a04d3e492be/jimaging-09-00238-g016.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/663b/10672608/c897508f6902/jimaging-09-00238-g017.jpg

Similar Articles

1
Constraints on Optimising Encoder-Only Transformers for Modelling Sign Language with Human Pose Estimation Keypoint Data.
J Imaging. 2023 Nov 2;9(11):238. doi: 10.3390/jimaging9110238.
2
Sign2Pose: A Pose-Based Approach for Gloss Prediction Using a Transformer Model.
Sensors (Basel). 2023 Mar 6;23(5):2853. doi: 10.3390/s23052853.
3
BertSRC: transformer-based semantic relation classification.
BMC Med Inform Decis Mak. 2022 Sep 6;22(1):234. doi: 10.1186/s12911-022-01977-5.
4
Transformers-sklearn: a toolkit for medical language understanding with transformer-based models.
BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):90. doi: 10.1186/s12911-021-01459-0.
5
Cofopose: Conditional 2D Pose Estimation with Transformers.
Sensors (Basel). 2022 Sep 9;22(18):6821. doi: 10.3390/s22186821.
6
Automated sign language detection and classification using reptile search algorithm with hybrid deep learning.
Heliyon. 2023 Dec 8;10(1):e23252. doi: 10.1016/j.heliyon.2023.e23252. eCollection 2024 Jan 15.
7
ViTPose++: Vision Transformer for Generic Body Pose Estimation.
IEEE Trans Pattern Anal Mach Intell. 2024 Feb;46(2):1212-1230. doi: 10.1109/TPAMI.2023.3330016. Epub 2024 Jan 8.
8
SignNet II: A Transformer-Based Two-Way Sign Language Translation Model.
IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):12896-12907. doi: 10.1109/TPAMI.2022.3232389. Epub 2023 Oct 3.
9
Single-view multi-human pose estimation by attentive cross-dimension matching.
Front Neurosci. 2023 Jul 19;17:1201088. doi: 10.3389/fnins.2023.1201088. eCollection 2023.
10
Multi-speed transformer network for neurodegenerative disease assessment and activity recognition.
Comput Methods Programs Biomed. 2023 Mar;230:107344. doi: 10.1016/j.cmpb.2023.107344. Epub 2023 Jan 9.

Cited By

1
SSTA-ResT: Soft Spatiotemporal Attention ResNet Transformer for Argentine Sign Language Recognition.
Sensors (Basel). 2025 Sep 5;25(17):5543. doi: 10.3390/s25175543.
2
Toward a Recognition System for Mexican Sign Language: Arm Movement Detection.
Sensors (Basel). 2025 Jun 10;25(12):3636. doi: 10.3390/s25123636.
3
Enhancing Aircraft Safety through Advanced Engine Health Monitoring with Long Short-Term Memory.

References

1
Sign2Pose: A Pose-Based Approach for Gloss Prediction Using a Transformer Model.
Sensors (Basel). 2023 Mar 6;23(5):2853. doi: 10.3390/s23052853.
2
Sign and Human Action Detection Using Deep Learning.
J Imaging. 2022 Jul 11;8(7):192. doi: 10.3390/jimaging8070192.
3
A comprehensive survey of recent trends in deep learning for digital images augmentation.
Artif Intell Rev. 2022;55(3):2351-2377. doi: 10.1007/s10462-021-10066-4. Epub 2021 Sep 4.
4
Text Data Augmentation for Deep Learning.
J Big Data. 2021;8(1):101. doi: 10.1186/s40537-021-00492-0. Epub 2021 Jul 19.
5
An empirical survey of data augmentation for time series classification with neural networks.
PLoS One. 2021 Jul 15;16(7):e0254841. doi: 10.1371/journal.pone.0254841. eCollection 2021.
6
A review of medical image data augmentation techniques for deep learning applications.
J Med Imaging Radiat Oncol. 2021 Aug;65(5):545-563. doi: 10.1111/1754-9485.13261. Epub 2021 Jun 19.
7
L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks.
IEEE Trans Neural Netw Learn Syst. 2019 Jul;30(7):2043-2051. doi: 10.1109/TNNLS.2018.2876179. Epub 2018 Nov 9.
8
Deep learning.
Nature. 2015 May 28;521(7553):436-44. doi: 10.1038/nature14539.
9
Deep learning in neural networks: an overview.
Neural Netw. 2015 Jan;61:85-117. doi: 10.1016/j.neunet.2014.09.003. Epub 2014 Oct 13.