
BAGLS, a multihospital Benchmark for Automatic Glottis Segmentation.

Affiliations

Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Waldstraße 1, 91054, Erlangen, Germany.

Department of Head and Neck Surgery, David Geffen School of Medicine at the University of California, Los Angeles, Los Angeles, California, USA.

Publication information

Sci Data. 2020 Jun 19;7(1):186. doi: 10.1038/s41597-020-0526-3.

DOI: 10.1038/s41597-020-0526-3
PMID: 32561845
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7305104/

Abstract

Laryngeal videoendoscopy is one of the main tools in clinical examinations for voice disorders and in voice research. High-speed videoendoscopy makes it possible to fully capture the vocal fold oscillations; however, processing the recordings typically involves a time-consuming segmentation of the glottal area by trained experts. Even though automatic methods have been proposed and the task is particularly suited to deep learning, there are no public datasets or benchmarks available to compare methods and to allow training of deep learning models that generalize. In an international collaboration of researchers from seven institutions in the EU and USA, we have created BAGLS, a large, multihospital dataset of 59,250 high-speed videoendoscopy frames with individually annotated segmentation masks. The frames are based on 640 recordings of healthy and disordered subjects that were recorded with varying technical equipment by numerous clinicians. The BAGLS dataset will allow an objective comparison of glottis segmentation methods and will enable interested researchers to train their own models and compare their methods.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b6f/7305104/dd8a4ab8ac4c/41597_2020_526_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b6f/7305104/c0afe20c1aa6/41597_2020_526_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b6f/7305104/b956dba535bb/41597_2020_526_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b6f/7305104/a5988513bc50/41597_2020_526_Fig4_HTML.jpg
