Detecting Screams From Home Audio Recordings to Identify Tantrums: Exploratory Study Using Transfer Machine Learning.

Authors

O'Donovan Rebecca, Sezgin Emre, Bambach Sven, Butter Eric, Lin Simon

Affiliations

The Abigail Wexner Research Institute, Nationwide Children's Hospital, Columbus, OH, United States.

Department of Psychology, Nationwide Children's Hospital, Columbus, OH, United States.

Publication

JMIR Form Res. 2020 Jun 16;4(6):e18279. doi: 10.2196/18279.

DOI: 10.2196/18279
PMID: 32459656
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7327591/

Abstract

BACKGROUND

Qualitative self- or parent-reports used in assessing children's behavioral disorders are often inconvenient to collect and can be misleading due to missing information, rater biases, and limited validity. A data-driven approach to quantify behavioral disorders could alleviate these concerns. This study proposes a machine learning approach to identify screams in voice recordings that avoids the need to gather large amounts of clinical data for model training.

OBJECTIVE

The goal of this study is to evaluate whether a machine learning model trained only on publicly available audio data sets can detect screaming sounds in audio streams captured in an at-home setting.

METHODS

Two sets of audio samples were prepared to evaluate the model: a subset of the publicly available AudioSet data set and a set of audio data extracted from the TV show Supernanny, which was chosen for its similarity to clinical data. Scream events were manually annotated for the Supernanny data, and existing annotations were refined for the AudioSet data. Audio feature extraction was performed with a convolutional neural network pretrained on AudioSet. A gradient-boosted tree model was trained and cross-validated for scream classification on the AudioSet data and then validated independently on the Supernanny audio.
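The train-then-validate design described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn's `GradientBoostingClassifier` as the gradient-boosted tree model, and it replaces the pretrained CNN embeddings with synthetic 128-dimensional vectors (the dimension, sample counts, and injected class signal are all hypothetical).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)

# Stand-in for per-clip embeddings from a pretrained audio CNN.
# Real features would come from the network; these are synthetic.
n_clips, n_dims = 400, 128
y_public = rng.integers(0, 2, size=n_clips)        # 1 = scream, 0 = other
X_public = rng.normal(size=(n_clips, n_dims))
X_public[:, :5] += y_public[:, None]               # inject a weak class signal

model = GradientBoostingClassifier(n_estimators=50, random_state=0)

# Cross-validation on the public data: each clip is scored by a model
# that did not see it during training.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_predict(
    model, X_public, y_public, cv=cv, method="predict_proba"
)[:, 1]
cv_auc = roc_auc_score(y_public, cv_scores)

# Independent validation: fit on all public data, then score a separate,
# imbalanced set standing in for the home-like recordings.
model.fit(X_public, y_public)
y_home = (rng.random(1000) < 0.05).astype(int)
X_home = rng.normal(size=(1000, n_dims))
X_home[:, :5] += y_home[:, None]
home_auc = roc_auc_score(y_home, model.predict_proba(X_home)[:, 1])

print(f"cross-validated AUC: {cv_auc:.2f}, independent AUC: {home_auc:.2f}")
```

The point of the second step is that the independent set is never used for fitting or tuning, so its score reflects how well the model transfers to a different recording source.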

RESULTS

On the held-out AudioSet clips, the model achieved an area under the receiver operating characteristic curve (ROC-AUC) of 0.86. Applied to three full episodes of Supernanny audio, the same model achieved an ROC-AUC of 0.95 and an average precision (positive predictive value) of 42%, even though screams made up only 1.3% (92 of 7166 seconds) of the total run time.
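To put the reported figures in context, the sketch below recomputes the scream prevalence from the counts in the abstract and compares the reported average precision with the chance level, which for a random ranker is roughly the positive-class prevalence. The two metric functions are plain-Python illustrations of the standard definitions, and the eight-window toy data are hypothetical, not taken from the study.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, precisions = 0, []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits


# Figures reported in the abstract.
scream_seconds, total_seconds = 92, 7166
prevalence = scream_seconds / total_seconds   # approximate chance-level average precision
reported_ap = 0.42
lift = reported_ap / prevalence               # how far above chance the model ranks screams

# Hypothetical toy data: 2 scream windows among 8 one-second windows.
toy_labels = [0, 0, 1, 0, 0, 0, 1, 0]
toy_scores = [0.1, 0.3, 0.9, 0.2, 0.8, 0.1, 0.7, 0.4]

print(f"prevalence: {prevalence:.4f}, lift over chance: {lift:.1f}x")
print(f"toy ROC-AUC: {roc_auc(toy_labels, toy_scores):.3f}")
print(f"toy average precision: {average_precision(toy_labels, toy_scores):.3f}")
```

With positives this rare, ROC-AUC alone can look strong while many flagged seconds are still false alarms, which is why the average precision is reported alongside it.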

CONCLUSIONS

These results suggest that a scream-detection model trained on publicly available data could be valuable for monitoring clinical recordings and identifying tantrums, without depending on the collection of costly, privacy-protected clinical data for model training.

Similar articles

1. Detecting Screams From Home Audio Recordings to Identify Tantrums: Exploratory Study Using Transfer Machine Learning.
   JMIR Form Res. 2020 Jun 16;4(6):e18279. doi: 10.2196/18279.
2. Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study.
   JMIR Pediatr Parent. 2022 Apr 14;5(2):e35406. doi: 10.2196/35406.
3. Figure Correction: Detecting Screams From Home Audio Recordings to Identify Tantrums: Exploratory Study Using Transfer Machine Learning.
   JMIR Form Res. 2020 Jul 8;4(7):e21591. doi: 10.2196/21591.
4. Segment-Based Spotting of Bowel Sounds Using Pretrained Models in Continuous Data Streams.
   IEEE J Biomed Health Inform. 2023 Jul;27(7):3164-3174. doi: 10.1109/JBHI.2023.3269910. Epub 2023 Jun 30.
5. Detection of Diplophonation in Audio Recordings of German Standard Text Readings.
   J Voice. 2019 Nov;33(6):949.e1-949.e10. doi: 10.1016/j.jvoice.2018.06.009. Epub 2018 Aug 5.
6. A data-driven approach to predicting diabetes and cardiovascular disease with machine learning.
   BMC Med Inform Decis Mak. 2019 Nov 6;19(1):211. doi: 10.1186/s12911-019-0918-5.
7. An Incremental Class-Learning Approach with Acoustic Novelty Detection for Acoustic Event Recognition.
   Sensors (Basel). 2021 Oct 5;21(19):6622. doi: 10.3390/s21196622.
8. Mass detection in digital breast tomosynthesis: Deep convolutional neural network with transfer learning from mammography.
   Med Phys. 2016 Dec;43(12):6654. doi: 10.1118/1.4967345.
9. Does synthetic data augmentation improve the performances of machine learning classifiers for identifying health problems in patient-nurse verbal communications in home healthcare settings?
   J Nurs Scholarsh. 2025 Jan;57(1):47-58. doi: 10.1111/jnu.13004. Epub 2024 Jul 3.
10. Assessment of Automated Identification of Phases in Videos of Cataract Surgery Using Machine Learning and Deep Learning Techniques.
    JAMA Netw Open. 2019 Apr 5;2(4):e191860. doi: 10.1001/jamanetworkopen.2019.1860.
