Data strategies in forensic automatic speaker comparison.

Affiliation

Netherlands Forensic Institute, Laan van Ypenburg 6, 2497 GB The Hague, the Netherlands.

Publication information

Forensic Sci Int. 2023 Sep;350:111790. doi: 10.1016/j.forsciint.2023.111790. Epub 2023 Jul 20.

DOI: 10.1016/j.forsciint.2023.111790
PMID: 37567041

Abstract

Automatic speaker recognition (ASR) is a method used in forensic speaker comparison (FSC) casework. It needs collections of audio data that are representative of the case audio in order to perform reference normalization and to train a score-to-LR function. Audio from a certain minimum number of speakers is needed for each of those purposes to obtain relatively stable performance of ASR. Although it is not possible to set a hard cut-off, for the purpose of this work this number was chosen to be 30 for each, and 60 for both. Lack of representative data from that many speakers and uncertainty about what exactly constitutes representative data are major reasons for not employing ASR in FSC. An experiment was carried out in which a situation was simulated where a practitioner has only 30 speakers available. Several data strategies are tried out to handle the lack of data: leaving out reference normalization, splitting the 30 speakers into two groups of 15 (ignoring the minimum of 30) and a leave 1 or 2 out strategy where all 30 speakers are used for both reference normalization and calibration. They are compared to the baseline situation where the practitioner does have the required 60 speakers. The leave 1 or 2 out strategy with 30 speakers performs on par with baseline, and extension of that strategy to the full 60 speakers even outperforms baseline. This shows that a strategy that halves the data need is viable, lessening the data requirements for ASR in FSC and making the use of ASR possible in more cases.
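
The leave 1 or 2 out strategy can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so the following is one plausible reading and not the authors' implementation: for each comparison, the speakers involved in that comparison (one for a same-speaker trial, two for a different-speaker trial) are held out, and all remaining speakers form the pool used for both reference normalization and calibration. The function name `leave_out_pools` and the speaker labels are hypothetical.

```python
from itertools import combinations


def leave_out_pools(speakers):
    """Yield (trial, pool) pairs under an assumed leave-1-or-2-out scheme.

    trial: a pair of speaker labels being compared.
    pool:  the speakers left over once the trial's speakers are held out;
           in this sketch the same pool would serve both reference
           normalization and score-to-LR calibration.
    """
    speakers = list(speakers)
    # Same-speaker trials: hold out the single speaker involved (leave 1 out).
    for s in speakers:
        yield (s, s), [p for p in speakers if p != s]
    # Different-speaker trials: hold out both speakers involved (leave 2 out).
    for a, b in combinations(speakers, 2):
        yield (a, b), [p for p in speakers if p not in (a, b)]


# Hypothetical set of 30 speakers, matching the simulated scenario.
speakers = [f"spk{i:02d}" for i in range(30)]
pools = list(leave_out_pools(speakers))

same = [pool for (a, b), pool in pools if a == b]
diff = [pool for (a, b), pool in pools if a != b]
print(len(same), len(diff))            # 30 435
print(len(same[0]), len(diff[0]))      # 29 28
```

Under this reading, every trial still has 28 or 29 speakers available for both purposes, close to the working minimum of 30, whereas splitting the 30 speakers into two fixed groups leaves only 15 for each purpose.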
