Human voices communicating trustworthy intent: A demographically diverse speech audio dataset.

Author information

Maltezou-Papastylianou Constantina, Scherer Reinhold, Paulmann Silke

Affiliations

Department of Psychology and Centre for Brain Science, University of Essex, Colchester, CO4 3SQ, UK.

Brain-Computer Interfaces and Neural Engineering Laboratory, School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, UK.

Publication information

Sci Data. 2025 May 31;12(1):921. doi: 10.1038/s41597-025-05267-3.

DOI:10.1038/s41597-025-05267-3

PMID:40450046

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12126476/

Abstract

The multidisciplinary field of voice perception and trustworthiness lacks accessible speech audio datasets that represent diverse speaker demographics, including age, ethnicity, and sex. Existing datasets primarily feature white, younger adult speakers, which limits generalisability. This paper introduces a novel open-access speech audio dataset with 1,152 utterances from 96 untrained speakers across white, black and south Asian backgrounds, divided into younger (N = 60, ages 18-45) and older (N = 36, ages 60+) adults. Each speaker recorded both their natural speech patterns (i.e. "neutral", or no intent) and their attempt to convey trustworthy intent, as they perceive it, during speech production. The dataset is described and evaluated through classification of neutral versus trustworthy speech. Specifically, extracted acoustic and voice quality features were analysed using linear and non-linear classification models, achieving accuracies of around 70%. The dataset aims to close a crucial gap in the existing literature and to provide further research opportunities that can improve the generalisability and applicability of future results in this field.
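
The evaluation described in the abstract (classifying neutral versus trustworthy utterances from extracted features) can be illustrated with a minimal sketch. Only the counts of 96 speakers and 1,152 utterances come from the abstract, which implies 12 utterances per speaker on average. Everything else is an assumption made for illustration: the even split between the two speaking conditions, the single synthetic feature, the class-mean threshold used as a stand-in linear baseline, and the speaker-grouped cross-validation, which the abstract does not say the authors used. The accuracies it prints describe the synthetic data only, not the dataset.

```python
import random
from collections import defaultdict

N_SPEAKERS = 96      # from the abstract
N_UTTERANCES = 1152  # from the abstract
PER_SPEAKER = N_UTTERANCES // N_SPEAKERS  # 12 utterances per speaker on average


def make_index(seed=0):
    """Build a synthetic index of (speaker_id, label, feature) rows.

    Assumption: half of each speaker's utterances are neutral (0) and half
    trustworthy (1). The feature is random noise with a small class shift;
    it is NOT a real acoustic measurement.
    """
    rng = random.Random(seed)
    rows = []
    for spk in range(N_SPEAKERS):
        for i in range(PER_SPEAKER):
            label = i % 2
            rows.append((spk, label, rng.gauss(0.5 * label, 1.0)))
    return rows


def speaker_folds(rows, k=4, seed=0):
    """Assign whole speakers to folds so no speaker is in both train and test."""
    speakers = sorted({spk for spk, _, _ in rows})
    random.Random(seed).shuffle(speakers)
    fold_of = {spk: i % k for i, spk in enumerate(speakers)}
    folds = defaultdict(list)
    for row in rows:
        folds[fold_of[row[0]]].append(row)
    return [folds[i] for i in range(k)]


def threshold_classifier(train):
    """Stand-in linear baseline: cut midway between the two class means."""
    means = {}
    for lab in (0, 1):
        vals = [f for _, l, f in train if l == lab]
        means[lab] = sum(vals) / len(vals)
    cut = (means[0] + means[1]) / 2
    sign = 1 if means[1] >= means[0] else -1
    return lambda f: int(sign * (f - cut) > 0)


def cross_validate(rows, k=4):
    """Return one held-out accuracy per speaker-grouped fold."""
    folds = speaker_folds(rows, k)
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = [r for j in range(k) if j != i for r in folds[j]]
        predict = threshold_classifier(train)
        correct = sum(predict(f) == l for _, l, f in test)
        accuracies.append(correct / len(test))
    return accuracies


rows = make_index()
accuracies = cross_validate(rows)
print(len(rows), [round(a, 3) for a in accuracies])
```

Grouping folds by speaker is a design choice worth considering with data of this kind: each speaker contributes several utterances, so a random utterance-level split could let a model recognise the speaker rather than the speaking condition.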

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b959/12126476/302d780c0eaf/41597_2025_5267_Fig1_HTML.jpg
