The Ohio State University Wexner Medical Center, Columbus, OH, United States.
Nationwide Children's Hospital, Columbus, OH, United States.
JMIR Mhealth Uhealth. 2021 Jan 11;9(1):e24045. doi: 10.2196/24045.
A voice assistant (VA) is inanimate audio-interfaced software augmented with artificial intelligence, capable of 2-way dialogue, and increasingly used to access health care advice. Postpartum depression (PPD) is a common perinatal mood disorder with an annual estimated cost of $14.2 billion. Only a small percentage of PPD patients seek care due to lack of screening and insufficient knowledge of the disease, which makes PPD a prime candidate for a VA-based digital health intervention.
In order to understand the capability of VAs, our aim was to assess VA responses to PPD questions in terms of accuracy, verbal response, and clinically appropriate advice given.
This cross-sectional study examined four VAs (Apple Siri, Amazon Alexa, Google Assistant, and Microsoft Cortana) installed on two mobile devices in early 2020. We posed 14 questions to each VA that were retrieved from the American College of Obstetricians and Gynecologists (ACOG) patient-focused Frequently Asked Questions (FAQ) on PPD. We scored the VA responses according to accuracy of speech recognition, presence of a verbal response, and clinically appropriate advice in accordance with ACOG FAQ, which were assessed by two board-certified physicians.
Accurate recognition of the query ranged from 79% to 100%. Verbal response ranged from 36% to 79%. If no verbal response was given, queries were treated like a web search between 33% and 89% of the time. Clinically appropriate advice given by VA ranged from 14% to 29%. We compared the category proportions using the Fisher exact test. No single VA statistically outperformed other VAs in the three performance categories. Additional observations showed that two VAs (Google Assistant and Microsoft Cortana) included advertisements in their responses.
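To illustrate the kind of comparison described above, the following is a minimal sketch of a two-sided Fisher exact test on a 2x2 table, comparing the best and worst rates of clinically appropriate advice. The counts are an assumption: with 14 questions per VA, 29% and 14% are taken to correspond to 4 of 14 and 2 of 14. The abstract does not say whether the authors tested VAs pairwise or all four at once, so the pairwise form and the function name `fisher_exact_2x2` are illustrative only.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Assumed counts inferred from the reported percentages (not stated in the source).
N_QUESTIONS = 14
best_appropriate = 4   # assumed: 4/14, reported as 29%
worst_appropriate = 2  # assumed: 2/14, reported as 14%

p_value = fisher_exact_2x2(
    best_appropriate, N_QUESTIONS - best_appropriate,
    worst_appropriate, N_QUESTIONS - worst_appropriate,
)
print(f"two-sided p = {p_value:.3f}")
```

Under these assumed counts the p-value is well above 0.05, which is consistent with the finding that no single VA statistically outperformed the others. With only 14 questions per VA, differences of a few questions are hard to distinguish from chance.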
While the best-performing VA gave clinically appropriate advice for 29% of the PPD questions, all four VAs taken together achieved 64% clinically appropriate advice. All four VAs performed well in accurately recognizing a PPD query, but no VA reached even a 30% threshold for providing clinically appropriate PPD information. Technology companies and clinical organizations should partner to improve guidance, screen patients for mental health disorders, and educate patients on potential treatment.