Machine Listening for OSA Diagnosis: A Bayesian Meta-Analysis.

Author information

Tan Benjamin Kye Jyn, Gao Esther Yanxin, Tan Nicole Kye Wen, Yeo Brian Sheng Yep, Tan Claire Jing-Wen, Ng Adele Chin Wei, Leong Zhou Hao, Phua Chu Qin, Uataya Maythad, Goh Liang Chye, Ong Thun How, Leow Leong Chai, Huang Guang-Bin, Toh Song Tar

Affiliations

Department of Otorhinolaryngology-Head & Neck Surgery, Singapore General Hospital, Singapore; School of Computing and Information, University of Pittsburgh, Pittsburgh, PA; Yong Loo Lin School of Medicine, National University of Singapore, Singapore; SingHealth Duke-NUS Sleep Centre, SingHealth, Singapore; Surgery Academic Clinical Program, SingHealth, Singapore.

Department of Otorhinolaryngology-Head & Neck Surgery, Singapore General Hospital, Singapore; Yong Loo Lin School of Medicine, National University of Singapore, Singapore; SingHealth Duke-NUS Sleep Centre, SingHealth, Singapore.

Publication information

Chest. 2025 Aug;168(2):520-530. doi: 10.1016/j.chest.2025.04.006. Epub 2025 Apr 11.

Abstract

BACKGROUND

Among 1 billion patients worldwide with OSA, 90% remain undiagnosed. The main barrier to diagnosis is the overnight polysomnogram, which requires specialized equipment, skilled technicians, and inpatient beds available only in tertiary sleep centers. Recent advances in artificial intelligence (AI) have enabled OSA detection using breathing sound recordings.

RESEARCH QUESTION

What is the diagnostic accuracy of machine listening for OSA, and how can it be optimized?

STUDY DESIGN AND METHODS

PubMed, Embase, Scopus, Web of Science, and IEEE Xplore databases were systematically searched. Two masked reviewers selected studies comparing the patient-level diagnostic performance of AI approaches using overnight audio recordings vs conventional diagnosis (apnea-hypopnea index) using a train-test split or k-fold cross-validation. Bayesian bivariate meta-analysis and meta-regression were performed. Publication bias was assessed by using a selection model. Risk of bias and evidence quality were assessed by using the Quality Assessment of Diagnostic Accuracy Studies-2 and the Grading of Recommendations, Assessment, Development, and Evaluation tools.

RESULTS

From 6,254 records, 16 studies (41 models) trained on 4,864 participants and tested on 2,370 participants were included. No study had a high risk of bias. Machine listening achieved a pooled sensitivity (95% credible interval) of 90.3% (86.9%-93.1%), a specificity of 86.7% (83.1%-89.7%), a diagnostic OR of 60.8 (39.4-99.9), and positive and negative likelihood ratios of 6.78 (5.34-8.85) and 0.113 (0.079-0.152), respectively. At apnea-hypopnea index cutoffs of ≥ 5, ≥ 15, and ≥ 30 events per hour, sensitivities were 94.3% (90.3%-96.8%), 86.3% (80.1%-90.9%), and 86.3% (79.2%-91.1%); and specificities were 78.5% (68.0%-86.9%), 87.3% (81.8%-91.3%), and 89.5% (84.8%-93.3%). Meta-regression identified increased sensitivity for the following: higher audio sampling frequencies, non-contact microphones, higher OSA prevalence, and train-test split model evaluation. Accuracy was equal regardless of home smartphone vs in-laboratory professional microphone recordings, deep learning vs traditional machine learning, and variations in age and sex. Publication bias was not evident, and the evidence was of high quality.
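As a rough consistency check on the pooled figures above, the likelihood ratios and diagnostic OR can be approximated from the point estimates of sensitivity and specificity using the standard definitions. The sketch below is illustrative only: the function names are hypothetical, it plugs in point estimates rather than reproducing the Bayesian bivariate model, and the 50% pre-test probability is an assumed value, not a figure from the study.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float, float]:
    """Return (LR+, LR-, diagnostic odds ratio) from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)   # LR+ = sens / (1 - spec)
    lr_neg = (1.0 - sensitivity) / specificity   # LR- = (1 - sens) / spec
    dor = lr_pos / lr_neg                        # DOR = LR+ / LR-
    return lr_pos, lr_neg, dor


def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled point estimates reported in the abstract.
POOLED_SENSITIVITY = 0.903
POOLED_SPECIFICITY = 0.867

lr_pos, lr_neg, dor = likelihood_ratios(POOLED_SENSITIVITY, POOLED_SPECIFICITY)

# Assumed (hypothetical) pre-test probability of OSA, for illustration only.
ASSUMED_PRE_TEST = 0.50
post_positive = post_test_probability(ASSUMED_PRE_TEST, lr_pos)
post_negative = post_test_probability(ASSUMED_PRE_TEST, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}, DOR = {dor:.1f}")
print(f"Post-test probability after a positive result: {post_positive:.3f}")
print(f"Post-test probability after a negative result: {post_negative:.3f}")
```

The plug-in values come out at roughly 6.79, 0.112, and 60.7, close to the reported 6.78, 0.113, and 60.8. The small differences are presumably because the reported figures are posterior summaries from the joint model rather than ratios of the pooled point estimates.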

INTERPRETATION

In this study, machine listening achieved excellent diagnostic accuracy, superior to the STOP-Bang (snoring, tiredness, observed apnea, BP, BMI, age, neck size, gender) questionnaire and comparable to common home sleep tests. Digital medicine should be further explored and externally validated for accessible and equitable OSA diagnosis.

CLINICAL TRIAL REGISTRATION

PROSPERO database; No.: CRD42024534235; URL: https://www.crd.york.ac.uk/PROSPERO/.

