Pushing the limits of remote RF sensing by reading lips under the face mask.

Affiliations

University of Glasgow, James Watt School of Engineering, Glasgow, G12 8QQ, UK.

School of Computing, Engineering and Built Environment, Glasgow Caledonian University, Glasgow, G4 0BA, UK.

Publication information

Nat Commun. 2022 Sep 7;13(1):5168. doi: 10.1038/s41467-022-32231-1.

DOI: 10.1038/s41467-022-32231-1
PMID: 36071056
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9452506/

Abstract

Lip-reading has become an important research challenge in recent years. The goal is to recognise speech from lip movements. Most lip-reading technologies developed so far are camera-based and require video recording of the target. However, these technologies have well-known limitations related to occlusion and ambient lighting, and they raise serious privacy concerns. Furthermore, vision-based technologies are not useful for multi-modal hearing aids in the coronavirus (COVID-19) environment, where face masks have become the norm. This paper aims to overcome the fundamental limitations of camera-based systems by proposing a radio frequency (RF) based lip-reading framework that can read lips under face masks. The framework employs Wi-Fi and radar technologies as enablers of RF-sensing-based lip-reading. A dataset comprising the vowels A, E, I, O, U and an empty class (static/closed lips) is collected using both technologies, with a face mask worn. The collected data is used to train machine learning (ML) and deep learning (DL) models. A high classification accuracy of 95% is achieved on the Wi-Fi data using neural network (NN) models. Moreover, similar accuracy is achieved by the VGG16 deep learning model on the collected radar-based dataset.

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3abe/9452506/65f75f62ac63/41467_2022_32231_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3abe/9452506/9f3b3b16c6b7/41467_2022_32231_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3abe/9452506/63fd977d8db1/41467_2022_32231_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3abe/9452506/d8670bd42d80/41467_2022_32231_Fig4_HTML.jpg

Similar articles

1
Pushing the limits of remote RF sensing by reading lips under the face mask.
Nat Commun. 2022 Sep 7;13(1):5168. doi: 10.1038/s41467-022-32231-1.
2
[Development and evaluation of a deep learning algorithm for German word recognition from lip movements].
HNO. 2022 Jun;70(6):456-465. doi: 10.1007/s00106-021-01143-9. Epub 2022 Jan 13.
3
Toward Realigning Automatic Speaker Verification in the Era of COVID-19.
Sensors (Basel). 2022 Mar 30;22(7):2638. doi: 10.3390/s22072638.
4
Improving Speech Recognition Performance in Noisy Environments by Enhancing Lip Reading Accuracy.
Sensors (Basel). 2023 Feb 11;23(4):2053. doi: 10.3390/s23042053.
5
Speaking with mask in the COVID-19 era: Multiclass machine learning classification of acoustic and perceptual parameters.
J Acoust Soc Am. 2023 Feb;153(2):1204. doi: 10.1121/10.0017244.
6
Influence of surgical and N95 face masks on speech perception and listening effort in noise.
PLoS One. 2021 Jul 1;16(7):e0253874. doi: 10.1371/journal.pone.0253874. eCollection 2021.
7
Face Masks During the COVID-19 Pandemic: A Simple Protection Tool With Many Meanings.
Front Public Health. 2021 Jan 13;8:606635. doi: 10.3389/fpubh.2020.606635. eCollection 2020.
8
AI-based face mask detection system: a straightforward proposition to fight with Covid-19 situation.
Multimed Tools Appl. 2023;82(9):13241-13273. doi: 10.1007/s11042-022-13697-z. Epub 2022 Sep 8.
9
Acoustic characteristics of fricatives, amplitude of formants and clarity of speech produced without and with a medical mask.
Int J Lang Commun Disord. 2022 Mar;57(2):366-380. doi: 10.1111/1460-6984.12705. Epub 2022 Feb 15.
10
GANMasker: A Two-Stage Generative Adversarial Network for High-Quality Face Mask Removal.
Sensors (Basel). 2023 Aug 10;23(16):7094. doi: 10.3390/s23167094.

Cited by

1
Unsupervised Clustering and Ensemble Learning for Classifying Lip Articulation in Fingerspelling.
Sensors (Basel). 2025 Jun 13;25(12):3703. doi: 10.3390/s25123703.
2
A guidance to intelligent metamaterials and metamaterials intelligence.
Nat Commun. 2025 Jan 29;16(1):1154. doi: 10.1038/s41467-025-56122-3.
3
Artificial intelligence enabled smart mask for speech recognition for future hearing devices.
