Automatic Speaker Positioning in Meetings Based on YOLO and TDOA.

Affiliations

Department of Computer Science and Engineering, Tatung University, Taipei City 104, Taiwan.

Department of Information Management, National Central University, Taoyuan City 320, Taiwan.

Publication information

Sensors (Basel). 2023 Jul 8;23(14):6250. doi: 10.3390/s23146250.

DOI:10.3390/s23146250
PMID:37514545
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10384276/
Abstract

In recent years, many activities have moved to video conferencing because of the worldwide COVID-19 epidemic. A typical setup pairs a webcam with a computer and an Internet connection. However, an ordinary webcam cannot turn automatically and cannot keep the frame locked on the speaker. This study therefore uses the object detector YOLO to capture the upper body of every person in the frame and to judge whether each person's mouth is open or closed. At the same time, the Time Difference of Arrival (TDOA) is used to detect the angle of the sound source. Each person's position obtained by YOLO is then converted back into a position in spatial coordinates using the distance between the person and the camera, and the angle between the person and the camera is calculated from those spatial coordinates with inverse trigonometric functions. Finally, the angle obtained from the camera and the sound-source angle obtained from the microphone array are matched for positioning. The experimental results show that the recall of positioning with YOLOX-Tiny alone reached 85.2%, and the recall of TDOA alone reached 88%. Integrating YOLOX-Tiny and TDOA for positioning, the recall reached 86.7%, the precision reached 100%, and the accuracy reached 94.5%. The authors conclude that the proposed method can locate the speaker and that it performs better than using only one source.
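The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the far-field two-microphone model, the pinhole camera model, the shared sign convention (angles measured from the forward axis, positive to the right, with camera and microphone array co-located), the tolerance value, and every function and field name below are assumptions made for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def tdoa_angle_deg(delay_s, mic_spacing_m):
    """Sound-source angle from the arrival-time difference of one mic pair.

    Assumes a far-field source, so sin(angle) = c * delay / spacing.
    The angle is measured from the array's broadside (forward) direction.
    """
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # noise can push the ratio past +/-1
    return math.degrees(math.asin(ratio))


def camera_angle_deg(x_pixel, distance_m, image_width_px, horizontal_fov_deg):
    """Horizontal angle of a detected person relative to the optical axis.

    Assumes a pinhole camera: the pixel column is converted to a lateral
    offset in metres at the given distance, then an inverse trigonometric
    function gives the angle.
    """
    half_width = image_width_px / 2
    focal_px = half_width / math.tan(math.radians(horizontal_fov_deg) / 2)
    lateral_m = (x_pixel - half_width) * distance_m / focal_px
    return math.degrees(math.atan2(lateral_m, distance_m))


def match_speaker(people, sound_angle_deg, tolerance_deg=10.0):
    """Pick the open-mouthed person whose angle is closest to the sound angle.

    `people` is a list of dicts with hypothetical keys "name", "angle_deg"
    and "mouth_open". Returns the matched name, or None when no open-mouthed
    person lies within the (assumed) tolerance.
    """
    candidates = [
        p for p in people
        if p["mouth_open"]
        and abs(p["angle_deg"] - sound_angle_deg) <= tolerance_deg
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda p: abs(p["angle_deg"] - sound_angle_deg))
    return best["name"]


if __name__ == "__main__":
    # Hypothetical detections: pixel column, distance, and mouth state.
    detections = [("A", 400, 2.0, True), ("B", 1500, 2.5, True), ("C", 1600, 2.5, False)]
    people = [
        {"name": n, "angle_deg": camera_angle_deg(x, d, 1920, 90.0), "mouth_open": m}
        for n, x, d, m in detections
    ]
    sound_angle = tdoa_angle_deg(delay_s=0.00015, mic_spacing_m=0.1)
    print(match_speaker(people, sound_angle))
```

In this simplified model the distance cancels out of the horizontal angle; it is kept as a parameter only to mirror the two-step description in the abstract (pixel position to spatial coordinates, then coordinates to angle). Requiring an open mouth before matching is one plausible way to combine the two cues, and a real system would need calibrated camera intrinsics and a known offset between camera and microphone array.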
