Convolutional Neural Network-Based Embarrassing Situation Detection under Camera for Social Robot in Smart Homes.

Affiliations

Key Laboratory of Advanced Manufacturing Technology of Ministry of Education, Guizhou University, Guiyang 550025, China.

School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74074, USA.

Publication information

Sensors (Basel). 2018 May 12;18(5):1530. doi: 10.3390/s18051530.

DOI:10.3390/s18051530

PMID:29757211

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5982546/

Abstract

Recent research has shown that the ubiquitous use of cameras and voice monitoring equipment in a home environment can raise privacy concerns and affect human mental health. This can be a major obstacle to the deployment of smart home systems for elderly or disabled care. This study uses a social robot to detect embarrassing situations. Firstly, we designed an improved neural network structure based on the You Only Look Once (YOLO) model to obtain feature information. By focusing on reducing area redundancy and computation time, we proposed a bounding-box merging algorithm based on region proposal networks (B-RPN), to merge the areas that have similar features and determine the borders of the bounding box. Thereafter, we designed a feature extraction algorithm based on our improved YOLO and B-RPN, called F-YOLO, for our training datasets, and then proposed a real-time object detection algorithm based on F-YOLO (RODA-FY). We implemented RODA-FY and compared models on our MAT social robot. Secondly, we considered six types of situations in smart homes, and developed training and validation datasets, containing 2580 and 360 images, respectively. Meanwhile, we designed three types of experiments with four types of test datasets composed of 960 sample images. Thirdly, we analyzed how a different number of training iterations affects our prediction estimation, and then we explored the relationship between recognition accuracy and learning rates. Our results show that our proposed privacy detection system can recognize designed situations in the smart home with an acceptable recognition accuracy of 94.48%. Finally, we compared the results among RODA-FY, Inception V3, and YOLO, which indicate that our proposed RODA-FY outperforms the other comparison models in recognition accuracy.
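The abstract says that B-RPN merges areas with similar features to cut region redundancy, but it does not say how similarity is measured. The sketch below only illustrates the general idea of box merging and is not the authors' algorithm: it assumes intersection over union (IoU) as a stand-in similarity measure and fuses any two boxes whose IoU reaches a threshold into their enclosing box. The names `iou` and `merge_boxes`, the box layout, and the 0.5 threshold are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_boxes(boxes: List[Box], threshold: float = 0.5) -> List[Box]:
    """Fuse overlapping boxes until no pair has IoU >= threshold.

    IoU is an assumed proxy for the "similar features" criterion in the
    abstract; the paper's actual B-RPN criterion may differ.
    """
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if iou(merged[i], merged[j]) >= threshold:
                    a, b = merged[i], merged[j]
                    # Replace the pair with the smallest box enclosing both.
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the box list changed
    return merged


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    print(merge_boxes(candidates))
```

In this toy input the first two boxes overlap heavily and collapse into one enclosing box, while the distant third box is left alone. The pairwise rescan is quadratic in the number of boxes, which is acceptable for the handful of candidates a single frame typically yields but would need a smarter grouping step at scale.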
