Learning metric volume estimation of fruits and vegetables from short monocular video sequences.

Authors

Steinbrener Jan, Dimitrievska Vesna, Pittino Federico, Starmans Frans, Waldner Roland, Holzbauer Jürgen, Arnold Thomas

Affiliations

Control of Networked Systems Group, University of Klagenfurt, Universitaetsstr. 65-67, Klagenfurt, 9020, Carinthia, Austria.

Silicon Austria Labs GmbH, Europastraße 12, Villach, 9524, Carinthia, Austria.

Publication information

Heliyon. 2023 Mar 21;9(4):e14722. doi: 10.1016/j.heliyon.2023.e14722. eCollection 2023 Apr.

DOI:10.1016/j.heliyon.2023.e14722

PMID:37035347

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10073754/

Abstract

We present a novel approach for extracting metric volume information of fruits and vegetables from short monocular video sequences and associated inertial data recorded with a hand-held smartphone. Estimated segmentation masks from a pre-trained object detector are fused with the predicted change in relative pose obtained from the inertial data to predict the class and volume of the objects of interest. Our approach works with simple RGB video frames and inertial data which are readily available from modern smartphones. It does not require reference objects of known size in the video frames. Using a balanced validation dataset, we achieve a classification accuracy of 95% and a mean absolute percentage error for the volume prediction of 16% on untrained objects, which is comparable to state-of-the-art results requiring more elaborated data recording setups. A very accurate estimation of the model uncertainty is achieved through ensembling and the use of Gaussian negative log-likelihood loss. The dataset used in our experiments including ground-truth volume information is available at https://sst.aau.at/cns/datasets.
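The abstract names three quantities without giving formulas: a Gaussian negative log-likelihood loss, an ensemble whose spread serves as the uncertainty estimate, and the mean absolute percentage error (MAPE). The sketch below illustrates one common way to compute each. It is not the authors' implementation: the function names are hypothetical, and merging ensemble members as a uniform Gaussian mixture is an assumption borrowed from the usual deep-ensemble recipe, not something the abstract states.

```python
import math


def gaussian_nll(y, mu, var, eps=1e-6):
    """Gaussian negative log-likelihood of target y under N(mu, var).

    The variance is clamped at eps for numerical stability (an assumption,
    not taken from the paper).
    """
    var = max(var, eps)
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)


def combine_ensemble(means, variances):
    """Merge per-member (mean, variance) predictions into one estimate.

    Treats the ensemble as a uniform mixture of Gaussians (assumed here):
    the combined variance adds the average predicted variance to the
    spread of the member means.
    """
    m = len(means)
    mu = sum(means) / m
    second_moment = sum(v + mu_i ** 2 for mu_i, v in zip(means, variances)) / m
    return mu, second_moment - mu ** 2


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    errors = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical volumes in millilitres, for illustration only.
    mu, var = combine_ensemble([240.0, 260.0], [100.0, 100.0])
    print("ensemble mean:", mu, "ensemble variance:", var)
    print("NLL at true volume 255:", gaussian_nll(255.0, mu, var))
    print("MAPE:", mape([100.0, 200.0], [110.0, 180.0]))
```

Under this loss a model is penalised both for missing the target and for reporting a variance that is too small or too large, which is why the predicted variance can serve as a per-sample uncertainty estimate.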
