Fabre William, Haroun Karim, Lorrain Vincent, Lepecq Maria, Sicard Gilles
Université Paris-Saclay, CEA, List, F-91120 Palaiseau, France.
Sensors (Basel). 2024 Aug 22;24(16):5446. doi: 10.3390/s24165446.
In modern cyber-physical systems, the integration of AI into vision pipelines is now standard practice for applications ranging from autonomous vehicles to mobile devices. Traditional AI integration often relies on cloud-based processing, which faces challenges such as data access bottlenecks, increased latency, and high power consumption. This article reviews embedded AI vision systems, examining the diverse landscape of near-sensor and in-sensor processing architectures that incorporate convolutional neural networks (CNNs). We begin with a comprehensive analysis of the critical characteristics and metrics that define the performance of AI-integrated vision systems: sensor resolution, frame rate, data bandwidth, computational throughput, latency, power efficiency, and overall system scalability. Understanding these metrics provides a foundation for evaluating how different embedded processing architectures impact the entire vision pipeline, from image capture to AI inference. Our analysis delves into near-sensor systems that leverage dedicated hardware accelerators and commercially available components to process data efficiently close to the source, minimizing data transfer overhead and latency. These systems offer a balance between flexibility and performance, allowing for real-time processing in constrained environments. In addition, we explore in-sensor processing solutions that integrate computational capabilities directly into the sensor. This approach addresses the stringent constraints of embedded applications by significantly reducing data movement and power consumption while also enabling in-sensor feature extraction, pre-processing, and CNN inference. By comparing these approaches, we identify trade-offs among flexibility, power consumption, and computational performance. Ultimately, this article provides insights into the evolving landscape of embedded AI vision systems and suggests new research directions for the development of next-generation machine vision systems.
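
As a rough illustration of how the metrics above constrain one another, the short Python sketch below estimates the raw data rate that a near-sensor or in-sensor processor must absorb and the energy available per CNN inference. All values (resolution, bit depth, frame rate, power budget) are assumed for illustration and are not figures from the article.

# Minimal sketch with assumed values: relates sensor resolution, frame
# rate, and bit depth to raw data bandwidth, and a power budget to the
# energy available per inference.
WIDTH, HEIGHT = 1920, 1080   # sensor resolution in pixels (assumed)
BIT_DEPTH = 10               # bits per pixel (assumed)
FPS = 30                     # frame rate in frames/s (assumed)
POWER_W = 0.5                # processing power budget in watts (assumed)

raw_bandwidth_bps = WIDTH * HEIGHT * BIT_DEPTH * FPS  # bits/s off the sensor
energy_per_frame_j = POWER_W / FPS                    # joules per inference

print(f"Raw sensor bandwidth: {raw_bandwidth_bps / 1e6:.0f} Mbit/s")  # ~622 Mbit/s
print(f"Energy budget per frame: {energy_per_frame_j * 1e3:.1f} mJ")  # ~16.7 mJ

Under these assumed values, the sensor alone produces roughly 622 Mbit/s; moving that stream to a remote host is what motivates processing it near or inside the sensor.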