
FinFET 6T-SRAM All-Digital Compute-in-Memory for Artificial Intelligence Applications: An Overview and Analysis.

Authors

Gul Waqas, Shams Maitham, Al-Khalili Dhamin

Affiliations

Department of Electronics, Carleton University, 1125 Colonel By Drive, Ottawa, ON K1S 5B6, Canada.

Publication

Micromachines (Basel). 2023 Jul 31;14(8):1535. doi: 10.3390/mi14081535.

Abstract

Artificial intelligence (AI) has revolutionized present-day life through automation and independent decision-making capabilities. For AI hardware implementations, the 6T-SRAM cell is a suitable candidate due to its performance edge over its counterparts. However, modern AI hardware such as neural networks (NNs) accesses off-chip data quite often, degrading overall system performance. Compute-in-memory (CIM) reduces off-chip data access transactions. One CIM approach is based on the mixed-signal domain, but it suffers from limited bit precision and signal-margin issues. An alternate emerging approach uses the all-digital signal domain, which provides better signal margins and bit precision, although at the expense of hardware overhead. We have analyzed silicon-verified all-digital 6T-SRAM CIM solutions, classifying them as SRAM-based accelerators, i.e., near-memory computing (NMC), and custom SRAM-based CIM, i.e., in-memory computing (IMC). We have focused on multiply and accumulate (MAC) as the most frequent operation in convolutional neural networks (CNNs) and compared state-of-the-art implementations. Neural networks with low weight precision, i.e., <12b, show lower accuracy but higher power efficiency. An input precision of 8b meets implementation requirements. The maximum reported performance is 7.49 TOPS at 330 MHz, while custom SRAM-based implementations reach a maximum of 5.6 GOPS at 100 MHz. The second part of this article analyzes the FinFET 6T-SRAM as one of the critical components in determining the overall performance of an AI computing system. We have investigated the FinFET 6T-SRAM cell's performance and limitations as dictated by FinFET technology-specific parameters, such as sizing, threshold voltage (V_TH), supply voltage (V_DD), and process and environmental variations. The HD FinFET 6T-SRAM cell shows 32% lower read access time and 1.09 times lower leakage power compared with the HC cell configuration. The minimum achievable supply voltage is 600 mV without any read- or write-assist scheme for all cell configurations, while temperature variations cause noise-margin deviations of up to 22% of the nominal values.
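
For context, the code below is a minimal software sketch of the MAC operation that the surveyed digital CIM macros accelerate, using the 8b input precision and sub-12b weight precision discussed above. The function names, the quantization helper, and the default bit widths are illustrative assumptions, not the hardware designs evaluated in the paper.

```python
# Minimal sketch of a multiply-and-accumulate (MAC) operation, the dominant
# kernel in CNN inference that digital 6T-SRAM CIM macros accelerate.
# The quantization helper and bit widths below are illustrative assumptions.

def quantize(values, bits):
    """Clamp real-valued activations/weights to signed fixed-point integers."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return [max(lo, min(hi, round(v))) for v in values]

def mac(activations, weights, act_bits=8, wgt_bits=8):
    """Dot product of quantized activations and weights (one CNN output term)."""
    a = quantize(activations, act_bits)  # 8b inputs, as in the surveyed designs
    w = quantize(weights, wgt_bits)      # lower weight precision saves power
    return sum(x * y for x, y in zip(a, w))

if __name__ == "__main__":
    print(mac([12.0, -3.4, 7.9], [1.0, -2.0, 0.6]))  # 12*1 + (-3)*(-2) + 8*1 = 26
```

When throughput figures such as TOPS or GOPS are quoted, each MAC is conventionally counted as two operations (one multiply and one accumulate).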


Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b37/10456776/eaff291c8a21/micromachines-14-01535-g001.jpg
