IEEE Trans Biomed Circuits Syst. 2016 Aug;10(4):855-63. doi: 10.1109/TBCAS.2016.2545402. Epub 2016 Jun 10.
This paper presents an energy-efficient and high-throughput architecture for Sparse Distributed Memory (SDM), a computational model of the human brain [1]. The architecture builds on Compute Memory (CM) [2], [3], a recently proposed in-memory computing kernel for machine learning applications. CM achieves energy and throughput efficiency by deeply embedding computation into the memory array. SDM-specific techniques such as hierarchical binary decision (HBD) are employed to further reduce delay and energy. Because the CM-based SDM (CM-SDM) is a mixed-signal circuit, circuit-aware behavioral, energy, and delay models are developed in a 65 nm CMOS process to predict the system performance of SDM architectures in the auto- and hetero-associative modes. The delay and energy models indicate that CM-SDM can, in general, reduce delay by up to 25× and energy by up to 12× relative to conventional SDM. When classifying 16 × 16 binary images with high noise levels (input bad pixel ratios of 15%-25%) into nine classes, all SDM architectures achieve output bad pixel ratios (Bo) ≤ 2%. The CM-SDM exhibits negligible loss in accuracy: its Bo degradation is within 0.4% of that of the conventional SDM.
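For readers unfamiliar with SDM, the auto-associative recall described in the abstract can be illustrated with a minimal software sketch of a conventional Kanerva-style SDM. This is not the CM-SDM circuit and does not model HBD or any mixed-signal behavior. The class and helper names, the 4000 hard locations, the activation radius of 110, and the use of random bit patterns in place of real images are all assumptions chosen for illustration; only the 16 × 16 word size, the nine stored classes, and the 20% input noise level (inside the 15%-25% range) follow the abstract.

```python
import numpy as np


class SparseDistributedMemory:
    """Illustrative software SDM with binary addresses and data (hypothetical class)."""

    def __init__(self, num_locations, word_bits, radius, rng):
        # Fixed random "hard location" addresses, one row per location.
        self.addresses = rng.integers(
            0, 2, size=(num_locations, word_bits), dtype=np.uint8
        )
        # One signed counter per bit of every hard location.
        self.counters = np.zeros((num_locations, word_bits), dtype=np.int32)
        self.radius = radius

    def _active(self, address):
        # A location is active when its Hamming distance to the query
        # address is within the activation radius.
        distances = np.count_nonzero(self.addresses != address, axis=1)
        return distances <= self.radius

    def write(self, address, data):
        # At every active location, increment counters for 1-bits and
        # decrement them for 0-bits.
        self.counters[self._active(address)] += 2 * data.astype(np.int32) - 1

    def read(self, address):
        # Sum the counters of all active locations, then threshold at zero.
        sums = self.counters[self._active(address)].sum(axis=0)
        return (sums > 0).astype(np.uint8)


def flip_bits(pattern, ratio, rng):
    # Flip a fixed fraction of randomly chosen bits (the "bad pixels").
    noisy = pattern.copy()
    count = int(round(ratio * pattern.size))
    idx = rng.choice(pattern.size, size=count, replace=False)
    noisy[idx] ^= 1
    return noisy


def bad_pixel_ratio(a, b):
    # Fraction of positions where the two binary patterns disagree.
    return float(np.mean(a != b))


rng = np.random.default_rng(0)
WORD_BITS = 16 * 16  # one flattened 16 x 16 binary image

sdm = SparseDistributedMemory(
    num_locations=4000, word_bits=WORD_BITS, radius=110, rng=rng
)

# Nine random patterns stand in for the nine image classes (assumption).
patterns = rng.integers(0, 2, size=(9, WORD_BITS), dtype=np.uint8)
for p in patterns:
    sdm.write(p, p)  # auto-associative mode: each pattern is its own address

noisy = flip_bits(patterns[0], 0.20, rng)
recalled = sdm.read(noisy)

input_bpr = bad_pixel_ratio(noisy, patterns[0])
output_bpr = bad_pixel_ratio(recalled, patterns[0])
print(f"input bad pixel ratio:  {input_bpr:.3f}")
print(f"output bad pixel ratio: {output_bpr:.3f}")
```

In hetero-associative mode the only change in this sketch would be to call `sdm.write(address, data)` with a data word that differs from its address. The Hamming-distance comparison in `_active` and the counter summation in `read` are the operations an in-memory architecture would target, since both touch every stored row.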