A binary-activation, multi-level weight RNN and training algorithm for processing-in-memory inference with eNVM

Abstract:

We propose a new algorithm for training neural networks with binary activations and multi-level weights, which enables efficient processing-in-memory circuits with embedded nonvolatile memories (eNVM). Binary activations obviate costly DACs and ADCs. Multi-level weights leverage multi-level eNVM cells. Compared to existing algorithms, our method not only works for feed-forward networks (e.g., fully-connected and convolutional), but also achieves higher accuracy and noise resilience for recurrent networks. In particular, we present an RNN-based trigger-word detection PIM accelerator, with detailed hardware noise models and circuit co-design techniques, and validate our algorithm's high inference accuracy and robustness against a variety of real hardware non-idealities.
Last updated on 04/21/2022