Chip Gallery

Chip prototyping provides several important benefits for our research. Silicon implementation lets us study power and variability issues through real measurements, which simulations alone cannot capture, and our chip prototypes allow us to demonstrate the benefits of our proposed approaches more convincingly. In addition, the design process instills an appreciation of the complexity, testing, and validation issues encountered when creating real hardware. Our group has designed prototype chips for several projects.

We thank IBM, TSMC, SRC, and UMC for fabrication support for these projects.

25mm² SoC in 16nm FinFET

This SoC targets flexible acceleration of compute-intensive kernels in DNN, DSP, and security algorithms. It includes an always-on subsystem, a dual-core Arm Cortex-A53 CPU cluster, an embedded FPGA (eFPGA) array, and a quad-core cache-coherent accelerator (CCA) cluster. Measurement results show the following:

  • The accelerator flexibility-efficiency range (in GOPS/W) spans 3.1x (A53+SIMD), 16.5x (eFPGA), and 54.5x (CCA) relative to the dual-core CPU baseline on comparable tasks.
  • Energy per inference on the MobileNet-128 CNN improves by up to 47.6x.
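The efficiency figures above are all quoted against the same dual-core CPU baseline, so they can also be compared with one another. The following is a minimal sketch of that arithmetic, assuming the three gains were measured on comparable workloads as stated. The names `GAIN_VS_CPU`, `relative_gain`, and `energy_per_op_fraction` are illustrative and do not come from the project.

```python
# Efficiency gains (GOPS/W) relative to the dual-core CPU baseline,
# copied from the measurement summary above.
GAIN_VS_CPU = {
    "A53+SIMD": 3.1,
    "eFPGA": 16.5,
    "CCA": 54.5,
}


def relative_gain(a, b):
    """Efficiency of option `a` relative to option `b` (hypothetical helper)."""
    return GAIN_VS_CPU[a] / GAIN_VS_CPU[b]


def energy_per_op_fraction(option):
    """Energy per operation as a fraction of the CPU baseline.

    GOPS/W is operations per joule (scaled by 1e9), so energy per
    operation is the inverse of the efficiency gain.
    """
    return 1.0 / GAIN_VS_CPU[option]


if __name__ == "__main__":
    for name in GAIN_VS_CPU:
        frac = energy_per_op_fraction(name)
        print(f"{name}: {frac:.3f}x the baseline energy per operation")
    print(f"CCA vs eFPGA: {relative_gain('CCA', 'eFPGA'):.2f}x")
    print(f"eFPGA vs A53+SIMD: {relative_gain('eFPGA', 'A53+SIMD'):.2f}x")
```

Under these assumptions, each step from CPU with SIMD to eFPGA to CCA trades programmability for a further multiple in efficiency, which is the range the SoC is designed to cover.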

A 16nm 25mm² SoC with a 54.5x Flexibility-Efficiency Range from Dual-Core Arm Cortex-A53 to eFPGA and Cache-Coherent Accelerators

SMIV: A 16nm SoC with Efficient and Flexible DNN Acceleration for Intelligent IoT Devices


16nm programmable accelerator (PGMA) for unsupervised probabilistic machine perception tasks

The accelerator performs Bayesian inference on probabilistic models mapped onto a 2D Markov Random Field, using Markov chain Monte Carlo (MCMC).

Exploiting two degrees of parallelism, it performs Gibbs sampling inference up to 1380x faster with 1965x less energy than an Arm Cortex-A53 on the same SoC, and 1.5x faster with 6.3x less energy than an embedded FPGA in the same technology.
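For readers unfamiliar with Gibbs sampling on a 2D Markov Random Field, the following is a minimal software sketch using an Ising-style model with labels in {-1, +1}. The model, the `coupling` parameter, and the checkerboard update order are assumptions chosen for illustration. The text above does not specify which models the chip runs or what its two degrees of parallelism are, so this should not be read as the accelerator's actual algorithm.

```python
import math
import random


def make_grid(rows, cols, rng):
    """Random initial labelling in {-1, +1} (hypothetical helper)."""
    return [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]


def gibbs_sweep(grid, coupling, rng):
    """One checkerboard Gibbs sweep over a 4-connected Ising-style MRF.

    Sites of one colour have no neighbours of the same colour, so they
    are conditionally independent given the other colour and could be
    updated in parallel. Here they are updated sequentially.
    """
    rows, cols = len(grid), len(grid[0])
    for colour in (0, 1):
        for r in range(rows):
            for c in range(cols):
                if (r + c) % 2 != colour:
                    continue
                # Sum of the 4-connected neighbours (free boundary).
                s = 0
                if r > 0:
                    s += grid[r - 1][c]
                if r < rows - 1:
                    s += grid[r + 1][c]
                if c > 0:
                    s += grid[r][c - 1]
                if c < cols - 1:
                    s += grid[r][c + 1]
                # Conditional P(x = +1 | neighbours) for the Ising model.
                p_plus = 1.0 / (1.0 + math.exp(-2.0 * coupling * s))
                grid[r][c] = 1 if rng.random() < p_plus else -1
    return grid


if __name__ == "__main__":
    rng = random.Random(0)
    grid = make_grid(8, 8, rng)
    for _ in range(50):
        gibbs_sweep(grid, 0.6, rng)
    for row in grid:
        print("".join("#" if v == 1 else "." for v in row))
```

The checkerboard schedule is a common way to expose parallelism in grid-structured models, since every site of one colour can be sampled at the same time without changing the distribution being sampled.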

A 3mm² Programmable Bayesian Inference Accelerator for Unsupervised Machine Perception using Parallel Gibbs Sampling in 16nm

A Scalable Bayesian Inference Accelerator for Unsupervised Learning
