Publications by Year: 2013

2013
Xuan Zhang, David Brooks, and Gu Wei. 11/11/2013. “A 20uW 10MHz Relaxation Oscillator with Adaptive Bias and Fast Self-Calibration in 40nm CMOS for Micro-Aerial Robotics Application.” In IEEE Asian Solid-State Circuits Conference (ASSCC).
Efficient actuation control of flapping-wing microrobots requires a low-power frequency reference with good absolute accuracy. To meet this requirement, we designed a fully-integrated 10MHz relaxation oscillator in a 40nm CMOS process. By adaptively biasing the continuous-time comparator, we achieve a power consumption of 20μW, a 68% reduction compared with the conventional fixed-bias design. A built-in self-calibration controller enables fast post-fabrication calibration of the clock frequency. Measurements show a frequency drift of 1.2% as the battery voltage changes from 3V to 4.1V.
Tao Tong, Xuan Zhang, Wonyoung Kim, David Brooks, and Gu Wei. 9/22/2013. “A Fully Integrated Battery-Connected Switched-Capacitor 4:1 Voltage Regulator with 70% Peak Efficiency Using Bottom-Plate Charge Recycling.” In IEEE Custom Integrated Circuits Conference (CICC).
This work presents a switched-capacitor (SC) DC-DC voltage regulator that converts a 3.7V battery voltage down to ~0.8V in order to power the 'brain' SoC of a flapping-wing microrobotic bee. A cascade of two 2:1 SC converters offers high efficiency for a 4:1 conversion ratio. A charge recycling technique reduces the flying capacitor's bottom-plate parasitic loss by 50%, and overall conversion efficiency reaches 70%. The output droop is less than 10% of the nominal output voltage for a worst-case 47mA load step.
Xuan Zhang, Tao Tong, David Brooks, and Gu Wei. 9/22/2013. “Supply-Noise Resilient Adaptive Clocking for Battery-Powered Aerial Microrobotic System-on-Chip in 40nm CMOS.” In IEEE Custom Integrated Circuits Conference (CICC).
A battery-powered aerial microrobotic System-on-Chip (SoC) has stringent weight and power budgets, which require fully-integrated solutions for both clock generation and voltage regulation. Supply-noise resilience is important yet challenging for such SoC systems due to a non-constant battery discharge profile and load current variability. This paper proposes an adaptive-frequency clocking scheme that can tolerate supply noise and improve performance when implemented with an integrated voltage regulator (IVR). Measurements from a 'brain' SoC, implemented in 40nm CMOS, demonstrate 2× performance improvement with adaptive-frequency clocking over conventional fixed-frequency clocking. Combining adaptive-frequency clocking with open-loop IVR extends error-free operation to a wider battery voltage range (2.8 to 3.8V) with higher average performance.
Mario Lok, David Brooks, Robert Wood, and Gu Wei. 9/15/2013. “Design and analysis of an integrated driver for piezoelectric actuators.” In IEEE Energy Conversion Congress and Exposition.
Small-scale, highly maneuverable, flapping-wing robotic insects have a wide range of applications, including exploration, environmental monitoring, search and rescue, and surveillance. For these small-scale robots, a piezoelectric cantilever actuator driven by a high voltage drive signal is a preferred actuation mechanism. The generation of this drive signal via light and efficient power electronics is critical given the limited weight budget for the flapping-wing robot. Previous work demonstrated actuator drive circuitry using discrete power transistors and passive elements. This paper presents a new design that integrates all the power FETs into a single monolithic IC, reducing the weight of the power electronics to fit within the weight budget. This design adds the capability of driving multiple outputs to accommodate recent electromechanical design advances for flying robots.
Xuan Zhang, Tao Tong, Svilen Kanev, Sae Lee, Gu Wei, and David Brooks. 9/4/2013. “Characterizing and Evaluating Voltage Noise in Multi-Core Near-Threshold Processors.” In International Symposium on Low Power Electronics and Design (ISLPED).
Lowering the supply voltage to improve energy efficiency leads to higher load current and elevated supply sensitivity. In this paper, we provide the first quantitative analysis of voltage noise in multi-core near-threshold processors in a future 10nm technology across SPEC CPU2006 benchmarks. Our results reveal a larger guardband requirement and significant energy-efficiency loss due to power delivery nonidealities at near threshold, and highlight the importance of accurate voltage noise characterization for design exploration of energy-centric computing systems using near-threshold cores.
Yakun Shao and David Brooks. 9/4/2013. “Energy Characterization and Instruction-Level Energy Model of Intel's Xeon Phi Processor.” In International Symposium on Low Power Electronics and Design (ISLPED).
Intel’s Xeon Phi is the first commercial many-core/multi-thread x86-based processor. Xeon Phi belongs to a new breed of high performance computing processors that seek high compute density as well as energy efficiency. However, no high-level energy model is available for Xeon Phi software developers to quickly evaluate and optimize energy efficiency. This work demonstrates an instruction-level energy model for the Xeon Phi processor to facilitate the development of energy-efficient software. In order to construct this model, we first characterize the energy consumption of the processor, identifying how energy per instruction scales with the number of cores, the number of active threads per core, and instruction types. Based on the energy characterization, we construct an instruction-level energy model and validate the accuracy of the model between 1% and 5% for real world benchmarks. We show that the energy model can be used to identify software inefficiencies for these benchmarks and find that Linpack code can be optimized to increase energy efficiency by as much as 10%.
Brandon Reagen, Yakun Shao, Gu Wei, and David Brooks. 9/4/2013. “Quantifying Acceleration: Power/Performance Trade-Offs of Application Kernels in Hardware.” In International Symposium on Low Power Electronics and Design (ISLPED).
As the traditional performance gains of technology scaling diminish, one of the most promising directions is building special purpose fixed function hardware blocks, commonly referred to as accelerators. Accelerators have become prevalent in industrial SoC designs for their low power, high performance potential. In this work we explore thousands of implementations of classical software workloads in hardware. This thorough, detailed design space search of hardware accelerators gives architects a quantitative way to reason about the differences in implementations. The exploration presented in this work shows that the space is full of poor design choices. By thoroughly analyzing each benchmark, we show which provide the most performance when implemented in hardware given a fixed power budget and explain which design techniques work best for each workload.
Hayun Chung and Gu Wei. 8/16/2013. “ADC-based backplane receiver design-space exploration.” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 22, 7, Pp. 1539–1547.
Demand for higher throughput backplane communications, coupled with a desire for design portability and flexibility, has led to high-speed backplane receivers that use front-end analog-to-digital converters (ADCs) and digital equalization. Unfortunately, power and complexity of such receivers can be high and require careful design. This paper presents a parameterized ADC-based backplane receiver model that facilitates design-space exploration to optimize the tradeoffs between power and performance: an accurate behavioral model of front-end ADCs is presented for performance estimation, and detailed power models for the digital equalizer (EQ) blocks are developed for power estimation. Model-based simulations suggest that comparator offset correction resolution is the most critical ADC design parameter where overall receiver performance is concerned. Further receiver design-space exploration reveals that a Pareto optimal frontier exists, which can be used as a guideline to set the initial receiver configurations depending on the given power and performance constraints.
Yakun Shao and David Brooks. 4/21/2013. “ISA-Independent Workload Characterization and its Implications for Specialized Architectures.” In International Symposium on Performance Analysis of Systems and Software (ISPASS).
Specialized architectures will become increasingly important as the computing industry demands more energy-efficient designs. The application-centric design style for these architectures is heavily dependent on workload characterization of intrinsic program characteristics, but at the same time these architectures are likely to be decoupled from legacy ISAs. In this work, we perform ISA-independent workload characterization for a variety of important intrinsic program characteristics relating to computation, memory, and control flow. The analysis is performed using a JIT compiler that emits ISA-independent instructions. We compare this analysis with an x86 trace and find that several of the analyses are highly sensitive to the ISA. We conclude that designers of specialized architectures must adopt ISA-independent workload characterization approaches.
Robert Wood, Radhika Nagpal, and Gu Wei. 3/2013. “Flight of the robobees.” Scientific American, 308, 3, Pp. 60–65.
Not too long ago a mysterious affliction called colony collapse disorder (CCD) began to wipe out honeybee hives. These bees are responsible for most commercial pollination in the U.S., and their loss provoked fears that agriculture might begin to suffer as well. In 2009 the three of us, along with colleagues at Harvard University and Northeastern University, began to seriously consider what it would take to create a robotic bee colony. We wondered if mechanical bees could replicate not just an individual’s behavior but the unique behavior that emerges out of interactions among thousands of bees. We have now created the first RoboBees—flying bee-size robots—and are working on methods to make thousands of them cooperate like a real hive.
Svilen Kanev, Timothy Jones, Gu Wei, David Brooks, and Vijay Reddi. 2013. “Measuring code optimization impact on voltage noise.” Workshop in Silicon Errors – System Effects (SELSE).
In this paper, we characterize the impact of compiler optimizations on voltage noise. While intuition may suggest that the better processor utilization ensured by optimizing compilers results in a small amount of voltage variation, our measurements on an Intel Core2 Duo processor show the opposite – the majority of SPEC 2006 benchmarks exhibit more voltage droops when aggressively optimized. We show that this increase in noise could be sufficient for a net performance decrease in a typical-case, resilient design.