Publications by Year: 2010

Vijay Reddi, Svilen Kanev, Wonyoung Kim, Simone Campanoni, Michael Smith, Gu Wei, and David Brooks. 12/4/2010. “Voltage smoothing: Characterizing and mitigating voltage noise in production processors via software-guided thread scheduling.” In 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture. IEEE.
Parameter variations have become a dominant challenge in microprocessor design. Voltage variation is especially daunting because it happens so rapidly. We measure and characterize voltage variation in a running Intel Core2 Duo processor. By sensing on-die voltage as the processor runs single-threaded, multi-threaded, and multi-program workloads, we determine the average supply voltage swing of the processor to be only 4%, far from the processor’s 14% worst-case operating voltage margin. While such large margins guarantee correctness, they penalize performance and power efficiency. We investigate and quantify the benefits of designing a processor for typical-case (rather than worst-case) voltage swings, assuming that a fail-safe mechanism protects it from infrequently occurring large voltage fluctuations. With today’s processors, such resilient designs could yield 15% to 20% performance improvements. But we also show that in future systems, these gains could be lost as increasing voltage swings intensify the frequency of fail-safe recoveries. After characterizing microarchitectural activity that leads to voltage swings within multi-core systems, we show that a voltage-noise-aware thread scheduler in software can co-schedule phases of different programs to mitigate error recovery overheads in future resilient processor designs.
Karpelson Michael, Whitney P, Gu Wei, and Wood J. 10/18/2010. “Energetics of flapping-wing robotic insects: towards autonomous hovering flight.” In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, Pp. 1630–1637. IEEE.
Flapping-wing mechanisms inspired by biological insects have the potential to enable a new class of small, highly maneuverable aerial robots with hovering capabilities. In order for such devices to operate without an external power source, it is necessary to address a complex system design challenge: the integration of all of the required components on board the robot. This paper discusses the flight energetics of flapping-wing robotic insects with the goal of selecting design parameters that enable power autonomy and maximize flight time. The subsystems of the robot are analyzed both from a broad perspective and using a detailed set of models for a piezoelectrically driven two-wing design. The models are used to perform a system-level optimization for the maximum flight time permitted by current technology, compare the resulting robot configurations to biological insects across several key metrics, and discuss the effect of performance gains in various subsystems of the robot.
Benjamin Lee and David Brooks. 9/2010. “Applied inference: Case studies in microarchitectural design.” ACM Transactions on Architecture and Code Optimization (TACO), 7, 2, Pp. 8.

We propose and apply a new simulation paradigm for microarchitectural design evaluation and optimization. This paradigm enables more comprehensive design studies by combining spatial sampling and statistical inference. Specifically, this paradigm (i) defines a large, comprehensive design space, (ii) samples points from the space for simulation, and (iii) constructs regression models based on sparse simulations. This approach greatly improves the computational efficiency of microarchitectural simulation and enables new capabilities in design space exploration.

We illustrate new capabilities in three case studies for a large design space of approximately 260,000 points: (i) Pareto frontier, (ii) pipeline depth, and (iii) multiprocessor heterogeneity analyses. In particular, regression models are exhaustively evaluated to identify Pareto optimal designs that maximize performance for given power budgets. These models enable pipeline depth studies in which all parameters vary simultaneously with depth, thereby more effectively revealing interactions with nondepth parameters. Heterogeneity analysis combines regression-based optimization with clustering heuristics to identify efficient design compromises between similar optimal architectures. These compromises are potential core designs in a heterogeneous multicore architecture. Increasing heterogeneity can improve bips³/w efficiency by as much as 2.4×, a theoretical upper bound on heterogeneity benefits that neglects contention between shared resources as well as design complexity. Collectively, these studies demonstrate regression models' ability to expose trends and identify optima in diverse design regions, motivating the application of such models in statistical inference for more effective use of modern simulator infrastructure.

Vijay Reddi, Simone Campanoni, Meeta Gupta, Michael Smith, Gu Wei, David Brooks, and Kim Hazelwood. 9/2010. “Eliminating voltage emergencies via software-guided code transformations.” ACM Transactions on Architecture and Code Optimization (TACO), 7, 2, Pp. 1-28.
In recent years, circuit reliability in modern high-performance processors has become increasingly important. Shrinking feature sizes and diminishing supply voltages have made circuits more sensitive to microprocessor supply voltage fluctuations. These fluctuations result from the natural variation of processor activity as workloads execute, but when left unattended, these voltage fluctuations can lead to timing violations or even transistor lifetime issues. In this paper, we present a hardware-software collaborative approach to mitigate voltage fluctuations. A checkpoint-recovery mechanism rectifies errors when voltage violates maximum tolerance settings, while a run-time software layer reschedules the program’s instruction stream to prevent recurring violations at the same program location. The run-time layer, combined with the proposed code rescheduling algorithm, removes 60% of all violations with minimal overhead, thereby significantly improving overall performance. Our solution is a radical departure from the ongoing industry standard approach to circumvent the issue altogether by optimizing for the worst case voltage flux, which compromises power and performance efficiency severely, especially looking ahead to future technology generations. Existing conservative approaches will have severe implications on the ability to deliver efficient microprocessors. The proposed technique reassembles a traditional reliability problem as a runtime performance optimization problem, thus allowing us to design processors for typical case operation by building intelligent algorithms that can prevent recurring violations.
Benton Calhoun and David Brooks. 7/2010. “Can Subthreshold and Near-Threshold Circuits Go Mainstream?” Micro, IEEE, 30, 4, Pp. 80–85.
Recent research has shown the potential benefits of subthreshold or near-threshold operation, which gives up a substantial degree of speed in order to reduce energy per operation. This is an excellent trade-off for many tasks, such as cyberphysical systems. This prolegomenon summarizes the benefits and challenges of subthreshold or near-threshold operation.
Yakun Sophia Shao, Judson Porter, Michael Lyons, Gu-Yeon Wei, and David Brooks. 7/2010. “Power, Performance and Portability: System Design Considerations for Micro Air Vehicle Applications.” Sixth International Summer School on Advanced Computer Architecture and Compilation for Embedded Systems (ACACES).
Recent years have seen an increased interest in Micro Air Vehicles (MAVs) with applications ranging from search-and-rescue to mimicking insect behavior. MAVs have several challenging design requirements that impact processor design. These include real time processing demands and severe power/weight budgets. In this paper, we describe the characteristics of MAV applications and propose hardware acceleration to improve the power, performance, and portability of MAV system designs.
Xiaoyao Liang, David Brooks, and Gu Wei. 2/3/2010. “Process variation tolerant circuit with voltage interpolation and variable latency.” United States of America.
A circuit having dynamically controllable power. The circuit comprises a plurality of pipelined stages, each of the pipelined stages comprising two clocking domains, a plurality of switching circuits, each switching circuit being connected to one of the pipelined stages, first and second power sources connected to each of the plurality of pipelined stages through the switching circuits, the first power source supplying a first voltage and the second power source supplying a second voltage, wherein the first and second power sources each may be applied to a pipelined stage independently of other pipelined stages, first and second complementary clocks, and a plurality of latches connected to the first and second complementary clocks and to the plurality of pipelined stages for providing latch-based clocking to control the first and second clocking domains and to enable time-borrowing across the plurality of switching circuits. The first voltage differs from the second voltage and the plurality of pipelined stages interpolates between the first and second voltages to provide differing effective voltages between the first and second voltages.
Michael Lyons, Mark Hempstead, Gu Wei, and David Brooks. 2/2010. “The Accelerator Store framework for high-performance, low-power accelerator-based systems.” IEEE Computer Architecture Letters, 9, 2, Pp. 53-56.
Hardware acceleration can increase performance and reduce energy consumption. To maximize these benefits, accelerator-based systems that emphasize computation on accelerators (rather than on general purpose cores) should be used. We introduce the “accelerator store,” a structure for sharing memory between accelerators in these accelerator-based systems. The accelerator store simplifies accelerator I/O and reduces area by mapping memory to accelerators when needed at runtime. Preliminary results demonstrate a 30% system area reduction with no energy overhead and less than 1% performance overhead in contrast to conventional DMA schemes.
Vijay Reddi, Meeta Gupta, Glenn Holloway, Michael Smith, Gu Wei, and David Brooks. 1/2010. “Predicting voltage droops using recurring program and microarchitectural event activity.” IEEE Micro, 30, 1.
Shrinking feature size and diminishing supply voltage are making circuits more sensitive to supply voltage fluctuations within a microprocessor. If left unattended, voltage fluctuations can lead to timing violations or even transistor lifetime issues. A mechanism that dynamically learns to predict dangerous voltage fluctuations based on program and microarchitectural events can help steer the processor clear of danger.