David Brooks, Margaret Martonosi, John Wellman, and Pradip Bose. 12/2001. “
Power-performance modeling and tradeoff analysis for a high end microprocessor.” Power-Aware Computer Systems, Pp. 126–136.
Abstract: We describe a new power-performance modeling toolkit, developed to aid in the evaluation and definition of future power-efficient PowerPC™ processors. The base performance models in use in this project are: (a) a fast but cycle-accurate, parameterized research simulator and (b) a slower, pre-RTL reference model that models a specific high-end machine in full, latch-accurate detail. Energy characterizations are derived from real, circuit-level power simulation data. These are then combined to form higher-level energy models that are driven by microarchitecture-level parameters of interest. The overall methodology allows us to conduct power-performance tradeoff studies in defining the follow-on design points within a given product family. We present a few experimental results to illustrate the kinds of tradeoffs one can study using this tool.
Russ Joseph, David Brooks, and Margaret Martonosi. 8/2001. “
Live, runtime power measurements as a foundation for evaluating power/performance tradeoffs.” Workshop on Complexity-Effective Design (WCED), held in conjunction with ISCA, 28.
Abstract: Of the many ways one could gauge the complexity-effectiveness of a design or design element, one candidate approach is to consider a design's power/performance tradeoffs. This paper describes our early-stage results in a broad effort to evaluate the power-performance tradeoffs of a range of benchmarks and microarchitectures. In particular, this paper presents power data collected on-the-fly on real x86 machines as they execute carefully constructed microbenchmarks. The microbenchmarks exercise aspects of the system such as data cache and branch predictor. They are parametrically variable to consider how load dependence, cache miss rate, branch mispredict rate, and branch distance all impact the power and performance of a CPU. For example, from these experiments, we learn that CPU performance increases essentially monotonically with cache hit rate, while CPU power encounters a maximum at roughly 80-90% cache hit rates. Likewise, we show results demonstrating that performance-neutral issues such as bit populations in the data cache values can display interesting power trends. While the experimental results are preliminary, we feel that the techniques described in this paper will offer a useful foundation for a broad range of power/performance tradeoffs.
Alper Buyuktosunoglu, Stanley Schuster, David Brooks, Pradip Bose, Peter Cook, and David Albonesi. 6/11/2001. “
An adaptive issue queue for reduced power at high performance.” Power-Aware Computer Systems, Pp. 25–39.
Abstract:
Increasing power dissipation has become a major constraint for future performance gains in the design of microprocessors. In this paper, we present the circuit design of an issue queue for a superscalar processor that leverages transmission gate insertion to provide dynamic low-cost configurability of size and speed. A novel circuit structure dynamically gathers statistics of issue queue activity over intervals of instruction execution. These statistics are then used to change the size of an issue queue organization on-the-fly to improve issue queue energy and performance. When applied to a fixed, full-size issue queue structure, the result is up to a 70% reduction in energy dissipation. The complexity of the additional circuitry to achieve this result is almost negligible. Furthermore, self-timed techniques embedded in the adaptive scheme can provide a 56% decrease in cycle time of the CAM array read of the issue queue when we change the adaptive issue queue size from 32 entries (largest possible) to 8 entries (smallest possible in our design).