## Cache design tradeoffs using SimpleScalar and CACTI

Prepared by:

Abeer Hyari



#### Outline

- Design options
- Simulator
- Workloads
- Results
- Conclusions

## Design options

My simulation is in three parts:

- I. Examine the effect of changing the L1 data and instruction caches on miss rate, access time, and power consumption.
- II. Examine the effect of changing the associativity of the L1 data cache on miss rate, access time, and power consumption.
- III. Examine the effect of the memory hierarchy on the miss penalty.

## Simulator program

- The SimpleScalar 3.0 simulator suite includes a wide range of simulation tools, from simple functional (instruction only, no timing) simulators to detailed performance (instruction plus timing) simulators.
- sim-cache: a multi-level cache simulator.
- CACTI 4.1 ("Cache Access and Cycle Timing Information") is an integrated model of cache and memory access time, cycle time, area, leakage, and dynamic power.
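To make concrete what a cache simulator measures, here is a minimal single-level sketch in Python. It is not SimpleScalar's implementation: the function name `miss_rate`, the LRU policy, and the eight-access trace are assumptions chosen for illustration only.

```python
from collections import OrderedDict


def miss_rate(addresses, nsets, block_size, assoc):
    """Simulate one set-associative LRU cache and return misses / accesses."""
    # One OrderedDict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(nsets)]
    misses = 0
    for addr in addresses:
        block = addr // block_size      # block address
        index = block % nsets           # set chosen by the low-order block bits
        tag = block // nsets            # remaining bits identify the block
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)      # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= assoc:
                lines.popitem(last=False)  # evict the least recently used block
            lines[tag] = True
    return misses / len(addresses)


# Hypothetical trace: two addresses that map to the same set.
trace = [0, 256, 0, 256, 0, 256, 0, 256]

# Both caches hold 256 bytes; only the associativity differs.
direct_mapped = miss_rate(trace, nsets=8, block_size=32, assoc=1)
two_way = miss_rate(trace, nsets=4, block_size=32, assoc=2)
print(direct_mapped, two_way)
```

On this toy trace the direct-mapped cache misses on every access because the two blocks keep evicting each other, while the 2-way cache misses only on the first touch of each block. Real benchmarks behave far less extremely, which is why the measurements are needed.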

## Workloads

- The subset of SPEC CPU2000 benchmarks used in this study had been compiled into PISA binaries, which I downloaded from [5]; the inputs for these binaries were downloaded from [8].

| SPEC CPU2000 floating point | SPEC CPU2000 integer |
|-----------------------------|----------------------|
| ammp, equake                | parser, bzip2        |

## Results (I)

I examined the effect of changing the L1 data cache on miss rate, access time, and power consumption, using one of my benchmarks (ammp).







[Figure: total dynamic read power (W)]

## Results (II)

I changed the cache parameters to configure the cache as direct-mapped (1-way), 2-way, 4-way, and 8-way set associative. I then re-ran sim-cache on my four SPEC CPU2000 benchmarks and ran CACTI 4.1 to measure the access time and power for each configuration.
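The associativity sweep can be sketched as a small Python helper that builds cache descriptions in the `<name>:<nsets>:<bsize>:<assoc>:<repl>` form that, as far as I recall, sim-cache accepts for its `-cache:dl1` option (check `sim-cache -h` before relying on it). The helper name `cache_config`, the 8 KB capacity, and the 32-byte block size are assumptions for illustration, not the values used in this study.

```python
def cache_config(name, size_bytes, block_size, assoc, repl="l"):
    """Build a '<name>:<nsets>:<bsize>:<assoc>:<repl>' cache description."""
    # Capacity is fixed, so the number of sets shrinks as associativity grows.
    nsets, remainder = divmod(size_bytes, block_size * assoc)
    if remainder or nsets == 0:
        raise ValueError("size must be a positive multiple of block_size * assoc")
    return f"{name}:{nsets}:{block_size}:{assoc}:{repl}"


# Assumed values: an 8 KB L1 data cache with 32-byte blocks and LRU ("l").
configs = [cache_config("dl1", 8 * 1024, 32, ways) for ways in (1, 2, 4, 8)]
for config in configs:
    print("-cache:dl1", config)
```

Holding the capacity constant while varying the number of ways keeps the comparison fair: any change in miss rate then comes from associativity alone rather than from a larger cache.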



[Figure: plot legend showing the 1-way and 2-way configurations]

## Results (III)

- Unfortunately, I could not examine the effect of the memory hierarchy on miss penalty, because CACTI 4.1 does not support multi-level caches.
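For reference, the quantity this part was meant to study can be written with the standard average memory access time (AMAT) formulation from [1]. The symbol names below are my own notation, and no measured values are implied.

$$
\text{AMAT} = \text{HitTime}_{L1} + \text{MissRate}_{L1} \times \text{MissPenalty}_{L1}
$$

With a second-level cache, the L1 miss penalty is itself the access time of the next level:

$$
\text{MissPenalty}_{L1} = \text{HitTime}_{L2} + \text{MissRate}_{L2} \times \text{MissPenalty}_{L2}
$$

Here $\text{MissRate}_{L2}$ is the local miss rate of L2 (L2 misses divided by L2 accesses). Evaluating these expressions needs hit times for every level, which is where a timing model for each cache in the hierarchy would be required.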

## Conclusions

- As the gap between processor speed and memory speed widens, it is very important to consider the power and performance tradeoffs of cache memories.
- Memory hierarchy design is based on three important principles:
  - I. Make the common case fast.
  - II. Principle of locality.
  - III. Smaller is faster.

#### Future work

More accurate performance prediction will be possible by combining a multi-level cache simulator with a power model, so that the miss penalty can be calculated effectively.

#### Thank you

## References

- [1] John L. Hennessy, David A. Patterson. "Computer Architecture: A Quantitative Approach", 4th Edition.
- [2] http://en.wikipedia.org/wiki/CPU_cache
- [3] Todd Austin, Eric Larson, Dan Ernst. "SimpleScalar: An Infrastructure for Computer System Modeling". IEEE Computer, February 2002.
- [4] http://www.simplescalar.com
- [5] http://www.simplescalar.com/benchmarks.html
- [6] http://www.hpl.hp.com/research/cacti/
- [7] David Tarjan, Shyamkumar Thoziyoor, Norman P. Jouppi. "CACTI 4.0". HP Laboratories Palo Alto, HPL-2006-86.
- [8] http://students.cs.tamu.edu/baiksong/teaching/cpsc614/spec2000args.tgz
- [9] http://www.spec.org/cpu2000/CFP2000/188.ammp/docs/188.ammp.html
- [10] http://www.spec.org/cpu2000/CINT2000/197.parser/docs/197.parser.html
- [11] http://www.spec.org/cpu2000/CINT2000/256.bzip2/docs/256.bzip2.html
- [12] http://www.spec.org/cpu2000/CFP2000/183.equake/docs/183.equake.html