ECEn 424 Homework Set #8
Upload a PDF file containing your solutions to the problems below to
Learning Suite before 11:00pm on the assigned date.
- Problem 6.26
- Problem 6.27
- Problem 6.29
- Problem 6.38
- Problem 6.39
- Problem 424-7:
Get the tar file cacheperf.tar, which contains
source code like that which you just analyzed for problems 6.38
and 6.39 from the text. Let's see what we can learn from running
similar code on a real machine. Typing "make" in the cacheperf
directory should produce a "cacheperf" executable that you can
run. Look through the source code of "cacheperf.c" to see what it
does. Note that, for each of the three code portions measured, the
program outputs both raw cycle counts and summaries of relative run
time, where the baseline is the run time with a warm cache and
good locality. The results obtained by running the program should
give us insight into (a) performance differences between cold and
warm caches, and (b) performance differences between access patterns
with good and poor locality. You should recompile and run the
program for a wide variety of SQSIZE values (set in
"cacheperf.c"). Include one or more tables of results, and write a
paragraph that answers the questions below, along with anything else
you observed that is noteworthy.
- Unlike most other measurement code we've used this semester,
the results this program reports are based on single
measurements. (It would be tricky to repeat cold-cache performance
measurements.) How consistent are the results of this program if you
run it many times?
- For what values of SQSIZE is the difference in performance
the greatest between having a warm cache and a cold cache? Explain
your results.
- For what values of SQSIZE is the difference in performance
the greatest between reference sequences with good and poor
locality? Explain your results.
- How do your results change if you edit the Makefile and
change the optimization level used to compile cacheperf.c to
"-O3"? Explain the differences you observe. (You might glance at
the assembly output of the inner loops to see how the compiler
optimized the resulting instructions.)
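
As a reminder of what "good" and "poor" locality look like in code, here
is a minimal sketch. It is not the code in cacheperf.c: the array name
grid, the functions fill_grid, sum_rowwise and sum_colwise, and the
SQSIZE value of 512 are all assumptions made up for illustration. Both
functions visit every element of a SQSIZE x SQSIZE array and compute the
same sum; only the order of the memory references differs.

```c
#include <stddef.h>

#define SQSIZE 512  /* assumed value; the assignment asks you to vary it */

/* Hypothetical SQSIZE x SQSIZE array, stored in row-major order as C requires. */
static int grid[SQSIZE][SQSIZE];

/* Fill every element with the same value so both sums are easy to check. */
void fill_grid(int value)
{
    for (size_t i = 0; i < SQSIZE; i++)
        for (size_t j = 0; j < SQSIZE; j++)
            grid[i][j] = value;
}

/* Good locality: the inner loop walks along a row, so consecutive
   references touch adjacent elements (stride of one element). */
long sum_rowwise(void)
{
    long sum = 0;
    for (size_t i = 0; i < SQSIZE; i++)
        for (size_t j = 0; j < SQSIZE; j++)
            sum += grid[i][j];
    return sum;
}

/* Poor locality: the inner loop walks down a column, so consecutive
   references are SQSIZE elements apart in memory. */
long sum_colwise(void)
{
    long sum = 0;
    for (size_t j = 0; j < SQSIZE; j++)
        for (size_t i = 0; i < SQSIZE; i++)
            sum += grid[i][j];
    return sum;
}
```

Calling fill_grid(1) and then either sum function should return
SQSIZE * SQSIZE. How much slower the column-wise version runs, if at
all, depends on how the array size compares with the cache sizes of your
machine, which is the kind of effect the questions above ask you to
explain.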
Clarifications
No programming or source code is required for the problems
from the text.
Problem 424-7: No source code need be included in your
submission. Think through the results you report until you feel that
you can explain them.
Apparently you have to make a slight modification to the code to get
it to work on a Mac. In the timer.c file (included in cacheperf.tar),
insert the word "volatile" immediately after "asm" in the
access_counter function. With this fix, it is reported that you can
use higher levels of optimization, contrary to the comments in
timer.c.
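
To show where the word "volatile" goes, here is a sketch of what a
cycle-counter read with the fix applied might look like. It is an
assumption about the general shape of such a function, not a copy of
timer.c: the real access_counter may use different operands and
constraints, and the helper cycles_from_halves is made up for this
example. Without "volatile", GCC and Clang are allowed to treat an asm
statement that has outputs but no inputs as if it always produced the
same result, so at higher optimization levels two reads of the counter
can be merged into one or moved. That is a plausible reason the fix
makes higher optimization levels usable.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* Sketch only: read the x86 time stamp counter. rdtsc leaves the high
   32 bits in EDX and the low 32 bits in EAX. "__asm__" is the same
   keyword as "asm", spelled so it also compiles in strict ISO C modes.
   "volatile" tells the compiler not to delete, merge, or hoist this read. */
void access_counter(unsigned *hi, unsigned *lo)
{
    __asm__ volatile("rdtsc" : "=d" (*hi), "=a" (*lo));
}
#else
/* Placeholder so this sketch still compiles on other architectures;
   it does not read any real counter. */
void access_counter(unsigned *hi, unsigned *lo)
{
    *hi = 0;
    *lo = 0;
}
#endif

/* Hypothetical helper: combine the two 32-bit halves into one 64-bit count. */
uint64_t cycles_from_halves(unsigned hi, unsigned lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

In your copy of timer.c the only change needed is the one described
above: add "volatile" right after "asm" in access_counter and leave the
rest of the statement as it is.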
Last updated 1 April 2019