ECEn 324 Homework Set #5
Submit your (hardcopy) solutions to the problems below in the homework
box by 5:00 PM on the assigned date.
- Click here to get the source
code for a simple but incomplete cache simulator. Your assignment
is to provide the missing code at each of the 3 points marked by the
comment "// Your code goes here". You should be able to figure out
what is needed by studying the remaining code and the comments. Test
your code by simulating the cache with the default cache
configuration (2-way associative, 128-byte cache, 4-byte blocks) and
the default reference sequence in the cachesim.c file. The cache is
small enough and the reference sequence short enough that you should
be able to hand-verify the correctness of your results. Once
you are confident that your code works on the initial test case, run
the simulator on two additional cache configurations and reference
sequences of your choosing. Imagine that you are part of a team
tasked to test this code. Pick two quite different cache
configurations, and for each carefully devise a reference sequence
that you think constitutes a good test of its correctness. (The
reference sequences need not be lengthy -- they don't need to exceed
about 20 references each.) If you find a bug, be sure to report it!
- Problem 6.39 from the text.
- Problem 6.40 from the text.
- Click here to get a tar file
with source code similar to the code you just analyzed for problems
6.39 and 6.40 from the text. Let's see what we can learn from running
this code on a real machine. Typing "make" in the cacheperf
directory should produce a "cacheperf" executable that you can
run. Look through the source code of "cacheperf.c" to see what it
does. Note that, for each of the three code portions measured, the
program outputs both raw cycle counts and summaries of relative run
time, where the baseline is the run time with a warm cache and
good locality. The results obtained by running the program should
give us insight into (a) performance differences between cold and
warm caches, and (b) performance differences between access patterns
with good and poor locality. You should recompile and run the
program for a wide variety of SQSIZE values (defined in
"cacheperf.c"). Include one or more tables of results, and write a
paragraph that answers the questions below, along with anything else
you observed that is noteworthy.
- Unlike most other measurement code we've used this semester,
the results this program reports are based on single
measurements. (It would be tricky to repeat cold cache performance
measures.) How consistent are the results of this program if you
run it many times?
- For what values of SQSIZE is the difference in performance
the greatest between having a warm cache and a cold cache? Explain
your results.
- For what values of SQSIZE is the difference in performance
the greatest between reference sequences with good and poor
locality? Explain your results.
- How do your results change if you edit the Makefile and
change the optimization level used to compile cacheperf.c to
"-O3"? Explain the differences you observe. (You might glance at
the assembly output of the inner loops to see how the compiler
optimized the resulting instructions.)
- Create a memory mountain for at least one machine that you have
access to. Click here to get a
tar file for code to compute the memory mountain. The code should
compile and run on most Linux and Macintosh systems. Look through
the code in mountain.c to see how it works. When you run the
program, it should create a "mountain.res" file of resulting
throughput values that you can examine. The tar file includes a
"mountain.xlsx" Excel file set up to compute the memory mountain
surface. On a machine with Excel (you can use the college machines
in the computer room), open the "mountain.res" file, then copy and
paste the values into the worksheet in "mountain.xlsx".
Write a paragraph about your results that answers the questions
below, along with any other observations you'd care to make.
- How do your results vary if you edit the Makefile and change
the optimization level that mountain.c is compiled with? Try a few
experiments and report on the differences.
- How does your memory mountain compare with the one in the
text from the Intel Core i7? (Compare both overall shape and
actual throughput measures.)
- What information about your test system can be deduced from
the memory mountain? (Compare with information about the measured
system -- check /proc/cpuinfo on a Linux machine, for example.)
- Problem 7.12
- Let's try to construct an example where the actions of the
linker create an "insidious run-time bug" (p. 665). Start with the
code in this tar file. The
Makefile compiles files "intdblconf1.c" and "intdblconf2.c" to
produce the executable "intdblconf". As the file names suggest, the
two files have conflicting definitions for a particular variable,
declared in one file as an int and in the other as a double. Look
through the source code, see what it does, and what the output
means. Edit the source code and change the initialization of
variable x to experiment with different weak and strong symbol
scenarios. Try at least one different computer platform if you can,
and experiment with different optimization levels in the
Makefile. Also, make particular note of warning or error messages
you get from the linker. Write a paragraph that summarizes what you
found. What is the nature of the insidious run-time bug you
discovered, and what is required to produce it? Does the linker
help us out in avoiding these kinds of problems with a suitable
warning or error message?
For problem 1, include the source code from your main()
function (with your code additions). Include the output of the
simulator for the initial cache configuration. For each of your two
new configurations, include the program output detailing the cache
parameters, the reference sequence, and the outcome. For each
configuration, say something about how you chose the reference
sequence, the characteristics of your cache organizations that they
test, and how you know that the simulator is correct.
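When hand-verifying a reference sequence, it helps to work out which
set and tag each address maps to. The following is a minimal sketch of
that arithmetic, not code from cachesim.c: the struct and function
names are hypothetical, and it assumes that the block size times the
associativity evenly divides the cache size.

```c
#include <stdint.h>

/* Hypothetical helper type; these names are not taken from cachesim.c. */
struct cache_addr {
    uint32_t offset;  /* byte offset within the block */
    uint32_t set;     /* index of the set the block maps to */
    uint32_t tag;     /* remaining high-order part of the block number */
};

/* Split an address for a cache with cache_bytes of total capacity,
 * block_bytes per block, and `ways` blocks per set.
 * Assumes all three are nonzero and that block_bytes * ways
 * evenly divides cache_bytes. */
struct cache_addr split_address(uint32_t addr, uint32_t cache_bytes,
                                uint32_t block_bytes, uint32_t ways)
{
    uint32_t num_sets = cache_bytes / (block_bytes * ways);
    uint32_t block_number = addr / block_bytes;
    struct cache_addr out;

    out.offset = addr % block_bytes;
    out.set = block_number % num_sets;
    out.tag = block_number / num_sets;
    return out;
}
```

For the default configuration (128-byte cache, 4-byte blocks, 2-way
associative) there are 16 sets, so split_address(0x1A7, 128, 4, 2)
gives offset 3, set 9, and tag 6. Addresses 0, 64, and 128 all map to
set 0 with tags 0, 1, and 2; three distinct blocks competing for one
2-way set cannot all stay resident, which makes such a sequence a
useful test of replacement behavior.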
For problems 2 and 3 (6.39 and 6.40), no programming or source code is required.
For problem 4, no source code need be included in your
submission. Think through the results you report until you feel that
you can explain them.
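To make "good" versus "poor" locality concrete, here is a small
sketch of two traversals of a square array. It is an illustration
only and is not the code in cacheperf.c: the SQSIZE value, the int
element type, and all function names are assumptions chosen for the
example.

```c
#include <stddef.h>

#define SQSIZE 64  /* assumed value, for illustration only */

int square[SQSIZE][SQSIZE];

/* Fill the array with a simple, easily checked pattern. */
void fill_square(void)
{
    for (size_t i = 0; i < SQSIZE; i++)
        for (size_t j = 0; j < SQSIZE; j++)
            square[i][j] = (int)(i + j);
}

/* Good spatial locality: the inner loop walks consecutive
 * addresses, because C stores arrays in row-major order. */
long sum_row_major(void)
{
    long sum = 0;
    for (size_t i = 0; i < SQSIZE; i++)
        for (size_t j = 0; j < SQSIZE; j++)
            sum += square[i][j];
    return sum;
}

/* Poor spatial locality: the inner loop jumps a full row
 * (SQSIZE * sizeof(int) bytes) between consecutive accesses. */
long sum_col_major(void)
{
    long sum = 0;
    for (size_t j = 0; j < SQSIZE; j++)
        for (size_t i = 0; i < SQSIZE; i++)
            sum += square[i][j];
    return sum;
}
```

Both functions compute the same sum; only the order in which memory is
touched differs, so any difference in measured run time comes from the
memory system rather than from the amount of arithmetic.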
For problem 5, no source code is required in your submission, but you
should include a printout of the memory mountain (the graph from the
Excel spreadsheet) along with your written comments about that surface
and what you can deduce from it. Even though this version of the
memory mountain code makes repeated measurements of each data point,
the accuracy of the results can vary widely with the system load; try
to run it on a machine that is otherwise mostly idle. (Try running
"top" to check the load average and CPU-intensive processes that might
be running in the background.) Finally, you really should try this on
different types of machines if you can -- the differences in the
surface can be startling. Mention in your report if you made memory
mountains on multiple platforms.
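Each point on a memory mountain comes from reading a working set of a
given size with a given stride and converting the elapsed time to a
throughput. The sketch below shows that idea in its simplest form. It
is not the code in mountain.c: the function names are hypothetical,
timing is left to the caller, and 1 MB is taken to mean 10^6 bytes.

```c
#include <stddef.h>

/* Sum every stride-th element of data[0 .. elems-1].
 * stride must be nonzero. */
long strided_sum(const long *data, size_t elems, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < elems; i += stride)
        sum += data[i];
    return sum;
}

/* Number of elements the loop above actually reads. */
size_t elements_touched(size_t elems, size_t stride)
{
    return (elems + stride - 1) / stride;
}

/* Convert bytes read and elapsed seconds to MB/s (1 MB = 10^6 bytes).
 * seconds must be positive. */
double throughput_mb_per_s(size_t bytes_read, double seconds)
{
    return ((double)bytes_read / 1e6) / seconds;
}
```

A caller would time strided_sum() for each (size, stride) pair and
pass elements_touched(elems, stride) * sizeof(long) as bytes_read.
Returning the sum gives the loop an observable result, which makes it
harder for an optimizing compiler to discard the reads.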
For problem 6 (7.12), the problem asks you to identify the line
number, the address of the starting byte the label represents, and the
value contained in memory starting at that location as seen from the
linked file (Fig 7.10). No programming or source code is required for
problem 7.12.
For problem 7, include source code and a screenshot of the output that
clearly shows one instance where the bug arises, and explain the bug
in your writeup.
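The underlying hazard can be imitated in a single file, without the
linker, by writing a double's bytes over storage that holds two ints.
The sketch below is only an analogy to what can happen across
"intdblconf1.c" and "intdblconf2.c", not their actual contents: the
struct and function names are hypothetical, and it assumes a 4-byte
int, an 8-byte IEEE 754 double, and no padding between the two ints.

```c
#include <string.h>

/* Stand-in for two int globals that happen to sit side by side
 * in memory. */
struct adjacent_ints {
    int x;
    int y;
};

/* Mimic code that believes x is a double: copy all sizeof(double)
 * bytes to the address of x. Returns 0 on success, or -1 if the size
 * assumption stated above does not hold on this platform. */
int store_as_double(struct adjacent_ints *g, double value)
{
    if (sizeof *g != sizeof value)
        return -1;
    memcpy(g, &value, sizeof value);
    return 0;
}
```

Starting from x = 15212 and y = 15213, storing -0.0 this way leaves
neither variable with its old value. On a little-endian machine x
becomes 0 and y becomes 0x80000000, even though the code never names
y.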
Last updated 2 May 2013