ECEn 324 Homework Set #3

Submit your (hardcopy) solutions to the problems below in the homework box by 5:00 PM on the assigned date.

  1. Problem 3.56
  2. Problem 3.59
  3. Problem 3.69
  4. Problem 4.43
  5. Problem 4.45
  6. Problem 4.47
  7. You are to determine how big stacks can grow to be.

    For part A, write C code that calls a single function with a large stack-allocated array, and see how big that array can be.

    For part B, write C code that calls a function recursively, and see what the maximum recursion depth is.

    For both parts A and B, what error message do you get when you overflow the stack? Do you get different results if you run on a different machine (different OS or different version of Linux)? Do you get different results compiling for the IA32 ("-m32") or for the x86-64 ISA (with "-m64")?
    If you are running the bash shell (pretty common on Linux and Macs), you can find the default stack size (and other system information) by typing "ulimit -a". (If you are running the tcsh shell, there is a "limit" command that gives similar but not identical functionality.) Does the listed size agree with the results of your experiments? The "ulimit" command also allows you to set the stack size with the "-s" option. (You can readily find the details with an internet search.) Try your experiments again after increasing the stack size. Did it do what you expected? (If you can't get the ulimit/limit commands to work, make a note of this in your submission.)
    Finally, try compiling with the "-Os" optimization flag for gcc. (It tells the compiler to optimize for size but not at the expense of speed.) See if this affects the recursion depth you observed for a given stack size.
    The code here is intended to serve as a starting point. Make sure you understand what it is doing if you use it. Note that it just tries sizes and recursion depths that are powers of two. You'll have to plug different values in for N and M to see where it breaks.

Clarifications

Problem 3.56: Your submission should include answers to parts A through E for the problem from the book. For part F, create a file with your best guess as to the original loop() function and compile it to assembly (use the "-S" and "-m32" flags, and use different optimization levels to match the assembly code in the book as closely as possible). Your submission should include your C source file, the resulting assembly code for the function, and the compiler options you used. (You don't have to run your code or show any execution results.) As a convenience, you can get the assembly code to annotate here if desired.

Problem 3.59: You should compile the C code for switch_prob() and include the assembly code that resulted. Try a number of compiler options to come as close as you can to the assembly language in the book. Your submission should include your C source file for the function, the resulting assembly code, the compiler options you used for gcc, and the machine you ran it on. (As of Jan 2011, one sees very different code generated by gcc under linux vs. gcc under MacOS X for this code. Keep this in mind if you are completing the assignment using a Mac.) You can get the assembly code to annotate here if desired.

Problem 3.69: You should compile the C code you come up with for the trace() function, and you should try to match the x86-64 assembly code in the book. Your submission should include your C source code, the resulting assembly, and the compiler options you used to get that assembly code.

Problem 4.43: No C code or execution results are required.

Problem 4.45: Let's start with a big-picture observation: the easiest way to get from any C program to a working Y86 version of the same is to (a) alter the C code so it maps to the Y86 instruction set more efficiently, and then (b) hand translate the x86 assembly code that GCC produces into Y86 assembly code. Part A of this problem corresponds to (a) above: by revising the code to use pointers rather than array accesses, GCC's output will be closer to what you can do in the Y86 instruction set. When you think you have a pointer version that works, naturally you'll want to test it to ensure that the two versions of the code are equivalent before going on to the next step. Use something along the lines of the test code available here to make sure that both versions of the bubble sort function are equivalent. Your submission should include not only the C code in your new version of the function, but results from testing that show that the functions are equivalent.

The next step is to compile your source code (using "-S" and "-m32" arguments to GCC) and hand translate the bubble_p() function to Y86 code. (Experiment with different optimization levels and choose a version of assembly code that will require the least amount of work to turn into Y86 code.) Use the Y86 code framework here so you can run and test your solution using Y86 simulation tools. (Note that the data to be sorted in this code is the same as in the C test code above.)

Finally, run your code using the yas and yis programs (a Y86 assembler and instruction set emulator, respectively) that can be obtained by clicking here. Place the file tools.tar into a newly created directory and type "tar xvf *". This will create a tools subdirectory and place in it all the files required to build yas and yis. Now type "cd tools" and then type "make" and new yis and yas executables should be built. If your Y86 code is in the file bubble.ys in the parent directory, you can assemble it by typing "tools/yas bubble.ys" and then you should run the resulting object file on the simulator by typing "tools/yis bubble.yo". (Check the README file in the tools directory for a bit more information about the yas and yis programs.) In creating any Y86 program for the yis simulator, bear in mind that the largest allowed memory address is 0x0FFFF.

Make sure the results of execution are as you expected. Include the simulator output from the execution of your program in your submission with some brief annotation that explains how you know it is correct.

Problem 4.47: No C code or execution results are required.

Problem 7 (determining the stack size): Include the source code you used and some screen shots or results from running it with a few different values. Write a paragraph for each part (A and B) explaining what you discovered. Are the results for both parts consistent with each other? Are results consistent for different operating systems and for both the IA32 and x86-64? Be sure to mention the impact of the -Os compiler flag and changing the stack size with ulimit.


Last updated 2 May 2013