COMP 530: Lab 2: Malloc/Free

In this homework, you will become familiar with how memory allocators, such as malloc and free, work. You will implement a simple, single-threaded malloc and free for small objects. This will be a real implementation, in that it will actually work as a drop-in replacement for the libc malloc and free implementations on classroom.cs.unc.edu (at least for some programs).

This assignment does not involve writing that many lines of code (probably hundreds), but the hard part is figuring out the few lines of delicate code to write, as well as writing careful unit tests for each step. We strongly recommend starting early and writing many test cases.

Getting the starter code

You will need to click on this link to create a private repository for your code.

Once you have a repository, you will need to clone your private repository (see the URL under the green "Clone or Download" button, after selecting "Use SSH"). For instance, if your private repo is called lab2-team-don:

We provided you with some "skeleton code" for the malloc and free implementation. The baseline code defines all of the data structures you will need to implement malloc/free, as well as provides an outline of how you might implement these functions. Note that you may, of course, start from scratch or use any structure you like; however, you will probably find following this skeleton to be more productive.

To build the code and a very, very simple test case, type make. You will get some warnings about undefined variables; these are hints for you. By the time you are finished, you should not have any warnings. The compiler will generate two binary files: th_alloc.so (the "Tar Heel Allocator", i.e., your malloc and free implementations), and test, a simple demo application.

To use your th_alloc.so library, we will use a directive called LD_PRELOAD, which basically steals spots in your application's dynamic linking jump tables before libc is loaded. The other functions will still use libc as normal. We provide test to illustrate how this should be used. It attempts to malloc one buffer, and then print the address of the buffer. In th_alloc.c, malloc is not implemented, and returns NULL. In libc, malloc returns a sensible buffer address.

In principle, th_alloc should be usable with any Linux application. In practice, once finished, it will work with a limited range of applications. In the interest of keeping the assignment simple, we are not going to support: multiple threads (there is a bogus stub for pthread_create to make any threaded application die immediately), allocations larger than 2K, and several variants of malloc that some applications use. Nonetheless, this lab should at least work with a wide range of test cases you write, as well as some simple Linux applications. As a challenge problem, you are welcome to make this library more robust.

Helpful References

Exercise 0. (0 points) Read this paper (off-campus link) describing Hoard. Don't be discouraged if you don't understand all aspects of the paper just yet. But a general sense of their design will be useful in understanding the C code provided to you.

Then watch the make-up lecture video.

Do this before you start the assignment.

Debugging

Now is probably a very good time to familiarize yourself with gdb, or another C debugger. The problem with malloc and free is that they are a foundation for other common C libraries. For instance, printf may allocate memory to interpret a pattern. Where is this memory going to come from? And what will happen if printf calls malloc, which calls printf, which calls malloc, ...?

For some comic relief during this assignment, Professor James Mickens has thoroughly bemoaned this aspect of systems work in this article (e.g., "I HAVE NO TOOLS BECAUSE I'VE DESTROYED MY TOOLS WITH MY TOOLS.").

Core assignment

You will first complete the implementation of malloc. We will begin by explaining some of the data structures provided to you for organization, and then sketch the procedure to follow.

Data Structures

Like Hoard, we will allocate objects from a superblock. In th_alloc, a superblock is one 4KB page, divided into equal-sized objects. The space for the first object in each page is reserved for some bookkeeping (see the struct superblock_bookkeeping). The rest of the space in the page will be kept on a free list (superblock->bkeep->free_list), until a call to malloc() removes it from the free list and returns it. A call to free() will return this object to the free list.

Each superblock only allocates objects of a given power of two. We will support object allocations up to 2KB; any allocation smaller than 32 bytes will be rounded up to 32 bytes. Any allocation that is not a power of two will be rounded up to the next-largest power of two.

Thus, we will have a pool of superblocks for each power of two, from 32 to 2048 (see the levels array in the code). Each level tracks the total number of free objects, how many superblocks have no allocated objects (for Exercise 4), and a pointer to a list of superblocks.
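As a quick check on the arithmetic above, here is a minimal sketch of how many usable objects a superblock holds at each level. It assumes the bookkeeping fits in the space of one object, as described above. The function and macro names are made up for illustration and are not part of the skeleton.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SUPER_BLOCK_SIZE 4096 /* one 4KB page per superblock */
#define SKETCH_MIN_ALLOC 32          /* object size at level 0 */
#define SKETCH_MAX_LEVEL 6           /* 32 << 6 == 2048 */

/* Hypothetical helper: object size served by a given level. */
size_t sketch_level_object_size(int level)
{
    assert(level >= 0 && level <= SKETCH_MAX_LEVEL);
    return (size_t)SKETCH_MIN_ALLOC << level;
}

/* Hypothetical helper: objects available to malloc in one superblock.
 * The first object-sized slot is reserved for bookkeeping. */
size_t sketch_objects_per_superblock(int level)
{
    return SKETCH_SUPER_BLOCK_SIZE / sketch_level_object_size(level) - 1;
}
```

Under these assumptions, a level-0 superblock offers 127 objects of 32 bytes, while a 2048-byte superblock offers only one.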

Initially, your allocator will have no superblocks allocated. So, a considerable part of your malloc implementation will involve populating the superblock pools, as well as traversing them.

Note: This lab will require some clever use of casting and pointer math. This is fundamental to taking a blob of memory and turning it into the subdivided, typed objects you are used to using. We have tried to give you some helper functions and boilerplate code for the nastier bits of this pointer-casting melee, but feel free to ask questions on Piazza or during office hours if you do not understand the provided code.

Implementing Malloc

Exercise 1. (9 points) Complete the implementation of malloc. Write at least 5 test cases for malloc(), as well as however many additional unit tests within malloc you consider appropriate.

For the moment, you do not need to implement free().

The first step you need to do is complete the size2level function, which converts an allocation size to the correct power of two, and then maps it onto the correct index into the levels table. Important note: remember that level 0 is 32.
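One possible shape for this mapping is sketched below. It is only an illustration: the name sketch_size2level, the macros, and the choice to return -1 for oversized requests are assumptions, and the skeleton's size2level may use a different signature or error convention.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MIN_ALLOC 32   /* level 0 serves 32-byte objects */
#define SKETCH_MAX_ALLOC 2048 /* largest supported object size */

/* Hypothetical helper: level index for a request of `size` bytes,
 * or -1 if the request is larger than any level can serve. */
int sketch_size2level(size_t size)
{
    size_t object_size = SKETCH_MIN_ALLOC;
    int level = 0;

    if (size > SKETCH_MAX_ALLOC)
        return -1;

    /* Double the object size until it can hold the request. */
    while (object_size < size) {
        object_size <<= 1;
        level++;
    }
    assert(object_size >= size && object_size <= SKETCH_MAX_ALLOC);
    return level;
}
```

Boundary values make good first test cases: 32 and 33 should land in different levels, as should 1024 and 1025.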

The second step will be to complete the alloc_super helper function. You will need to allocate one 4KB page of anonymous memory. We provide code to put this superblock on the superblock pool for the correct level, and initialize the free list. You will need to make sure the bookkeeping for the free object counts is correct, and, in general, make sure all of the bookkeeping fields are appropriately initialized.

We initialize the free list for you. Each object in the superblock is actually represented with a union of a singly-linked list pointer and a raw data pointer. What is going on here is that the free list is actually stored inside the free object, saving a bit of space. You will want to understand that code, so that you can complete the final step.
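To make the idea of a free list stored inside the free objects concrete, here is a small standalone sketch. The union and function names are invented for illustration and do not match the skeleton's definitions; the buffer is assumed to be aligned for pointers and the object size to be at least the size of a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative object layout (not the skeleton's type). */
union sketch_object {
    union sketch_object *next; /* link, meaningful only while the object is free */
    char raw[1];               /* start of the caller's data once allocated */
};

/* Thread a free list through `count` objects of `size` bytes stored
 * back to back in `buf`, and return the head of the list. */
union sketch_object *sketch_init_free_list(void *buf, size_t size, size_t count)
{
    union sketch_object *head = NULL;
    char *base = buf;

    assert(size >= sizeof(union sketch_object));

    /* Walk backwards so the finished list is in address order. */
    for (size_t i = count; i > 0; i--) {
        union sketch_object *obj =
            (union sketch_object *)(base + (i - 1) * size);
        obj->next = head;
        head = obj;
    }
    return head;
}
```

No separate list nodes are allocated: each free object's first bytes hold the pointer to the next free object, and those bytes become ordinary user data once the object is handed out.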

The final step is to complete the malloc implementation. Here, you will need to remove an entry from the free list, as well as decrement any appropriate counters. One important note: if you take the first object from an otherwise completely free superblock, decrement the whole_superblocks count. This will be useful for Exercise 4.
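The removal step can be sketched as below, under simplifying assumptions. The struct layouts and every field name other than free_list and whole_superblocks are hypothetical, and the caller is assumed to pass in the number of usable objects in a completely free superblock.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_object { struct sketch_object *next; };

struct sketch_bkeep {
    struct sketch_object *free_list; /* head of this superblock's free list */
    size_t free_count;               /* free objects in this superblock */
};

struct sketch_level {
    size_t free_objects;      /* free objects across the whole level */
    size_t whole_superblocks; /* superblocks with no allocated objects */
};

/* Remove and return the first free object, or NULL if none is free.
 * `capacity` is the object count of a completely free superblock. */
void *sketch_pop_free(struct sketch_level *lvl, struct sketch_bkeep *bk,
                      size_t capacity)
{
    struct sketch_object *obj = bk->free_list;

    if (obj == NULL)
        return NULL;
    assert(bk->free_count > 0 && lvl->free_objects > 0);

    /* The superblock stops being "whole" once its first object leaves. */
    if (bk->free_count == capacity)
        lvl->whole_superblocks--;

    bk->free_list = obj->next;
    bk->free_count--;
    lvl->free_objects--;
    return obj;
}
```

The whole_superblocks check has to happen before free_count is decremented, otherwise the comparison against capacity never succeeds.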

At this point, you should have a working malloc. We strongly recommend testing what you have up to this point thoroughly before moving on.

Implementing Free

Exercise 2. (6 points) Complete the implementation of free(), and write at least 5 additional tests (for a total of 10 test cases).

The main job of free() is to place an object back on the free list. We have again provided a helper function (obj2bkeep()) that does the pointer math to map an object back to its superblock. The key idea is that we place the bookkeeping at the beginning of the page, so we just need to drop the "low order" bits of the object's address to get the starting address of the page.
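The bit-dropping trick can be illustrated in isolation as follows. This is a stand-in written for explanation, not the provided obj2bkeep(); the names are hypothetical, and it assumes superblocks are 4KB and page-aligned.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SUPER_BLOCK_SIZE 4096
#define SKETCH_SUPER_BLOCK_MASK (~((uintptr_t)SKETCH_SUPER_BLOCK_SIZE - 1))

/* Hypothetical stand-in for obj2bkeep(): clear the low 12 bits of the
 * address to find the start of the 4KB page that contains `obj`. */
void *sketch_page_start(void *obj)
{
    void *start = (void *)((uintptr_t)obj & SKETCH_SUPER_BLOCK_MASK);

    /* The object must lie within the page we computed. */
    assert((uintptr_t)obj - (uintptr_t)start < SKETCH_SUPER_BLOCK_SIZE);
    return start;
}
```

For example, an object at address 0x12345678 belongs to the page starting at 0x12345000. This only works because each superblock is exactly one page and the bookkeeping sits at offset 0.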

In addition to adding the object back to the free list, you will also need to be sure to update the statistics in the superblock and level about how many objects are free and how many superblocks have all objects free.

Memory Poisoning

An abundantly common source of errors is dangling application pointers. These errors include using an object that has been freed, or using a field of an object without properly initializing it (inadvertently using an old value).

One common debugging technique to combat these types of errors is called memory poisoning. The basic idea is to fill newly-allocated, or newly-freed objects with a value that is unlikely to be mapped (other than 0). If this value is read, you will get a page fault; you can determine the type of bug by looking at the faulting address. If it looks like a poison value, you know you tried to follow a pointer in a freed data structure.

Exercise 3. (3 points) Implement memory poisoning in malloc and free, using the FREE_POISON and ALLOC_POISON patterns. Hint: check out memset(). These patterns should not just be written to a single byte, but to all bytes in the free area (except for the free list pointer).

Be sure to write some test cases for poisoning.
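A minimal sketch of the two poisoning operations is shown below. The byte values and all names here are placeholders chosen for illustration; use the FREE_POISON and ALLOC_POISON definitions from the skeleton. The sketch also assumes the free list pointer occupies the first sizeof(void *) bytes of a free object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder patterns (assumed values, not taken from the skeleton). */
#define SKETCH_FREE_POISON  0xab
#define SKETCH_ALLOC_POISON 0xcd

/* Fill every byte of a newly allocated object with the alloc pattern. */
void sketch_poison_alloc(void *obj, size_t size)
{
    memset(obj, SKETCH_ALLOC_POISON, size);
}

/* Fill a freed object with the free pattern, skipping the leading bytes
 * that hold the free list pointer. */
void sketch_poison_free(void *obj, size_t size)
{
    assert(size >= sizeof(void *));
    memset((char *)obj + sizeof(void *), SKETCH_FREE_POISON,
           size - sizeof(void *));
}
```

A poisoning test can allocate an object, check that every byte matches the alloc pattern, then free it and inspect the bytes after the list pointer.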

Returning Pages to the OS

Sometimes programs allocate many objects, then free many objects, but continue executing. In these cases, a well-behaved allocator will eventually return superblocks to the OS. Here, we define a number of free superblocks to keep (2, or RESERVE_SUPERBLOCK_THRESHOLD). Your job will be to extend free() to release superblocks if more entire superblocks are available than this threshold.

Exercise 4. (7 points) When more than 2 superblocks are completely free, release them back to the OS, using munmap, and update the appropriate bookkeeping in your allocator.

Be sure to write a test case or two that ensure that superblocks are actually being freed when appropriate, and that only free memory is being unmapped.

Challenge! (2 bonus points) Handle allocations larger than 4KB. Our advice would be to simply mmap the regions. The hard part is figuring out, on free, which regions are which, and unmapping on a large free.

Challenge! (5 bonus points) Implement thread safety, by locking the relevant data structures during allocation and free. This will be very challenging, as we have not covered this in class. Consider reserving this challenge for after lab 3. In short, though, the tricky issue is that you may not want to use libpthread (to avoid circular dependencies), but write your own spinlock.
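As a starting point for the spinlock idea, here is a minimal sketch built on C11 atomics rather than libpthread. This is one possible approach, not a prescribed one; the type and function names are hypothetical, and it assumes a compiler that provides <stdatomic.h>.

```c
#include <stdatomic.h>

/* Hypothetical spinlock type; the flag is set while the lock is held. */
typedef struct {
    atomic_flag held;
} sketch_spinlock;

#define SKETCH_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock; returns 1 on success, 0 if already held. */
int sketch_spin_trylock(sketch_spinlock *l)
{
    /* test_and_set returns the previous value of the flag. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
void sketch_spin_lock(sketch_spinlock *l)
{
    while (!sketch_spin_trylock(l))
        ; /* spin */
}

/* Release the lock so another thread can take it. */
void sketch_spin_unlock(sketch_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A lock like this would wrap the sections of malloc and free that touch the levels array and the free lists. It never calls malloc itself, which sidesteps the circular dependency mentioned above.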

Challenge! (5+ bonus points) Implement additional allocator functions, such as calloc, realloc, and friends. Credit varies based on how many additional functions are implemented, how clean the implementations are, and how well-tested.
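For instance, calloc can be layered on top of malloc. The sketch below is one plausible way to do it; the function is named sketch_calloc so that it can be compiled and tested without replacing the real calloc, and it calls whichever malloc is linked in.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical calloc built on malloc: reject requests whose total size
 * would overflow, then zero the returned memory. */
void *sketch_calloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL; /* nmemb * size would overflow size_t */

    size_t total = nmemb * size;
    void *p = malloc(total);

    /* With memory poisoning enabled, malloc'ed memory is not zero,
     * so the explicit memset is required. */
    if (p != NULL)
        memset(p, 0, total);
    return p;
}
```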

Coda

If your malloc implementation is robust, you should be able to preload it with real applications. Many applications may allocate larger regions of memory, use unimplemented functions (like realloc), or use threads. It is ok if this doesn't work for all applications, but you may want to try your malloc with a few simple command-line utilities, just for the pride of accomplishment.

Hand-In Procedure

Type make handin in the lab directory. You may submit more than once; if you do this, the most recent submission will be graded. The way submission works is basically that you create a tag in git, which is then pushed to github. If you look at your lab page on github, you will see a pulldown list with "Branch: master". If you drop this list down, you will see an option to view tags. If you choose the tag handin, you will be able to view your submitted code and confirm that it is correct.

All programs will be tested on classroom.cs.unc.edu. All programs, unless otherwise specified, should be written to execute in the current working directory. Your correctness grade will be based solely on your program's performance on classroom.cs.unc.edu. Make sure your programs work on classroom!

Generally, unless the homework assignment specifies otherwise, you should compile your program using the provided Makefile (e.g., by just typing make on the console). Do not add any special command line arguments ("flags") or compiler options to the Makefile.

The program should be neatly formatted (i.e., easy to read) and well-documented. In general, 75% of your grade for a program will be for correctness, and 25% for "programming style" (appropriate use of language features [constants, loops, conditionals, etc.], including variable/procedure/class names) and documentation (descriptions of functions, general comments [problem description, solution approach], use of invariants, pre- and postconditions where appropriate).

Make sure you put your name(s) in a header comment in every file you submit. Any de-anonymizing information, such as your name, should be enclosed in redacted comments (wrapped in @* *@).