CSE 306: Lab 4: Kernel Rootkit

Due on Sunday, May 10, 2015, 11:59 PM
Note: You may use your remaining late hours on this lab.

Introduction

This lab will introduce you to Linux kernel programming and file system issues by implementing a simple unioning and copy-on-write file system overlay. This lab is designed to be a bit more open-ended, so you may add other features for a to-be-determined amount of extra credit.

The course staff have provided you with a simple Linux file system stub as a starting point. Your task will be to take several steps to add features to this simple file system.

Getting Started

We will provide you with some initial source code to start from. To fetch that source, use Git to commit your Lab 3 source, fetch the latest version of the course repository, and then create a local branch called lab4 based on our lab4 branch, origin/lab4:

The git checkout -b command shown above actually does two things: it first creates a local branch lab4 that is based on the origin/lab4 branch provided by the course staff, and second, it changes the contents of your lab directory to reflect the files stored on the lab4 branch. Git allows switching between existing branches using git checkout branch-name, though you should commit any outstanding changes on one branch before switching to a different one.

You will now need to merge the changes you made in your lab3 branch into the lab4 branch, with the git merge lab3 command.

In some cases, Git may not be able to figure out how to merge your changes with the new lab assignment (e.g. if you modified some of the code that is changed in the fourth lab assignment). In that case, the git merge command will tell you which files are conflicted, and you should first resolve the conflict (by editing the relevant files) and then commit the resulting files with git commit -a.

Sharing code with a partner

Important: In this lab, you may work in teams of up to 4 students. We recommend creating larger teams so that you can help each other find your way around the OS kernel. This lab is challenging, and you will benefit from working on larger teams.

Unless we hear otherwise from you, we will assume you are working with the same partner as lab 3. You are welcome to change partners if you like; if you do, please email the course staff immediately to change permissions on your repositories.

We will set up group permission to one team member's git repository on scm. Suppose Partner A is the one handing in the code. Partner A should follow the instructions above to merge the lab4 code. After Partner A has pushed this change to scm, Partner B should simply clone Partner A's repository and use it. For example:

Note that after you let the course staff know your partner selection, it may take a few days for the tech staff to apply these permission changes. Again, you are not required to use git to coordinate changes, only to hand in the assignment, but we recommend you learn to use git. You may use any means you like to share code with your partner.

Hand-In Procedure

When you are ready to hand in your lab code and write-up, create a file called slack.txt noting how many late hours you have used both for this assignment and in total. (This is to help us agree on the number that you have used.) This file should contain a single line formatted as follows (where n is the number of late hours):

Then run make handin in the labs directory. If you submit multiple times, we will take the latest submission and count late hours accordingly.

In this and all other labs, you may complete challenge problems for extra credit. If you do this, please create a file called challenge.txt, which includes a short (e.g., one or two paragraph) description of what you did to solve your chosen challenge problem and how to test it. If you implement more than one challenge problem, you must describe each one.

This lab does not include any questions for you to answer, but you should document your design in the README file.

For Your Safety

Modifying the OS kernel on your system can lose all data on the system! If you introduce a null pointer in a regular program, it crashes and loses all of its data; the same is true of an OS kernel. If you introduce a bug in the OS, it will crash. When an OS crashes, it can corrupt the file system and lose all of your data (but we hope it won't). Thus, it is essential that you do two things to protect yourself.

Snapshot your VM before you start. This can be done through the vSphere client---there is a button to take a snapshot and roll back a snapshot of the VM. Note that this will not save your changes, but will allow you to recreate a corrupted VM on your own (rather than waiting for the system administrator to do this).

Push your code to another machine before testing. Before you install and test kernel code, be sure to use git to commit and push your code to another machine (e.g., scm). That way, if the file system is corrupted, you don't lose your work. If you don't want to inflict untested code on your teammates, create a branch in git.

The Linux VFS

The Linux Virtual File System (or VFS) implements common file system calls, such as open, unlink and read. You can think of the VFS as an abstract class that implements common routines, but calls low-level hook functions. In other words, a file system like ext4 implements some hooks, such as ext4_unlink, which is called by the VFS during an unlink system call that cannot be serviced from cache. We have provided you with a simple, memory-only file system that includes some stubs and some implementation for the hooks you will need. Your job will be to extend these basic hooks with more functionality.
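The abstract-class analogy can be sketched in plain user-space C. This is only a toy model, not kernel code: toy_fs_ops, toy_vfs_unlink, memfs_unlink, and memfs_ops are hypothetical names invented for illustration, and the real VFS hook tables have different members and signatures.

```c
#include <errno.h>
#include <stddef.h>

/* Toy model of a per-file-system hook table (hypothetical names;
 * the real kernel tables look different). */
struct toy_fs_ops {
    int (*unlink)(const char *name);   /* NULL if the hook is missing */
};

/* Generic layer: common checks first, then dispatch to the hook. */
int toy_vfs_unlink(const struct toy_fs_ops *ops, const char *name)
{
    if (name == NULL || name[0] == '\0')
        return -EINVAL;                /* shared argument validation */
    if (ops == NULL || ops->unlink == NULL)
        return -EPERM;                 /* file system lacks this hook */
    return ops->unlink(name);          /* file-system-specific work */
}

/* One concrete "file system" that only counts calls to its hook. */
int memfs_unlink_calls;

int memfs_unlink(const char *name)
{
    (void)name;
    memfs_unlink_calls++;
    return 0;
}

const struct toy_fs_ops memfs_ops = { .unlink = memfs_unlink };
```

The generic function never needs to know which file system it is talking to; it only follows the function pointer, which is the role the hooks play in this lab.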

The VFS also caches the results of common hooks, such as read and lookup (checking if a file path exists or not). A call like read may only call the ext4_read call on a cache miss.

You will primarily interact with the first three objects, and most hooks will take these objects as arguments and return values.

Finally, one hook worth explaining is ioctl. Ioctl is a "kitchen-sink" call, which takes an opcode as an argument. Essentially, ioctl can be used like a second system call table, to implement arbitrary operations. We will (ab)use ioctl to manage our file system.
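A minimal user-space sketch of the "second system call table" idea follows. The opcodes (TOY_CMD_ADD and friends), struct toy_state, and toy_ioctl are all hypothetical; the real WolFS command numbers and handler signature come from the provided skeleton and are not reproduced here.

```c
#include <errno.h>

/* Hypothetical opcodes, for illustration only. */
enum toy_cmd { TOY_CMD_ADD = 1, TOY_CMD_REMOVE = 2, TOY_CMD_COUNT = 3 };

struct toy_state { long entries; };

/* One entry point, many operations: the opcode selects the action. */
long toy_ioctl(struct toy_state *st, unsigned int cmd, long arg)
{
    switch (cmd) {
    case TOY_CMD_ADD:
        if (arg < 0)
            return -EINVAL;
        st->entries += arg;
        return 0;
    case TOY_CMD_REMOVE:
        if (arg < 0 || arg > st->entries)
            return -EINVAL;
        st->entries -= arg;
        return 0;
    case TOY_CMD_COUNT:
        return st->entries;
    default:
        return -ENOTTY;   /* conventional error for an unknown ioctl */
    }
}
```

Each case plays the role of one "system call"; adding an operation means adding an opcode and a case, without touching the real system call table.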

Unioned File System Views

Most file systems store data directly, such as on a disk, or send data over the network to a remote server. However, we can be more creative.

You will implement a file system view, which aggregates files from other file systems and directories, and makes them look like they are in one big directory together. For instance, if I union /foo and /bar into directory /boo, when I ls /boo, I should see the contents of both directories.
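The /foo plus /bar example can be modeled in user space as merging two lists of names. The sketch below is an illustration only: union_listing and merge_names are hypothetical helpers, and the policy for a name that appears in both directories (the first occurrence wins) is an assumption, since the handout leaves that choice to you.

```c
#include <stddef.h>
#include <string.h>

/* Append src[0..n_src) to out, skipping names already present.
 * Returns the new number of entries in out (never more than cap). */
static size_t merge_names(const char **out, size_t n_out, size_t cap,
                          const char *const *src, size_t n_src)
{
    for (size_t i = 0; i < n_src && n_out < cap; i++) {
        int seen = 0;
        for (size_t j = 0; j < n_out; j++) {
            if (strcmp(out[j], src[i]) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            out[n_out++] = src[i];
    }
    return n_out;
}

/* Build the unioned listing of directories a and b into out.
 * Assumed policy: on a name collision, the entry from a is kept. */
size_t union_listing(const char *const *a, size_t na,
                     const char *const *b, size_t nb,
                     const char **out, size_t cap)
{
    size_t n = merge_names(out, 0, cap, a, na);
    return merge_names(out, n, cap, b, nb);
}
```

Whatever collision policy you choose in your real implementation, document it in the README.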

Why do this? Well, because we can. And perhaps because we want to sandbox some code to a single directory with a utility like chroot, and have everything that code needs in one directory.

Copy-on-Write Overlay

Initially, all changes within the view will pass through to the original file system. In the context of sandboxing dodgy code, we may want to make the view copy-on-write, where all changes are tracked, but only in RAM. Once the overlay is unmounted, the changes are destroyed, rather than written to disk.

Building the skeleton code

In the lab4 directory, type make to build the wolfs.ko kernel module. You can load the kernel module by typing sudo insmod ./wolfs.ko; the module can be unloaded by typing sudo rmmod wolfs. You can view all loaded modules by reading the output of the lsmod command. You can confirm that the wolfs was loaded by checking the output of dmesg:

Our WolFS will also include utilities that will add and remove paths (files or directories) from the unioned view, using ioctl.

You will not need to make the unioned view persistent. In other words, the unioned view will need to be created after each reboot. Changes made in copy-on-write mode can be discarded once the file system is unmounted; changes made in non-copy-on-write mode should persist to the original file system normally.

Understanding the Skeleton Code

All kernel modules provide an init and exit method, which are called when the module is loaded and unloaded, respectively. This is currently all the wolfs includes; many modules will create devices or register other callbacks.

The wolfs also provides examples of how to find kernel functions that are not exported. Some functions are exported explicitly as symbols: a module can simply call these.

In other cases, functions are only meant to be called within the kernel. In these situations, we use kallsyms_lookup_name to find the address of the function, and cast it to the appropriate function pointer type. For instance, the provided code has an example of how to find the kernel-private function unmap_page_range.

Note: You probably will not need the particular functions we provide---don't feel like you are doing anything wrong if you don't use them. These are provided only as examples of how to find functions you may need.

Helpful Resources

The best resource for finding kernel functions is the Linux Cross-Reference (LXR), located at http://lxr.free-electrons.com/. This site includes a number of useful features that can help you find your way through the source code.

These books (available through the campus Safari Online subscription) are also helpful references for understanding Linux kernel code:

Debugging the kernel

Attaching a debugger to a running kernel is tricky, especially if you try to run the debugger on the same machine! But it is possible. We will give you a few tips that can help you when printk and intuition aren't enough.

Installing vmlinux

The first thing you will need is an uncompressed vmlinux file (your VM is actually booting a vmlinuz file, which is a compressed kernel image). Issue the following commands to get an uncompressed kernel image with debugging symbols:

Once this is finished, you should see a file in /usr/lib/debug/boot such as vmlinux-3.2.0-40-generic-pae, in addition to the similarly named vmlinuz files.

Attaching to the running kernel.

You can inspect variable values on your running kernel using the command below (substituting the version of the running kernel as appropriate). However, this approach will not let you set breakpoints; it only lets you inspect values.

Running your kernel in a system emulator

You can also run your kernel/wolfs inside of qemu (or another emulator), and attach gdb to the emulator. Qemu emulates x86 hardware, and can run a complete OS kernel. Qemu can also export the gdb protocol over a network socket, allowing more access to the running kernel.

Download a disk image here, which is a file that qemu will treat like a disk. The disk image contains a simple Ubuntu file system (very similar to your VM in many respects). You can mount the disk image and add files to it using commands as below:

At this point, a window should pop up. If a window does not appear, or you get X errors, log into your VM again, being sure to provide the '-X' option to ssh.

The window will say that qemu is stopped. It is waiting for gdb to attach to it and continue. Start gdb in another window as below:

From here, you should be able to set breakpoints, inspect memory, break execution, etc. You can type c to continue execution.

Core Assignment

Basic Unioning Support.

When you initially load and mount a WolFS file system, it will be empty. We need to implement support in WolFS to add directories and files of other file systems to the volume.

We provide two user utilities that will issue ioctl system calls to WolFS, as well as stub handlers to catch these ioctl calls in WolFS. Your job is to implement supporting data structures to keep track of these paths, and integrate them with the rest of the VFS hooks.

As far as an implementation strategy, we recommend starting with just the internal bookkeeping on this exercise, and adding the VFS-level hooks in the next exercise. In general, we recommend creating dentries for all of the files under the directory, but adding the DCACHE_OP_REVALIDATE flag, which gives WolFS a chance to double check that the file is still in the remote directory.

Exercise 1. (15 points) Implement the missing ioctl calls to add, remove, and list the directories in the unioned view. The listing ioctl need only print to the kernel log (using printk). For now, it is sufficient to test the basic bookkeeping; you will implement the VFS hooks in the next exercise.
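One possible shape for the bookkeeping, modeled in user space so that it can be compiled and tested outside the kernel, is sketched below. Everything here is an assumption for illustration: the names (view_table, view_add, view_remove, view_list), the fixed-size array, and the error codes are not part of the provided skeleton. In-kernel code would more likely use dynamically allocated entries, appropriate locking, and printk instead of printf.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define VIEW_MAX_PATHS 16     /* arbitrary limit for this sketch */
#define VIEW_PATH_LEN  256

struct view_table {
    char paths[VIEW_MAX_PATHS][VIEW_PATH_LEN];
    int  count;
};

/* Return the index of path in the table, or -1 if it is absent. */
static int view_find(const struct view_table *t, const char *path)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->paths[i], path) == 0)
            return i;
    return -1;
}

int view_add(struct view_table *t, const char *path)
{
    if (strlen(path) >= VIEW_PATH_LEN)
        return -ENAMETOOLONG;
    if (view_find(t, path) >= 0)
        return -EEXIST;
    if (t->count == VIEW_MAX_PATHS)
        return -ENOSPC;
    strcpy(t->paths[t->count++], path);
    return 0;
}

int view_remove(struct view_table *t, const char *path)
{
    int i = view_find(t, path);
    if (i < 0)
        return -ENOENT;
    /* Fill the hole with the last entry; order is not preserved. */
    t->count--;
    if (i != t->count)
        strcpy(t->paths[i], t->paths[t->count]);
    return 0;
}

/* Stand-in for the listing ioctl; kernel code would call printk. */
void view_list(const struct view_table *t)
{
    for (int i = 0; i < t->count; i++)
        printf("wolfs view: %s\n", t->paths[i]);
}
```

Prototyping the data structure this way lets you check the add, remove, and list logic before risking a kernel crash.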

Robust file system command support.

The next step is to actually implement common file system calls, such as open, read, and write. We have provided a number of hook functions for you to fill in or extend, such as wolfs_setattr.

In the current exercise, any of these operations should be applied to the original files: a write to a file in WolFS should write to the original file; unlinking the file from WolFS should unlink the file from the original file system.

Note: For this and the remaining exercises, feel free to be creative and implement each task how you think best. The handout will have suggestions, but there may be multiple approaches that will work. Be sure to document your approach in the README file.

Exercise 2. (40 points) Implement the wolfs_readdir and wolfs_lookup hooks. The output of ls should make sense, and you should be able to open files. When a file is opened through WolFS, the same permissions in WolFS, not the original file system, should be checked. Be sure to test both read and read-write mode.

Exercise 3. (45 points) Implement unlink, chmod, and the other missing metadata stub functions. For now, changes to permissions, removed files, and other changes should be reflected in the original file system. Explore all reasonable shell behavior and make sure everything works.
Note that you may not need every provided hook; for each provided hook (prefixed with wolfs_), figure out its role in the VFS and decide whether you need it. Although we tried to provide stubs for all likely metadata hooks, we may have missed something, so do check that command-line functionality really works---don't simply stop once all stubs are filled in.

Copy-on-write view.

In the final, challenge exercise, we will implement a copy-on-write option (specified at mount time with -o cow). This is optional, but recommended and worth a fair bit of extra credit. In copy-on-write mode, any changes are reflected in a private, temporary copy. For instance, if I create a new file in COW mode, the file is not present on the original file system. If I write to a file, the writes are not applied to the original file system, but can be re-read. If I unlink a file in COW mode, the file is not unlinked from the original file system, but is not visible to subsequent ls calls. In fact, a robust implementation would let me create a different copy of the unlinked file.

None of the COW view need be persistent. After you unmount, all changes may be discarded.

Hint: The COW view can be created either lazily (as you access things) or eagerly (one big copy at mount time). As long as the OS doesn't run out of memory, either is ok. Our advice is to take the lazy approach and not worry too much about consistency with concurrent changes to the original file system.

Hint: Consider making copies of the inodes (and contents) and removing the revalidate flag once a file is modified.
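The lazy approach from the hints can be illustrated with a user-space model of a single file's contents. This is a sketch under stated assumptions, not the required design: struct cow_file and the cow_* functions are hypothetical names, files cannot grow, and the private copy lives in heap memory. The point is only that reads go to the original until the first write, after which they go to a private copy that is thrown away on "unmount".

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct cow_file {
    const char *orig;          /* original contents, never modified */
    size_t      len;
    char       *private_copy;  /* NULL until the first write */
};

void cow_init(struct cow_file *f, const char *orig, size_t len)
{
    f->orig = orig;
    f->len = len;
    f->private_copy = NULL;
}

/* Readers see the private copy if one exists, else the original. */
const char *cow_data(const struct cow_file *f)
{
    return f->private_copy ? f->private_copy : f->orig;
}

/* Copy lazily on the first write, then modify only the copy. */
int cow_write(struct cow_file *f, size_t off, const char *src, size_t n)
{
    if (off > f->len || n > f->len - off)
        return -EINVAL;        /* this sketch does not grow files */
    if (f->private_copy == NULL) {
        char *copy = malloc(f->len ? f->len : 1);
        if (copy == NULL)
            return -ENOMEM;
        memcpy(copy, f->orig, f->len);
        f->private_copy = copy;
    }
    memcpy(f->private_copy + off, src, n);
    return 0;
}

/* Models unmount: all private changes are discarded. */
void cow_discard(struct cow_file *f)
{
    free(f->private_copy);
    f->private_copy = NULL;
}
```

An eager design would instead make every copy at mount time; the lazy version only pays for files that are actually modified.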

Challenge Exercise 6. (25 points - optional) Implement COW mode. We have provided support for the mount option and a global flag (COW mode). You need to modify your other hook implementations to use a copy-on-write view.

This completes the lab. Make sure you rigorously test your code, document the design well, and hand in your work with make handin.