Project 3 - Thread Library

Assigned:	Monday, February 26
Due Date:	Friday, April 6, 11:59 pm
Collaboration Policy:	Level 1
Group Policy:	Pair-optional (but recommended!)

In the previous project, you used a provided user-level thread library to implement a concurrent disk scheduler. In this project, you will implement the thread library itself! Your thread library will provide exactly the same API that you used in your disk scheduler, and will allow you to run any concurrent program that uses this API (including your disk scheduler).

Implementing the thread library is considerably more complex than writing the disk scheduler. Additionally, this project will require you to write and submit tests along with your library itself. Plan appropriately and start early!

Project Overview

Your task is to implement the full thread library API as detailed in Project 2. In particular, you must write each of the thread_ operations defined in the thread.h header used by all clients of the thread library.

Note that while you are implementing the thread library itself, you will still be provided with the functional interrupt library. This means that you will not need to implement start_preemptions, as this function is implemented within the interrupt library (even though it is also part of the user-facing thread API). Your thread library implementation will also make use of two other functions provided by the interrupt library that are not part of the user-facing API (interrupt_enable and interrupt_disable) to facilitate atomicity.

One of the challenges of this project will be understanding the representation of threads and the infrastructure provided by Linux to support user-level thread libraries. Other challenges you will need to tackle include providing appropriate atomicity within the thread library and ensuring robustness against arbitrary (and potentially invalid) usage by clients of the library.

Specification

In addition to implementing the publicly-defined interface of thread.h, your library should observe the following specifications.

Scheduling Order

Your library should follow these rules when deciding how to order threads:

All scheduling queues should be FIFO. This includes the ready queue, the queue of threads waiting for a monitor lock, and the queue of threads waiting for a condition variable signal. Locks should be acquired by threads in the order in which the locks are requested (by thread_lock or in thread_wait).
When a thread calls thread_create, the caller does not yield the CPU. The newly created thread is put on the ready queue but is not executed right away.
When a thread calls thread_unlock, the caller does not yield the CPU. The woken thread (if one exists) is put on the ready queue but is not executed right away.
When a thread calls thread_signal or thread_broadcast, the caller does not yield the CPU. The woken thread(s) are put on the ready queue but not executed right away. These threads will request the lock when they next run.
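
The ordering rules above can be illustrated with a small model. This is only a sketch under stated assumptions: threads are represented by integer ids, and model_create, model_yield, and fifo_demo are hypothetical names that are not part of thread.h. It shows that creating a thread appends it to the back of the ready queue while the caller keeps running, and that yielding always runs the thread at the front.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical model: threads are plain integer ids, not real contexts.
static std::deque<int> ready;      // front = next thread to run
static int running = 0;            // id of the thread currently on the CPU
static std::string run_order;      // ids in the order they were scheduled

// Creating a thread only enqueues it; the caller does not yield.
static void model_create(int id) {
    ready.push_back(id);
}

// Yielding moves the caller to the back and runs whoever is at the front.
static void model_yield() {
    ready.push_back(running);
    running = ready.front();
    ready.pop_front();
    run_order += std::to_string(running);
}

// Thread 0 creates threads 1 and 2, then every thread yields once.
static std::string fifo_demo() {
    ready.clear();
    running = 0;
    run_order = "0";
    model_create(1);
    model_create(2);   // thread 0 is still running here
    model_yield();     // 0 -> back, 1 runs; ready holds 2, 0
    model_yield();     // 1 -> back, 2 runs; ready holds 0, 1
    model_yield();     // 2 -> back, 0 runs; ready holds 1, 2
    return run_order;
}
```

In this model fifo_demo() returns "0120": the creator runs first, then the created threads in creation order, then the creator again.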

Termination

When there are no runnable threads in the system (e.g. all threads have finished, or all threads are deadlocked), your thread library should execute the following code to terminate the program:

    cout << "Thread library exiting.\n";
    exit(0);

Note that this message is the only output the thread library itself should ever produce.

Error Handling

The thread library API defines all functions to return 0 on success (except for thread_libinit) and -1 on failure. Your functions must be as robust as possible, and should handle every possible error without crashing.

This specification intentionally does not provide you with an exhaustive list of errors you should handle. OS programmers must have a healthy sense of paranoia to make their system robust, so part of this project is thinking of and handling lots of different error types. A few types of errors are not possible to handle due to the thread library existing in userspace (thus, for example, the user program could corrupt the memory used by the thread library). However, most types of errors should be gracefully caught by the library.

Certain behaviors might or might not be considered errors from the library's perspective. Here is a list of questionable behaviors that your library should not consider to be errors:

Signaling a CV without holding the monitor lock (while discouraged, doing so is generally not considered an error in Mesa semantics)
Exiting a thread while holding a lock (the thread should keep the lock)
Deadlock

Questionable behaviors that should be considered errors include:

Acquiring a lock you already own
Initializing the thread library more than once

Ask if you're unsure about whether any other specific behaviors should be considered errors. Note that errors can arise not only from invalid use of the library but also due to other factors (such as running out of memory). Some possible sources of errors that you might otherwise overlook are highlighted in the implementation advice section.

Remember that errors should be handled silently, returning -1 but not producing any output.
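
As a rough illustration of silent error detection, the sketch below checks for two of the errors listed above (double initialization and re-acquiring a lock you own) plus releasing a lock you do not own. It is a simplified model, not a prescribed design: example_init, validate_lock, validate_unlock, and the lock_owner map are hypothetical names, and threads are identified by plain integers.

```cpp
#include <cassert>
#include <map>

// Hypothetical bookkeeping; none of these names come from thread.h.
static bool initialized = false;
static std::map<unsigned int, int> lock_owner;   // lock id -> owning thread id

// Returns 0 the first time and -1 on any later call, with no output.
static int example_init() {
    if (initialized) return -1;   // initializing twice is an error
    initialized = true;
    return 0;
}

// Returns -1 if thread `self` already owns `lock`, 0 otherwise.
// A request for a lock held by another thread is valid (it would block).
static int validate_lock(unsigned int lock, int self) {
    auto it = lock_owner.find(lock);
    if (it != lock_owner.end() && it->second == self) return -1;
    return 0;
}

// Returns -1 if `lock` is free or owned by a different thread, 0 otherwise.
static int validate_unlock(unsigned int lock, int self) {
    auto it = lock_owner.find(lock);
    if (it == lock_owner.end() || it->second != self) return -1;
    return 0;
}
```

Each check returns -1 without printing anything, which matches the requirement that errors be handled silently.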

Memory Usage

The library should not leak memory over time as threads are created and destroyed. After a thread is finished (i.e., after it returns from the function given in thread_create), you must remember to deallocate the memory used for the thread and its stack. Deallocation does not need to happen immediately after thread termination as long as threads do not pile up over time without deallocation (i.e., don't leak memory).
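
One reason deallocation may need to be deferred is that a finishing thread is still executing on its own stack, so it cannot safely free that stack itself. The sketch below models one possible approach, in which the finishing thread records itself and the next thread to run frees it. All names here (FinishedThread, mark_finished, reap_pending, simulate_lifecycle) are hypothetical, and the context switch between the two steps is omitted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping record; names are illustrative only.
struct FinishedThread {
    char* stack;   // stack memory that must eventually be freed
};

static FinishedThread* pending_cleanup = nullptr;  // at most one at a time
static int stacks_freed = 0;                       // for the demonstration only

// Called by a thread that has just finished. It is still running on its
// own stack, so it must not free that stack here.
static void mark_finished(FinishedThread* t) {
    assert(pending_cleanup == nullptr);
    pending_cleanup = t;
}

// Called by whichever thread runs next, once it is on its own stack.
static void reap_pending() {
    if (pending_cleanup == nullptr) return;
    delete[] pending_cleanup->stack;
    delete pending_cleanup;
    pending_cleanup = nullptr;
    ++stacks_freed;
}

// Simulates n threads finishing one after another and returns how many
// stacks were freed along the way.
static int simulate_lifecycle(int n) {
    stacks_freed = 0;
    for (int i = 0; i < n; ++i) {
        FinishedThread* t = new FinishedThread;
        t->stack = new char[1024];
        mark_finished(t);   // thread i finishes
        reap_pending();     // the next thread to run cleans up after it
    }
    return stacks_freed;
}
```

With this scheme at most one finished thread is awaiting cleanup at any time, so memory use stays bounded no matter how many threads are created.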

Test Cases

While methodical testing should be a part of your development process for any program, doing so will be a required (and graded) part of writing your thread library. Writing test cases is common practice in the real world -- software companies maintain a suite of test cases for their programs and use this suite to check the program's correctness after a change (i.e., if a change introduced a bug, hopefully one or more of the existing test cases will fail to indicate that). You will write a comprehensive suite of test cases for your thread library, which will be part of what is submitted to the autograder.

Each test case will be a short C++ program that uses functions in the thread library (e.g. the example program from Project 2). Each test case should be run without any arguments and should not use any input files. Test cases should call exit(0) when run with a correct thread library (normally this will happen when your test case's last runnable thread ends or blocks). If you submit your disk scheduler as a test case, you will need to specify all inputs (number of requesters, queue size, and the list of requests) statically in the program in order to obey these rules.

Your test cases should not call start_preemptions, as your test suite will not be evaluated on how thoroughly it exercises the interrupt_enable and interrupt_disable calls in the library.

Your test suite may contain up to 20 test cases, and each test case may generate at most 10 KB of output and may take up to 60 seconds to run (these limits are much larger than needed). You will submit your suite of test cases together with your thread library to the autograder.

The autograder includes a number of buggy thread libraries that misbehave in various ways. Your test suite will be autograded based on how many of the buggy thread libraries are exposed by the test suite. A buggy library is considered exposed if the output on a particular test case differs between the buggy library and a correct library. As with the regular test cases, you will not be told what the actual bugs are within the buggy libraries.

Implementation

This section contains information on implementing various parts of the thread library.

Thread Creation and Context Switching

Linux provides several library calls to help implement user-level thread libraries. The calls you will need are getcontext, makecontext, setcontext, and swapcontext. These calls interact with ucontext_t structs, which contain the information comprising a thread (stack, program counter, register values, and so forth). You will want to read the Linux manual pages for these calls (e.g., man getcontext, or just consult Google). As a summary, here's how to use these calls to create a new thread:

    #include <ucontext.h>

    /*
     * Initialize a context structure by copying the current thread's context.
     * Necessary since ucontext_t objects contain machine-dependent information
     * that will be initialized by copying here.
     */
    getcontext(ucontext_ptr);           // ucontext_ptr has type ucontext_t*

    /*
     * Every thread needs a stack to facilitate making function calls.
     * Your thread library should allocate STACK_SIZE bytes (which is
     * defined in thread.h) for each stack.
     */
    char* stack = new char[STACK_SIZE];
    ucontext_ptr->uc_stack.ss_sp = stack;
    ucontext_ptr->uc_stack.ss_size = STACK_SIZE;
    ucontext_ptr->uc_stack.ss_flags = 0;
    ucontext_ptr->uc_link = NULL;

    /*
     * Direct the new thread to call start(arg1, arg2), as an example.
     * Does NOT actually start executing the new thread; you need
     * to use swapcontext or setcontext for that.
     */
    makecontext(ucontext_ptr, (void(*)()) start, 2, arg1, arg2);

Use swapcontext to save the context of the current thread and switch to the context of another thread. You can use setcontext to set the thread context without saving an existing context. Read the Linux manual pages for more details.

Depending on your library design, you may find the uc_link field of ucontext_t useful (but designs are also possible that do not depend on this field).
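
To see the calls working together, here is a minimal, self-contained sketch that creates one context, switches to it, and switches back. It assumes Linux with glibc. The names main_ctx, child_ctx, child_body, context_switch_demo, and kStackSize are hypothetical (kStackSize stands in for STACK_SIZE), and return values are ignored for brevity even though a real library should check them.

```cpp
#include <ucontext.h>
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only; not part of thread.h.
static ucontext_t* main_ctx = nullptr;    // context of the caller
static ucontext_t* child_ctx = nullptr;   // context of the new thread
static int trace = 0;                     // records the order of execution

// Body of the new thread: runs on its own stack, then switches back.
static void child_body(int a, int b) {
    trace = trace * 10 + (a + b);         // appends the digit a + b
    swapcontext(child_ctx, main_ctx);     // save child, resume the caller
    // Not reached in this sketch: nobody resumes child_ctx again.
}

// Creates a context, switches to it once, and returns the execution trace.
static int context_switch_demo() {
    const std::size_t kStackSize = 64 * 1024;  // stand-in for STACK_SIZE
    main_ctx = new ucontext_t;
    child_ctx = new ucontext_t;
    char* stack = new char[kStackSize];

    getcontext(child_ctx);                 // initialize in place
    child_ctx->uc_stack.ss_sp = stack;
    child_ctx->uc_stack.ss_size = kStackSize;
    child_ctx->uc_stack.ss_flags = 0;
    child_ctx->uc_link = nullptr;
    makecontext(child_ctx, (void (*)()) child_body, 2, 2, 3);

    trace = 1;                             // makecontext alone runs nothing
    swapcontext(main_ctx, child_ctx);      // save caller, run child_body
    trace = trace * 10 + 9;                // resumed after the child swaps back

    delete[] stack;                        // safe: running on the original stack
    delete child_ctx;
    delete main_ctx;
    return trace;
}
```

Calling context_switch_demo() returns 159: the caller records 1, the child appends 2 + 3 = 5, and the caller appends 9 once it is resumed.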

Managing ucontext_t structs

Do not use ucontext_t structs that are created by copying an existing ucontext_t struct (e.g., via struct assignment or memcpy). Instead, create ucontext structs through getcontext and makecontext, and then manage them by passing or storing pointers. That way, the original ucontext_t struct need never be copied.

Why is it a bad idea to copy a ucontext_t struct? The answer is that you don't know exactly what's in these structs. In particular, a ucontext_t struct happens to contain a pointer to itself (one of its data members). If you copy the struct itself, you will copy the value of this pointer, and the new copy will point to the old copy's data member. If you later deallocate the old copy (e.g., if it was a local variable), then the new copy will point to garbage. Copying structs is also a bad idea for performance.

Unfortunately, it is rather easy to accidentally copy ucontext_t structs. Some of the common ways are:

passing a struct by value into a function
copying the struct into an STL queue
declaring a local ucontext_t variable (almost always a bad idea, since it practically forces you to copy it)

You should probably be using new to allocate ucontext_t structs (or the struct containing a ucontext_t struct, etc). If you use the STL to allocate a ucontext_t struct, make sure that STL class doesn't move its objects around in memory. For example, using vector to allocate ucontext_t structs is a bad idea, because vectors will move memory around when they resize.

You will avoid all of these problems if you stick to working with ucontext_t pointers instead of the structs themselves.
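
A minimal sketch of the pointer-only approach follows, assuming a hypothetical thread control block named Tcb and a hypothetical ready_queue (neither name comes from thread.h). The context is allocated once with new, initialized at its final address, and only pointers to it travel through the STL container.

```cpp
#include <ucontext.h>
#include <cassert>
#include <queue>

// Hypothetical thread control block; the field names are illustrative.
struct Tcb {
    ucontext_t* ctx;   // heap-allocated once, never copied
    char* stack;       // stack memory owned by this thread
};

// The queue stores pointers, so pushing and popping copies only the
// pointer value, never the ucontext_t itself.
static std::queue<Tcb*> ready_queue;

// Allocates a control block and initializes its context in place.
static Tcb* make_tcb() {
    Tcb* t = new Tcb;
    t->ctx = new ucontext_t;
    t->stack = nullptr;        // stack setup omitted in this sketch
    getcontext(t->ctx);        // fills in *t->ctx at its final address
    return t;
}

// Pushes a block through the queue and reports whether the context
// still lives at the address where it was initialized.
static bool context_address_is_stable() {
    Tcb* t = make_tcb();
    ucontext_t* before = t->ctx;
    ready_queue.push(t);
    Tcb* popped = ready_queue.front();
    ready_queue.pop();
    bool same = (popped == t) && (popped->ctx == before);
    delete popped->ctx;
    delete popped;
    return same;
}
```

Because the container holds Tcb* values, it can reorder or reallocate its own storage without ever moving the ucontext_t.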

Ensuring Atomicity

To ensure atomicity of multiple operations, your thread library will enable and disable interrupts. Since this is a user-level thread library, it can't manipulate the hardware interrupt mask. Instead, you will interact with the interrupt library (libinterrupt.a) that simulates software interrupts. While applications (such as the disk scheduler) interact with the interrupt library solely via the start_preemptions call, the thread library itself will use several other calls, which are defined in the interrupt.h header file. This file will be included by your thread library, but is not included by application programs that use the thread library.

The relevant sections of interrupt.h are shown below:

/*
 * interrupt_disable() and interrupt_enable() simulate the hardware's interrupt
 * mask.  These functions provide a way to make sections of the thread library
 * code atomic.
 *
 * assert_interrupts_disabled() and assert_interrupts_enabled() can be used
 * as error checks inside the thread library.  They will assert (i.e. abort
 * the program and core dump) if the condition they test for is not met.
 *
 * These functions/macros should only be called in the thread library code.
 * They should NOT be used by the application program that uses the thread
 * library; application code should use locks to make sections of the code
 * atomic.
 */
extern void interrupt_disable(void);
extern void interrupt_enable(void);

#define assert_interrupts_disabled()          \
    assert_interrupts_private((char*) __FILE__, __LINE__, true)
#define assert_interrupts_enabled()         \
    assert_interrupts_private((char*) __FILE__, __LINE__, false)

Note that the interrupt_disable and interrupt_enable functions will abort the program if you try to call them when interrupts are already disabled or enabled, respectively. You may also note that interrupt.h does not allow you to test whether interrupts are currently enabled. While there is nothing stopping you from tracking the interrupt state yourself (e.g., via a boolean variable), you should not need to do so. Tracking the interrupt state explicitly is probably a sign that your interrupt handling logic isn't quite precise. At any specific point in your library, you should definitively know whether interrupts are enabled (and should use the assert_interrupts_disabled and assert_interrupts_enabled calls to make sure your assumptions are correct).

Lastly, remember that interrupts should be disabled only when executing in your thread library's code. Any code outside the thread library should never execute with interrupts disabled (which is why applications themselves do not include interrupt.h).
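
The pairing discipline can be sketched as follows. Since libinterrupt.a is not available outside the project, this example defines stand-in versions of interrupt_disable and interrupt_enable that mimic the abort-on-misuse behavior described above. Those stubs, the stub_disabled flag, example_library_call, and ready_count are all assumptions made for illustration; a real library would call the functions from interrupt.h and would not track the state itself.

```cpp
#include <cassert>

// Stand-ins for the interrupt library, defined here only so the sketch
// compiles on its own. A real library links against libinterrupt.a.
static bool stub_disabled = false;
static void interrupt_disable() { assert(!stub_disabled); stub_disabled = true; }
static void interrupt_enable()  { assert(stub_disabled);  stub_disabled = false; }

static int ready_count = 0;   // hypothetical shared library state

// Hypothetical library entry point: interrupts are disabled on entry and
// re-enabled on every path that returns to application code.
static int example_library_call(int arg) {
    interrupt_disable();
    if (arg < 0) {
        interrupt_enable();   // the error path must re-enable too
        return -1;
    }
    ++ready_count;            // shared state is touched only while disabled
    interrupt_enable();
    return 0;
}
```

A common mistake this pattern guards against is returning -1 early while interrupts are still disabled, which would leave application code running with interrupts off.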

Error Sources

There are three sources of errors that your library (or any OS code, for that matter) should handle. The first and most common source of errors comes from misbehaving user programs (e.g., misusing monitors, releasing an unowned lock, etc). A second source of errors comes from resources that the OS uses, such as hardware devices. Your thread library must detect if one of the lower-level functions it calls returns an error. For example, the C++ new operator may fail if the system is out of memory. By default, this operator will throw an exception if the system is out of memory, but you can also tell C++ to skip the exception and return null instead via the std::nothrow constant, which may be a simpler behavior to work with:

#include <new>                   // declares std::nothrow
int* p = new (std::nothrow) int; // allocate memory or set to null

For these first two sources of errors (user errors and OS resource errors), the thread function should detect the error and return -1. User programs can then detect the error and retry or exit.
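
As a sketch of handling a resource error, the helper below allocates a context and a stack with std::nothrow and reports failure through its return value. The function name alloc_thread_memory and its out-parameters are hypothetical; the point is only that a failed second allocation should release the first before returning -1.

```cpp
#include <ucontext.h>
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: allocates a context and a stack of stack_size bytes.
// Returns 0 on success, or -1 if either allocation fails, freeing
// anything that was already allocated.
static int alloc_thread_memory(std::size_t stack_size,
                               ucontext_t** ctx_out, char** stack_out) {
    ucontext_t* ctx = new (std::nothrow) ucontext_t;
    if (ctx == nullptr) return -1;

    char* stack = new (std::nothrow) char[stack_size];
    if (stack == nullptr) {
        delete ctx;           // do not leak the first allocation
        return -1;
    }

    *ctx_out = ctx;
    *stack_out = stack;
    return 0;
}
```

The caller owns both allocations on success and is responsible for freeing them with delete and delete[] once the thread has finished.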

A third source of errors is when the OS code itself (in this case, your thread library) has a bug. During development (which includes this entire semester), the best behavior in this case is for the OS to detect the bug quickly and assert (this is called a "panic" in kernel parlance). You should use assertion statements copiously in your thread library to check for bugs in your code.

Logistics

The starter files for project 3 are available at p3-starter.tar.gz. As in the last project, you can use wget and tar to download and unpack the files on the class server.

Write your thread library in C++ on Linux. Your library should be written in a single file named thread.cc, while each test case should be written in a separate file (named however you wish ending in .cc). None of the provided header files should be modified. As with Project 2, you may develop on a Mac using the provided Mac interrupt library, but the only supported development environment is on Linux.

The public functions in thread.h are declared extern, but all other functions and global variables in your thread library should be declared static to prevent naming conflicts with programs that link with your thread library. Your program may use any functions included in the standard C++ library, including (and especially) the STL. You should not use any libraries other than the standard C++ library.

Start by implementing thread_libinit, thread_create, and thread_yield. Don't worry at first about disabling and enabling interrupts. After you get that system working, implement the monitor functions. Finally, add calls to interrupt_disable and interrupt_enable to ensure your library works with arbitrary yield points. A correct concurrent program must work for any instruction interleaving. In other words, calls to thread_yield could be inserted anywhere in your code that interrupts are enabled and should not cause incorrect behavior.

Test cases should be designed methodically -- e.g., think of a particular behavior you would like to test, then write a test case that would distinguish a thread library providing the intended (correct) behavior from a thread library providing incorrect behavior. For example, you might write a test verifying that a thread is prohibited from acquiring a lock when another thread already owns it.

You can compile an application program (or test case) named app.cc against your thread library as follows:

g++ -Wall -std=c++11 -o app app.cc thread.cc libinterrupt.a -ldl

To run your disk scheduler using your thread library, you would simply substitute disk.cc for app.cc in the above example.

Note that on a Mac, you will likely need to add two additional flags when compiling: -D_XOPEN_SOURCE and -Wno-deprecated-declarations. The latter is needed because the ucontext.h functions are officially deprecated on modern systems (albeit for esoteric reasons and without any real replacement, which is why we are still using them in this project).

You can submit your program to the autograder as follows:

submit3310 3 thread.cc test1.cc test2.cc ...

Group Reports

If you are working in a group, in addition to your group's final program submission, each group member must individually submit a group report to me by email. Your group report, which will be kept anonymous from your partners, should summarize your contributions to the project as well as those of your partners. Your report does not need to be long (and could be as simple as "we all worked on the entirety of the project together in front of one machine"), but it must be received for your project to be considered submitted.

Group submissions will receive a single grade, but I reserve the right to adjust individual grades up or down from the group grade in the event of a clearly uneven distribution of work.

Writeup

In addition to your program itself, you will also write a short paper (~3-4 pages) that describes your thread library. The purpose of this writeup is to help you gain experience with technical writing. In particular, your paper should include the following:

an introductory section that highlights the purpose of the project
a design section that describes your major design choices and the data structures you used, focusing particularly on how safe synchronization is achieved (if a figure makes your explanation more clear, use one!)
an implementation section that overviews the structure of your code (at a reasonably high level - should not duplicate your code)
an evaluation section that describes how you tested your library
a conclusion that summarizes your project and reflects on the assignment in general

Upload your writeup as a PDF to Blackboard no later than 48 hours after the due date for the code. You only need to submit one copy of the writeup per team. Typesetting your writeup in LaTeX is encouraged but not required. The quality of your writeup will affect your project grade -- do not neglect it!

Evaluation

Your project will be graded on program correctness, design, and style, as well as the quality of your project writeup. Remember that the autograder will only check the correctness of your program, nothing else!

You can (and should) consult the Coding Design & Style Guide for tips on design and style issues. Please ask if you have any questions about what constitutes good program design and/or style that are not covered by the guide.