Project 6 - The Bowdoin Shell

This project should be completed individually or in groups of up to 3. Groups should only complete and turn in a single program.

This project will help you understand the inner workings of the shell, a program that you have been using all semester. You'll do this by writing your own simple shell program that supports Unix-style job control. Implementing your shell will teach you the core concepts of process control and signaling, as well as give you experience with low-level system programming in C.

Start by reading through the entire project description!

Project Dates

Assigned	Saturday, April 30.
Due	Friday, May 12 (firm, no flex days).

Project Overview

A shell is an interactive command-line interpreter that runs programs on behalf of the user. At a high level, a shell repeatedly prints a prompt, waits for a program name and command-line arguments on stdin, then carries out some action as directed by the input.

The shell program that you have been using all semester is bash (the Bourne Again Shell). Bash is only one of many shell programs, however - others include sh, tcsh, and csh. In this project, you will implement your own shell, bsh (the Bowdoin Shell).

Unix Shell Basics

As we saw in project 2, a command-line string is a sequence of text words delimited by whitespace. The first word of the string is either the pathname of an executable file (i.e., a program) or a built-in command. The remaining words are command-line arguments. If the first word is a built-in command, the shell immediately executes the command within the current shell process. Otherwise, the shell forks a child process, then executes the specified program in the context of the child. The set of all child processes created as a result of interpreting a single command (there may be multiple, if the program itself forks) is known as a job. A job can also contain multiple child processes connected by Unix pipes (denoted in a command by vertical bars, |), which allow for passing output from one program as input into another program.

By default, a job runs in the foreground, which means that the shell waits for the job to terminate before prompting for the next command string. Thus, at any point in time, at most one job can be running in the foreground. However, if the command string ends with an ampersand (&), then the job runs in the background, which means that the shell does not wait for the job to terminate before printing another prompt and waiting for another command string. Thus, an arbitrary number of jobs can be running in the background at a given time.

Typing the following command runs the program ls (located in the directory /bin) in the foreground with command line arguments -l -d:

bsh> /bin/ls -l -d

More specifically, the command above executes the main function of /bin/ls with the following values of argc and argv:

argc is 3
argv[0] is '/bin/ls'
argv[1] is '-l'
argv[2] is '-d'

Alternately, typing the same command with an ampersand will run ls in the background:

bsh> /bin/ls -l -d &
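
To make the mapping from a command string to argc and argv concrete, here is a minimal sketch of a whitespace tokenizer. The name split_command and the MAX_WORDS limit are hypothetical, and this is not the parseline function provided in the starter code (which you should use in your shell). The sketch only recognizes & when it appears as a separate final word.

```c
#include <string.h>

#define MAX_WORDS 16

/* Hypothetical helper, not the skeleton's parseline. Splits `line` in place
 * on whitespace, stores the words in `words`, and NULL-terminates the array
 * the way execve expects. Sets *background to 1 if the last word is "&"
 * (that word is then dropped). Returns the word count (argc). */
int split_command(char *line, char *words[MAX_WORDS], int *background)
{
    const char *delims = " \t\n";
    int count = 0;
    char *p = line;

    while (count < MAX_WORDS - 1) {
        p += strspn(p, delims);      /* skip leading whitespace */
        if (*p == '\0')
            break;
        words[count++] = p;          /* start of the next word */
        p += strcspn(p, delims);     /* advance to the end of the word */
        if (*p == '\0')
            break;
        *p++ = '\0';                 /* terminate the word in place */
    }
    words[count] = NULL;

    /* A trailing "&" requests a background job and is not an argument. */
    *background = (count > 0 && strcmp(words[count - 1], "&") == 0);
    if (*background)
        words[--count] = NULL;
    return count;
}
```

Calling split_command on the string "/bin/ls -l -d &" yields the same three-element argv listed above, with the background flag set.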

Job Control

Unix shells support the notion of job control, which allows users to move jobs back and forth between background and foreground, and to change the process state (running, stopped, or terminated) of all the processes in a job. Job states can be changed via signals: typing Ctrl-Z causes a SIGTSTP signal to be delivered to every process in the foreground job. The default action for SIGTSTP is to place the process in the stopped state, where it remains until it is awakened by the receipt of a SIGCONT signal. Typing Ctrl-C causes a SIGINT signal to be delivered to each process in the foreground job. The default action for SIGINT is to terminate the process.
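
The state changes described above can be observed directly with a small experiment. The sketch below is only an illustration, not part of the shell: the names signal_job and demo_job_states are hypothetical, and it assumes that SIGTSTP and SIGINT have their default dispositions in the calling process. It places a child in its own process group, stops it with SIGTSTP, resumes it with SIGCONT, and terminates it with SIGINT.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends `sig` to every process in the group whose ID is `pgid`.
 * A negative first argument to kill targets a whole process group. */
int signal_job(pid_t pgid, int sig)
{
    return kill(-pgid, sig);
}

/* Walks one child job through stopped -> running -> terminated.
 * Returns 0 if each state change was observed, -1 otherwise. */
int demo_job_states(void)
{
    int status;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);          /* child: become leader of its own group */
        for (;;)
            pause();            /* idle until a signal arrives */
    }
    /* Demo-only: also set the group from the parent so that it is
     * guaranteed to exist before the first signal is sent. */
    if (setpgid(pid, pid) < 0)
        return -1;

    /* SIGTSTP (what Ctrl-Z delivers) stops the job by default. */
    if (signal_job(pid, SIGTSTP) < 0)
        return -1;
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return -1;

    /* SIGCONT wakes the stopped job; SIGINT (Ctrl-C) then terminates it. */
    if (signal_job(pid, SIGCONT) < 0 || signal_job(pid, SIGINT) < 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid || !WIFSIGNALED(status))
        return -1;
    return WTERMSIG(status) == SIGINT ? 0 : -1;
}
```

Calling demo_job_states() from a small main function should return 0 on a typical Linux system.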

Unix shells also provide various built-in commands that support job control. Key commands are listed below:

jobs: List the running and stopped background jobs.
bg <job>: Change a stopped background job to a running background job.
fg <job>: Change a stopped or running background job to a running job in the foreground.
kill <job>: Terminate a job (more specifically, sends a SIGTERM signal to the job, for which the default behavior is to terminate the process).

The `bsh` Specification

The bsh shell should have the following features:

The prompt should be the string "bsh> ".
As described previously, the command string typed by the user should be either a built-in command or a program name, possibly followed by command-line arguments. Programs should be executed in the context of a child process forked by the shell.
Typing Ctrl-C should cause a SIGINT signal to be sent to the current foreground job (i.e., the initial child that was forked for that job as well as any descendant processes of that child). Typing Ctrl-Z should work the same except that the signal sent is SIGTSTP.
If the command string ends with &, the job should be run in the background. Otherwise, it should run in the foreground.
Each job can be identified either by a process ID (PID) or by a job ID (JID). PIDs are positive integers assigned by the operating system when processes are created. JIDs are positive integers assigned by bsh to each job. On the command line, a JID is denoted by the prefix '%'. For instance, '%5' denotes JID 5, while '5' denotes PID 5.
bsh should support the following built-in commands:
- quit: Terminate the shell.
- jobs: List all background jobs.
- bg <job>: Restarts <job> by sending it a SIGCONT signal, then runs it in the background. The job argument can be either a PID or a JID.
- fg <job>: Restarts <job> by sending it a SIGCONT signal, then runs it in the foreground. The job argument can be either a PID or a JID.
You do not need to support pipes (|) or I/O redirection (< and >) in your shell.
Your shell should not use the sleep system call (one particular case which may tempt you to use sleep is described in the advice section).
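
As an illustration of the PID versus JID convention above, here is a minimal sketch of how a bg/fg argument might be classified. The name parse_job_arg is hypothetical, and how strictly malformed arguments (such as "5abc") should be rejected is an assumption; check the reference shell for the exact behavior.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: classifies a bg/fg argument.
 * Returns the numeric ID (> 0) and sets *is_jid to 1 for "%N" (a JID)
 * or 0 for "N" (a PID). Returns -1 if the argument has neither form. */
int parse_job_arg(const char *arg, int *is_jid)
{
    if (arg == NULL)
        return -1;

    *is_jid = (arg[0] == '%');
    const char *digits = arg + *is_jid;   /* skip the '%' prefix, if any */
    if (!isdigit((unsigned char)digits[0]))
        return -1;

    char *end;
    errno = 0;
    long id = strtol(digits, &end, 10);
    /* Assumption: trailing junk and out-of-range values are rejected. */
    if (errno != 0 || *end != '\0' || id <= 0 || id > INT_MAX)
        return -1;
    return (int)id;
}
```

With this sketch, "%5" is reported as JID 5 and "5" as PID 5; looking the ID up in the job list (and printing the appropriate error if it is missing) is a separate step.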

Code Structure

To start, you have been provided with a functional skeleton of the shell. The starting code implements a number of less interesting functions (such as command line parsing and error reporting) that you should use while implementing the complete shell, allowing you to focus on the more interesting components. In particular, you are responsible for implementing each of the empty functions listed below. To give you an idea of the complexity of each function, also listed below is the number of code lines implementing each function in my reference shell (including comments):

eval: Main routine that parses and interprets the command line. [70 lines]
builtin_cmd: Recognizes and interprets the built-in commands listed above (bg and fg commands should result in calling do_bgfg as below). [25 lines]
do_bgfg: Implements the bg and fg built-in commands. [50 lines]
waitfg: Waits for a foreground job to complete. [20 lines]
sigchld_handler: Handler for SIGCHLD signals. [80 lines]
sigint_handler: Handler for SIGINT (Ctrl-C) signals. [15 lines]
sigtstp_handler: Handler for SIGTSTP (Ctrl-Z) signals. [15 lines]

Note: While the function lengths given above are fairly modest, don't be lulled into a false sense of security! System programming involves writing dense, precise, and often error-prone code, and is likely to require significant debugging time.

The single file you should modify that contains the code of your shell is bsh.c. The included Makefile will compile the shell for you. To run your shell, simply execute it:

unix> ./bsh
bsh> [type commands to your shell here]

Included Files

You have also been provided with a number of tools to help you check your work. All included files are described below:

bsh.c: The code of your shell.
bshref: The reference shell. Run this program if you have any questions about how your shell should behave. Your shell should emit identical output to the reference solution (except for PIDs, which change from run to run).
sdriver.pl: A shell driver program that executes the shell and feeds it commands and signals as directed by a trace file, then captures and displays the output from the shell.
trace{01-16}.txt: 16 trace files that you will use in conjunction with the shell driver to test the correctness of your shell. The lower-numbered trace files run very simple tests, while the higher-numbered trace files run more complicated tests.
bshref.out: The output of the reference solution on all traces, for your reference. This might be more convenient than manually running the shell driver on all trace files.
myspin.c: A test program that sleeps for a specified number of seconds.
mysplit.c: A test program that forks a child, which then sleeps for a specified number of seconds.
mystop.c: A test program that sleeps for a specified number of seconds, then sends a SIGTSTP signal to itself.
myint.c: A test program that sleeps for a specified number of seconds, then sends a SIGINT signal to itself.
Makefile: Builds the shell and all test programs. Also provides useful targets for testing the shell (see below).

Use the -h flag to see the usage string for sdriver.pl:

unix> ./sdriver.pl -h
Usage: ./sdriver.pl [-hv] -t <trace> -s <shellprog> -a <args>
Options:
  -h            Print this message
  -v            Be more verbose
  -t <trace>    Trace file
  -s <shell>    Shell program to test
  -a <args>     Shell arguments

For example, you could run the shell driver on trace01.txt by typing the following:

unix> ./sdriver.pl -t trace01.txt -s ./bsh

Similarly, you could run the shell driver on the reference shell by simply substituting bsh with bshref in the command above.

More simply, you can use the included Makefile to run the driver on the trace files. To pass trace01.txt through your shell, you can just run:

unix> make test01

Similarly, to pass trace01.txt through the reference shell, you can run:

unix> make rtest01

The output of your shell from the trace files is exactly the same as the output you would have gotten from running your shell interactively, except for an initial comment that identifies the trace.

Output Formatting and Logging

As stated above, the output of your shell should exactly match that of the reference shell (except for PIDs, which will be different). This means that your shell's own messages should contain the same information in the same format as the reference shell.

Particular messages that you should look out for include the following (refer to the reference shell if you are unsure about any of the exact formatting -- trace 14 in particular exercises many of the error messages):

Running an invalid program named prog:
```
prog: Command not found
```
Running a background job or switching an existing job to the background with the given jid, pid, and cmdline:
```
[jid] (pid) cmdline
```
Running bg or fg (substitute fg for the latter) without specifying a job:
```
bg command requires PID or %jobid argument
```
Specifying an invalid pid to bg/fg:
```
(pid): No such process
```
Specifying an invalid jid to bg/fg:
```
%jid: No such job
```
Specifying something other than a pid or jid for bg/fg (substitute fg for the latter):
```
bg: argument must be a PID or %jobid
```
Job terminated by signal signum (use WTERMSIG to find signum):
```
Job [jid] (pid) terminated by signal signum
```
Job stopped by signal signum (use WSTOPSIG to find signum):
```
Job [jid] (pid) stopped by signal signum
```

Note that the above messages should always be printed -- you are welcome to add additional logging in verbose mode, but this output does not need to match that of the reference shell.
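
To show how WTERMSIG and WSTOPSIG fit together with the last two messages, here is a minimal sketch that turns a waitpid status into the corresponding text. The names describe_status and demo_self_signal are hypothetical, the trailing newline is an assumption, and the sketch only formats into a buffer; actually printing the message (especially from within a signal handler) is left to the helpers in the starter code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: writes the message matching `status` into `buf`.
 * Returns the message length, or 0 (empty string) for a normal exit. */
int describe_status(char *buf, size_t size, int jid, pid_t pid, int status)
{
    if (WIFSIGNALED(status))
        return snprintf(buf, size, "Job [%d] (%d) terminated by signal %d\n",
                        jid, (int)pid, WTERMSIG(status));
    if (WIFSTOPPED(status))
        return snprintf(buf, size, "Job [%d] (%d) stopped by signal %d\n",
                        jid, (int)pid, WSTOPSIG(status));
    if (size > 0)
        buf[0] = '\0';
    return 0;
}

/* Demo: forks a child that sends itself `sig` (similar in spirit to myint),
 * then describes how it ended, using job ID 1. Pass a terminating signal;
 * this demo does not wait for stopped children. Returns -1 on error. */
int demo_self_signal(char *buf, size_t size, int sig)
{
    int status;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(sig, SIG_DFL);   /* make sure the default action applies */
        raise(sig);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return describe_status(buf, size, 1, pid, status);
}
```

Calling demo_self_signal with SIGINT fills the buffer with a "terminated by signal" message that contains the child's PID and the numeric value of SIGINT.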

General and Function-Specific Advice

Here are some useful tips for working on your shell:

Chapter 8 of the textbook (Exceptional Control Flow) is an excellent reference for this project, particularly sections 8.4 and 8.5.
Start by carefully reading the starter code in bsh.c and making particular note of helper functions that you may wish to use -- e.g., unix_error and safe_printf.
Use the trace files to guide the development of your shell. Start with trace01.txt and make sure that your shell produces output that is identical to that of the reference shell. Then move onto trace02.txt, and so forth.
Normally, when you type a program name like pwd (or ls, etc) at the shell, the shell searches a number of predefined directories looking for the pwd program (this set of directories is called the PATH environment variable). However, since bsh does not use a PATH, you will need to specify the exact location of every program you want to run that does not exist in your current directory. For example, you would need to type /bin/pwd instead of just pwd like you normally would. If you want to know the exact location of a particular program, you can use the which program, e.g., 'which pwd'.
Full-window programs like more, less, nano, vi, and emacs do strange things with the terminal settings. Don't run these programs from your shell - instead, stick with simple text-based programs like /bin/ls, /bin/ps, /bin/pwd, and /bin/echo (as well as the various test programs provided to you).
Useful system calls that you will want to use include fork, execve, getpid, waitpid, kill, setpgid, sigprocmask, and sigsuspend. Also refer to the waitpid options and status macros detailed in the class slides.

Tips for specific parts of the shell are given below.

Eval

Use the provided parseline function to parse the command line and build the argument array (argv). You should use the constructed argv to pass to execve when launching the target program. For the third parameter to execve, pass the predefined global variable environ (which is the set of "environment variables" defined in the current shell session).
When adding a new job to the job list, you must be careful to ensure that the job list is not corrupted. In particular, a nasty bug can occur when the shell forks a child, but before the parent actually adds the child to the job list by calling addjob, the child exits and is reaped by sigchld_handler. Think carefully about what would happen to the job list in this situation. This type of bug is called a race condition, as it depends on two processes "racing" during concurrent execution, and is nasty to debug as it occurs nondeterministically!

To deal with the above problem and protect the job list, you'll want to prevent your signal handlers from running until the new child job is actually added to the job list. The sigprocmask function will be very useful here.
Also related to the above issue, note that children inherit the blocked vectors of their parents, and therefore the child must be sure to unblock its signals before it executes the new program.
When you launch bsh from the regular Unix shell, your bsh shell is running in the foreground process group. If your shell then creates a child process, that child will also be a member of the foreground process group. Since typing Ctrl-C sends a SIGINT to every process in the foreground group, doing so will send a SIGINT to your shell (which is good), but also to every process that your shell created (which is bad).

To work around this, after calling fork but before the child calls execve, the child should call setpgid(0, 0), which puts the child in a new process group whose group ID is equal to the child's PID. This ensures that there will be only one process (your shell) in the foreground process group. When you type Ctrl-C, the shell should catch the resulting SIGINT and then forward it to the appropriate foreground job (or more precisely, the process group that contains the foreground job).
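
The ordering of these steps is easy to get wrong, so here is a minimal sketch of one way to arrange them. It is not the reference implementation: the name launch_job is hypothetical, the job_slot parameter is a stand-in for the real job list (where your shell would call addjob), and the stream and exit status used when execve fails are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: forks and executes argv[0] with SIGCHLD blocked
 * until the parent has recorded the child in *job_slot (a stand-in for
 * the job list). Returns the child's PID, or -1 on error. */
pid_t launch_job(char *const argv[], pid_t *job_slot)
{
    sigset_t block, prev;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);

    /* Block SIGCHLD BEFORE forking so the handler cannot run early. */
    if (sigprocmask(SIG_BLOCK, &block, &prev) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &prev, NULL);
        return -1;
    }
    if (pid == 0) {
        /* Child: leave the shell's process group, restore the inherited
         * signal mask, then run the program. */
        if (setpgid(0, 0) < 0)
            _exit(1);
        if (sigprocmask(SIG_SETMASK, &prev, NULL) < 0)
            _exit(1);
        execve(argv[0], argv, environ);
        /* Only reached if execve failed. */
        printf("%s: Command not found\n", argv[0]);
        fflush(stdout);
        _exit(1);
    }

    /* Parent: record the job while SIGCHLD is still blocked... */
    *job_slot = pid;
    /* ...and only then allow the SIGCHLD handler to run. */
    if (sigprocmask(SIG_SETMASK, &prev, NULL) < 0)
        return -1;
    return pid;
}
```

Because SIGCHLD stays blocked from just before fork until just after the job is recorded, a child that exits immediately cannot be reaped before the parent knows about it.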

Signal Handlers

When you implement your signal handlers, be sure to send SIGINT and SIGTSTP signals to the entire foreground process group, using "-pid" instead of "pid" in the argument to the kill function (a negative PID other than -1 sends the signal to an entire process group). Note that sdriver.pl tests for this error.
Due to the effective concurrency of signal handlers with the rest of the program code, be careful about what code you put in signal handler functions. For example, you should not use printf in signal handlers (since weird things may happen if the signal handler is called when the program is already in the middle of executing printf). You can use the provided safe_printf function as a drop-in "safe" printf within handlers.
One of the tricky parts of the shell design is deciding on the allocation of work between waitfg and sigchld_handler - in particular, deciding where to reap child processes. A recommended approach is to perform reaping (via waitpid) entirely within sigchld_handler, and have waitfg simply pause until the specified pid is no longer in the foreground before returning. While other approaches are possible, it is simpler to do all reaping in the handler.
If you follow this suggested approach, you will need to think carefully about how to write waitfg. The easiest option is to use sleep inside a loop to periodically check that the process is still in the foreground, but this pattern is called busy-waiting and should be avoided (as it wastes CPU time and will likely wait longer than needed). Instead, you should use the sigsuspend function as a mechanism to block until a signal is received and processed (at which point you can check if the process is still in the foreground).
System calls generally return -1 and set the global errno variable when an error occurs (which you can easily report using the unix_error function). However, there's one special case to be aware of: if waitpid has no remaining children to wait on, it will return -1 and set errno to the value ECHILD. Importantly, this is *not* an actual error, despite waitpid returning -1. Assuming you are checking your return values for errors (which you should!), you may need to filter out this condition.
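
The division of labor between the handler and waitfg can be sketched as follows. This is a simplified illustration under stated assumptions, not the reference solution: the single fg_pid variable is a hypothetical stand-in for the job list, the function names are hypothetical, and real code would also print the required messages and update job states.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the job list: foreground PID, 0 if none. */
static volatile sig_atomic_t fg_pid = 0;

/* SIGCHLD handler: reaps every child that has terminated or stopped,
 * without ever blocking. */
static void reap_children(int sig)
{
    (void)sig;
    int saved_errno = errno;    /* handlers should preserve errno */
    int status;
    pid_t pid;

    while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
        if (pid == (pid_t)fg_pid)
            fg_pid = 0;         /* no longer running in the foreground */
    }
    /* pid == 0: children remain, but none changed state.
     * pid == -1 with errno == ECHILD: no children left (not an error). */
    errno = saved_errno;
}

/* Sleeps, without polling, until `pid` leaves the foreground. */
int wait_for_foreground(pid_t pid)
{
    sigset_t block, prev, wait_mask;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &prev) < 0)
        return -1;
    wait_mask = prev;
    sigdelset(&wait_mask, SIGCHLD);     /* mask used while suspended */

    /* The check runs with SIGCHLD blocked; sigsuspend atomically
     * unblocks it and sleeps, so a wakeup cannot be missed. */
    while ((pid_t)fg_pid == pid)
        sigsuspend(&wait_mask);
    return sigprocmask(SIG_SETMASK, &prev, NULL);
}

/* Demo: runs a child that exits at once; returns 0 once it is reaped. */
int demo_wait(void)
{
    struct sigaction sa;
    sigset_t block, prev;

    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &prev) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);
    fg_pid = (sig_atomic_t)pid;         /* recorded before unblocking */
    if (sigprocmask(SIG_SETMASK, &prev, NULL) < 0)
        return -1;

    if (wait_for_foreground(pid) < 0)
        return -1;
    return fg_pid == 0 ? 0 : -1;
}
```

Note that the foreground check and the call to sigsuspend happen with SIGCHLD blocked. Checking first and then calling pause would leave a window in which the signal could arrive between the two steps, and the shell would then sleep forever.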

Using GDB in a Multi-Process Program

To debug a multi-process program such as your shell using gdb, you will need a few extra commands:

(gdb) set detach-on-fork on/off

The above command sets whether child processes will be detached when fork is called. The default is on (i.e., the child runs without any interruption). If you turn this option off, the child is suspended as soon as it is forked. Then, you can use the inferior command to switch between the various processes started by the shell:

(gdb) info inferiors
... listing of processes ...
(gdb) inferior 1
[Switching to inferior 1 [process 0] (<noexec>)]

Another useful option is the following:

(gdb) set follow-fork-mode parent/child

The above sets which process gdb will automatically follow (either the parent or the child -- parent is the default) after fork is called.

Logistics

You are responsible for completing the contents of the bsh.c file. You should not modify any other file. You are responsible for ensuring that your program runs on the class server, regardless of where else you may be writing code. Since other systems may have the same system calls but with slightly different behavior, you are strongly urged to develop your code entirely on the class server.

As usual, your final submission will consist of your committed bsh.c file at the time of the due date. Each group need only make one submission.

Note for groups: If you are working in a group, each member must individually send me an email giving a brief description of your own and your group members' contributions to the project after your submission. The purpose of this requirement is to encourage full group participation on the project. Your email does not need to be detailed, but your project is not considered complete until your summary is received. Your email will be kept confidential by me and not shared. However, in the case of a clearly inequitable distribution of work, I may adjust individual grades up or down from the group score.

Evaluation

You will be evaluated both on the correctness of your shell implementation (as determined by the 16 trace files) as well as the style and overall quality of your program (as determined by me).

Particular things to watch out for:

For full credit, your program must compile without any warnings on the class server.
Your program must also check the return value of every system call.
Your program must not use the sleep system call.
Your shell will be tested on the class server, where it should produce identical output to the reference shell, with two exceptions:
- The PIDs can (and will) be different.
- The output of the /bin/ps commands in traces 11, 12, and 13 will be different from run to run (since the ps program displays PIDs, among other things). However, the running states of any mysplit processes in the ps output should be identical.

In addition, you should follow all standard and sensible style guidelines, such as using good variable names, commenting, using consistent indentation, etc. While you do not have to adhere to any particular set of conventions (e.g., where to put parentheses), you SHOULD be consistent with whatever conventions you choose. If you are working in a group, agree on a single set of conventions and stick to them!

You do not need to define any functions beyond those already specified in bsh.c, but you are welcome to do so if you wish.

Please ask if you have any questions about what constitutes good style or what is expected!