CSCI 2330
Foundations of Computer Systems

Bowdoin College
Spring 2024
Instructor: Sean Barker

Lab 6 - The Bowdoin Shell

Release Date: Monday, April 29.
Acceptance Deadline: Wednesday, May 1, 11:59 pm.
Due Date: Wednesday, May 8, 11:59 pm.
Collaboration Policy: Level 1
Group Policy: Pair-optional (you may work in a group of 2 if you wish)

In this lab, you will write a shell program that supports Unix-style job control. Doing so will help you understand the core concepts of process control and will also touch on the challenges of concurrent programming. You will also gain experience working with system calls and signals to interact with the operating system.

Lab Overview

A shell is an interactive command-line interpreter that runs programs on behalf of the user. At a high level, a shell repeatedly prints a prompt, waits for a line of input to be entered, then carries out some action as directed by the input (e.g., launching a program with the specified command-line arguments).

The shell program that you have been using all semester is bash (the Bourne Again Shell). Bash is only one of many shell programs, however; others include sh, tcsh, zsh, and csh. In this lab, you will implement your own shell: bsh (the Bowdoin Shell). Your shell will support launching and managing multiple jobs in many of the same ways that real shells do.

Unix Shell Basics

As we saw in Lab 2, a command-line string is a sequence of text words delimited by whitespace. The first word of the string is either the filename of a program (e.g., ls or ./myprogram or /full/path/to/myprog) or a built-in shell command (e.g., jobs). The remaining words are command-line arguments. If the first word is a built-in command, the shell immediately executes the command within the current shell process. Otherwise, the shell forks a child process, then executes the specified program in the context of the child. When a new program is executed, we refer to the new process as a job. Note that if the new process itself forks, then any new processes it creates will be considered part of the same job (so a job may ultimately contain more than one process).
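The split between built-in commands and external programs can be sketched as follows. This is a minimal illustration rather than the structure required by bsh.c: the helper names is_builtin and run_foreground are hypothetical, the program is given by its full path, and job tracking, signal handling, and background jobs are all left out.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if word names one of the shell's
 * built-in commands (quit, jobs, fg, bg), and 0 otherwise. */
int is_builtin(const char *word) {
    return strcmp(word, "quit") == 0 || strcmp(word, "jobs") == 0 ||
           strcmp(word, "fg") == 0 || strcmp(word, "bg") == 0;
}

/* Hypothetical helper: forks a child, executes argv[0] in the child,
 * and waits for it. Returns the child's exit status, or -1 on error
 * or if the child did not exit normally. */
int run_foreground(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                 /* fork failed */
    }
    if (pid == 0) {
        execv(argv[0], argv);      /* only returns if the exec failed */
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would check is_builtin(argv[0]) first and only fall through to run_foreground for external programs. Blocking in waitpid like this is the simplest way to wait for a foreground job, but it is only one possible design.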

By default, a job runs in the foreground, which means that the shell waits for the job to terminate before prompting for the next command string. Thus, at any point in time, at most one job can be running in the foreground. However, if the command string ends with an ampersand (&), then the job runs in the background. For a background job, the shell does not wait for the job to terminate, but instead immediately prints another prompt and accepts another command string. As a result, an arbitrary number of background jobs can be running at a given time, in addition to at most one foreground job.

For example, suppose we want to run the ls program with the two command-line arguments -l and -d. To specify the program name, we can either specify the full path to the program (which is /bin/ls here) or we can just type ls and let the shell locate the program in known program directories (which typically include the /bin directory). The set of directories to search is called the PATH; for the purpose of this assignment, we will assume that the PATH is just the /bin directory. We can run the desired job in the foreground by typing the following command, assuming the shell prompt is bsh>:

bsh> ls -l -d

Specifically, entering the above will execute the main function of the /bin/ls program with the following values of argc and argv:

Alternately, typing the same command with an ampersand will run ls in the background:

bsh> ls -l -d &
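To make the relationship between a command string, argc, argv, and the trailing ampersand concrete, here is a small sketch. The function name split_command and the MAX_ARGS limit are hypothetical, and the sketch assumes the ampersand appears as its own whitespace-separated word. The starter code already includes a command-line parser, so this is for illustration only.

```c
#include <string.h>

#define MAX_ARGS 64

/* Splits line in place on whitespace. Stores the words in words[]
 * (NULL-terminated), sets *background to 1 if the final word is "&"
 * (and drops that word), and returns the number of remaining words. */
int split_command(char *line, char *words[MAX_ARGS], int *background) {
    int count = 0;
    *background = 0;
    for (char *word = strtok(line, " \t\n");
         word != NULL && count < MAX_ARGS - 1;
         word = strtok(NULL, " \t\n")) {
        words[count++] = word;
    }
    if (count > 0 && strcmp(words[count - 1], "&") == 0) {
        *background = 1;
        count--;
    }
    words[count] = NULL;
    return count;
}
```

For the string "ls -l -d &", this yields three words (ls, -l, and -d) with the background flag set. The first word is kept exactly as typed, so resolving ls to /bin/ls would be a separate step.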

Job Control

Unix shells support the notion of job control, which allows users to manage the job state of each active job. In particular, active jobs can be in one of three states:

  1. Foreground: The job is running in the foreground (i.e., as an interactive job). At most one job may be in this state at any given time.
  2. Background: The job is running in the background. Any number of jobs may be in this state.
  3. Stopped: The job is stopped (aka suspended). Stopped jobs do not continue executing until they are moved back into the foreground or background state.

Job states can be changed using signals, which can be triggered either by keyboard commands or by invoking various built-in shell commands that support job control. In particular, we are interested in two specific keyboard commands:
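On a typical Unix terminal, Ctrl-C delivers the SIGINT signal and Ctrl-Z delivers the SIGTSTP signal. As a rough sketch of how a program can observe these signals, the code below installs a handler that records the most recent signal number. The names record_signal, install_handler, and last_signal_seen are hypothetical. A real shell handler would do more than record the signal (for example, acting on the foreground job).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Most recent signal number seen by the handler (0 if none yet). */
static volatile sig_atomic_t last_signal = 0;

/* Handler: only records the signal, which is safe to do in a handler. */
static void record_signal(int sig) {
    last_signal = sig;
}

/* Installs record_signal as the handler for sig.
 * Returns 0 on success or -1 on failure (as sigaction does). */
int install_handler(int sig) {
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = record_signal;
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(sig, &action, NULL);
}

/* Returns the most recently recorded signal number. */
int last_signal_seen(void) {
    return last_signal;
}
```

After calling install_handler(SIGINT) and install_handler(SIGTSTP), pressing Ctrl-C or Ctrl-Z no longer terminates or stops the program itself. The handler runs instead.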

We are further interested in the following three built-in shell commands involved in viewing and managing jobs:

Jobs can be specified to fg and bg either using a process ID (PID) or a job ID (JID). PIDs are positive integers assigned by the operating system when processes are created. JIDs are positive integers assigned by the shell itself to each job. By convention, a JID is denoted by the prefix % to distinguish it from a PID. For example, fg %5 says to run the job with JID 5 in the foreground, while fg 5 says to run the job with PID 5 in the foreground.
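The PID versus JID convention can be illustrated with a short parsing sketch. The type spec_kind and the function parse_job_spec are hypothetical names, and the validation here (rejecting anything that is not a positive decimal integer) is an assumption; check the reference shell for how it actually treats malformed arguments.

```c
#include <errno.h>
#include <stdlib.h>

typedef enum { SPEC_INVALID, SPEC_PID, SPEC_JID } spec_kind;

/* Parses an argument such as "5" (a PID) or "%5" (a JID).
 * On success, stores the number in *id_out and returns the kind.
 * Returns SPEC_INVALID for anything that is not a positive integer. */
spec_kind parse_job_spec(const char *arg, long *id_out) {
    spec_kind kind = SPEC_PID;
    if (arg == NULL) {
        return SPEC_INVALID;
    }
    if (arg[0] == '%') {            /* the % prefix marks a JID */
        kind = SPEC_JID;
        arg++;
    }
    if (arg[0] < '0' || arg[0] > '9') {
        return SPEC_INVALID;        /* empty, signed, or non-numeric */
    }
    char *end;
    errno = 0;
    long value = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0' || value <= 0) {
        return SPEC_INVALID;
    }
    *id_out = value;
    return kind;
}
```

With this sketch, "%5" parses as JID 5 and "5" parses as PID 5, matching the fg %5 and fg 5 examples above.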

To summarize, the following active job state transitions are possible through the actions indicated:

Code Overview

To start, you have been provided with a functional skeleton of the shell. The starting code implements a number of less interesting functions (such as command line parsing and utility methods for manipulating the job list) that you should use while implementing the complete shell, allowing you to focus on the more interesting components.

The only file you should modify is bsh.c, which contains skeletons of every function that you need to complete. You do not need to define any functions beyond those already specified in bsh.c, but you are welcome to do so if you wish.

A summary of the functions that you must implement is given below.

You should not modify any included functions other than those listed above.

Compile your shell using the included Makefile by running make. Then, to run your shell, simply execute it:

$ ./bsh
bsh> [type commands to your shell here]

Note that the bsh> command prompt indicates that you are within bsh rather than the standard system shell bash. You can exit out of bsh by typing Ctrl-D, which indicates that there is no more input, and will cause the shell to exit. You can alternately use the built-in quit command once you get that working (which will be one of your first tasks).
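The way Ctrl-D ends a session can be seen in a simplified read loop. When there is no more input, fgets returns NULL and the loop stops. The names read_command and command_loop and the MAXLINE limit are hypothetical, and the provided skeleton already contains its own main loop, so this sketch only illustrates the end-of-input behavior.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 1024

/* Reads one line from in into buf and strips the trailing newline.
 * Returns 1 on success, or 0 when there is no more input
 * (which is what happens when the user types Ctrl-D). */
int read_command(FILE *in, char *buf, size_t size) {
    if (fgets(buf, (int)size, in) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Prompts and reads commands until end of input or "quit".
 * Returns the number of commands read before stopping. */
int command_loop(FILE *in) {
    char line[MAXLINE];
    int count = 0;
    for (;;) {
        printf("bsh> ");
        fflush(stdout);
        if (!read_command(in, line, sizeof line)) {
            break;                  /* end of input */
        }
        if (strcmp(line, "quit") == 0) {
            break;                  /* built-in quit */
        }
        count++;                    /* a real shell would evaluate line here */
    }
    return count;
}
```

Calling command_loop(stdin) would give an interactive loop that exits on either Ctrl-D or quit.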

Included Files

You have also been provided with a number of test programs and tools to help you test your shell. All included files are described below:

Use the -h flag to see the usage string for sdriver.pl:

$ ./sdriver.pl -h
Usage: ./sdriver.pl [-hv] -t <trace> -s <shellprog> -a <args>
Options:
  -h            Print this message
  -v            Be more verbose
  -t <trace>    Trace file
  -s <shell>    Shell program to test
  -a <args>     Shell arguments

For example, you could run the shell driver on trace01.txt by typing the following:

$ ./sdriver.pl -t trace01.txt -s ./bsh

Similarly, you could run the trace driver on the reference shell by simply substituting bsh with bshref in the command above.

More simply, you can use the included Makefile to run the driver on the trace files. To run trace01.txt using your own shell, you can just run:

$ make test01

To run trace01.txt through the reference shell, you can similarly run:

$ make rtest01

The other traces can be run in the same way (e.g., make test02 and make rtest02). The output of your shell from the trace files is exactly the same as the output you would get from running your shell interactively, except for an initial comment that identifies each trace and a few values that might vary from run to run (e.g., PIDs).

Output Formatting

Your shell's output formatting should exactly match that of the reference shell. For example, the command-line prompt of your shell should be the precise string "bsh> ", and your output messages should contain the same information in the same format as the reference shell. Particular messages that you should look out for (and their correct formatting) include the following:

Note that the above messages should always be printed. You are welcome to add additional output when running in verbose mode, but your verbose output does not need to match that of the reference shell. Refer to the reference shell if you are unsure about any of the exact formatting of these messages. In particular, trace 12 exercises many of the error messages.

Shell Trace Files

Each trace file consists of a series of commands to test the functionality of your shell. The trace files are understood by the sdriver.pl driver program, which launches your shell, executes each line of the trace file via your running shell process, and captures the shell's output. The format of the trace files is described below:

The external programs launched by the trace files include the provided programs myspin, mysplit, mystop, and myint, as well as the external system programs echo (i.e., /bin/echo) and ps (i.e., /bin/ps). The ps program prints process information and is discussed later. The echo program simply prints out a message specified on the command line (i.e., it "echoes" a message back to you). Starting in trace03.txt, the echo program is used simply as a way to print out the non-echo commands that the shell is about to execute prior to actually doing so. For example, consider trace03.txt, which is reproduced below:

#
# trace03.txt - Process jobs builtin command.
#
echo -e bsh> ./myspin 2 \046
./myspin 2 &

echo -e bsh> ./myspin 3 \046
./myspin 3 &

echo bsh> jobs
jobs

In this trace file, the echo program will just print out the two myspin commands and the jobs command before they actually execute. Thus, these echo commands serve to give a visual indication of what the traces are doing as they run. These echo commands launch foreground jobs like any other, but assuming you have passed trace02.txt already, foreground jobs should already be working.

Note that some of these echoes include the character sequence \046. This sequence is just a way to specify the ampersand character & as a character that echo should print (the value \046 is the ASCII code of an ampersand in base 8). Using this character sequence is necessary because if an actual ampersand were used, then the echo job would be run in the background. Since these echoes should execute in the foreground (before the following command actually runs), this character sequence is needed to actually output an ampersand.
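The octal arithmetic can be checked with a few lines of C. The helper name octal_escape_value is hypothetical; it converts up to three octal digits to the corresponding character code, so "046" becomes 4 * 8 + 6 = 38, which is the ASCII code of &.

```c
/* Converts a string of one to three octal digits to its numeric value,
 * e.g. "046" -> 38 (the ASCII code of '&').
 * Returns -1 if the string is empty, too long, or not valid octal. */
int octal_escape_value(const char *digits) {
    int value = 0;
    int count = 0;
    while (count < 3 && digits[count] >= '0' && digits[count] <= '7') {
        value = value * 8 + (digits[count] - '0');
        count++;
    }
    return (count > 0 && digits[count] == '\0') ? value : -1;
}
```

C itself uses the same notation in character literals, so the expression '\046' == '&' is true.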

Listing Process Info with ps

Traces 9, 10, and 11 use the ps program, which lists active process information (from the OS, not from your shell). The ps program will show a variety of process info (one process per line) but the column of particular interest is the STAT (state) column, which is shown when executing ps w (as in the trace files). Note that ps shows different output in a variety of formats depending on the options passed, so to avoid confusion, stick to executing ps w only. A number of different state codes may be displayed in the state column, but the only ones you really need to pay attention to are the following three:

Make sure that these process state codes match as shown in your shell and the reference shell. Note that the output of ps will show all of your processes, including processes unrelated to your shell. The only processes you should concern yourself with in the output of ps are the processes created by your shell process (in particular, the mysplit processes).

Implementation Advice

Here are some useful tips for working on your shell:

Tips for specific parts of the shell are given below.

Eval

Signal Handlers

Logistics

As usual, initialize your lab repository on GitHub via the invitation link posted to the Slack, then clone to the class server to begin working. You are responsible for completing bsh.c, but should not create or modify any other file.

If you are working in a group and have not previously done so, it is a good idea to go through Part 3 of the Git tutorial, which covers some specific topics applicable to collaboration (most significant of which is handling merge conflicts). You should also review the course policies on group work.

Your final submission will consist of your committed and pushed bsh.c file at the time of the due date. Remember to submit your individual group reports to me if you worked in a group.

Evaluation

Your shell will be graded on program correctness (as determined by the 14 trace files), design, and style. The output of your shell on the trace files should be identical to that of the reference shell, with two exceptions:

You can (and should) consult the Coding Design & Style Guide for tips on design and style issues. Please ask if you have any questions about what constitutes good program design and/or style that are not covered by the guide.