Using a Version Control System (SVN version)

Note: this document is historical, not being maintained, and the server referenced is no longer operational. This tutorial has been replaced by the Git Essential Training companion guide.

An important tool in any programmer's toolchest is a good version control system (VCS). A VCS is used to track changes to documents and manage multiple users' access to the same files. Here at Bowdoin, you should mostly care about version control in the context of working on a program in a group. Conventional ways of working in a group include (1) having one person do all the coding, (2) emailing or otherwise sending the updated program files between group members every time someone makes changes, or (3) always sitting in front of the same computer together when working. All of these options are either inefficient or limiting, and using a VCS is a much more flexible approach.

As with text editors, there are several widely-used VCSes to choose from. The two most common options are Git and Subversion (aka svn). Git is newer, fancier, and more popular, but also more complex and somewhat prone to arcane incantations. This tutorial will use the simpler Subversion, but the basic concepts of version control (revisions, updating, committing, etc.) are similar across systems.

The basic setup of Subversion is shown in the figure below. A single Subversion server (which is generally a computer located elsewhere) holds the "master" copies of the files that we wish to collaborate on. Users (or "clients") never actually modify the master files directly -- instead, they maintain a local copy of all the files that they can edit at will. Periodically, clients send (or "commit") their updated files to the server, which tells the server to update the master copy of the files accordingly. Clients will also periodically ask the server to send them all file updates that have been sent to the server in the meantime (this is called "updating"), and will therefore stay in sync with other clients using the same files. The important point here is that at any given time, there can be multiple copies of the same files -- the "authoritative" copies are located on the server, and the "working" copies are located on the client machines. The clients share their edits with each other via the server through updates and commits. As long as clients are regularly committing their changes and downloading others' changes via updates, the clients' local copies will remain mostly in sync.

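[Figure: Subversion setup, with a central server holding the master copies and each client holding a working copy]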

For example, suppose Client 1 and Client 2 both have local copies of a file, foo.txt, that is being shared by the Subversion server. Client 1 edits its local copy of foo.txt. At this point, Client 1 has a different version of the file from both the server and Client 2. Once finished, Client 1 commits its changes to the repository. At this point, Client 1 and the server both have the new version of the file, but Client 2 still has the old version. Some time later, Client 2 issues an update request to the server. The server notices that the master copy of foo.txt is more recent than Client 2's local copy, so the new version (which was changed by Client 1) is sent to Client 2. Both clients are now again in sync with each other, despite never working on each other's files directly.
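The foo.txt sequence above can be sketched as a toy in-memory model. This is only an illustration of the commit/update flow, not how Subversion is actually implemented; the Server and Client classes, their methods, and the file contents are hypothetical names made up for this sketch.

```python
class Server:
    """Holds the master copy of each file (a toy stand-in for the Subversion server)."""

    def __init__(self, files):
        self.files = dict(files)   # filename -> contents
        self.revision = 0          # bumped on every commit


class Client:
    """Holds a working copy that can be edited freely."""

    def __init__(self, server):
        self.server = server
        self.files = dict(server.files)    # plays the role of the initial checkout
        self.revision = server.revision

    def edit(self, name, contents):
        # A local edit: the server and the other clients are unaffected.
        self.files[name] = contents

    def commit(self):
        # Send the working copy to the server, which becomes the new master copy.
        self.server.files = dict(self.files)
        self.server.revision += 1
        self.revision = self.server.revision

    def update(self):
        # Pull whatever is currently on the server into the working copy.
        self.files = dict(self.server.files)
        self.revision = self.server.revision


server = Server({"foo.txt": "original text"})
client1 = Client(server)
client2 = Client(server)

client1.edit("foo.txt", "text edited by Client 1")
# Only Client 1 has the new version so far.
print(server.files["foo.txt"], "|", client2.files["foo.txt"])

client1.commit()
# The server now matches Client 1; Client 2 still has the old version.
print(server.files["foo.txt"], "|", client2.files["foo.txt"])

client2.update()
# Everyone is back in sync.
print(server.files["foo.txt"], "|", client2.files["foo.txt"])
```

Note that this sketch simply overwrites files on update and commit; a real VCS also has to handle the case where both clients have edited the same file.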

As an aside, the most significant difference between Subversion and Git is that while Subversion is centralized (i.e., reliant on the central server), Git is distributed and does not require a connection to a central server -- i.e., the entire repository exists on client machines. There are other important differences as well (including different command syntax), but we won't cover them here (many Git tutorials can be easily found via Google).

Let's try out Subversion using Bowdoin's Subversion server. First, cd back to your home directory. Now, we want to create a local copy of the remote file repository that we can view and edit. This is called checking out the repository, and is accomplished using the command svn checkout, as follows:

svn checkout https://repo.bowdoin.edu/svn/sbarker/public my-checkout-dir

This command says to check out the repository located at the above URL into a directory named my-checkout-dir. Enter the URL exactly as shown above (using sbarker in the name) - this is part of my own repository that is publicly accessible for this exercise. When prompted, enter your Bowdoin credentials (probably answer 'no' if you get a warning about storing your password unencrypted), and you should be able to check out the directory. Doing so will create a directory called my-checkout-dir in your current working directory (if you leave off the directory name, Subversion instead names the new directory after the lowest-level directory of the URL, which here would be public). This new directory is your local copy of the file repository.

Important: Checking out a repository is normally a one-time action that you do when first getting set up. While updates and commits happen often while working, you should not normally need to use the checkout command more than once (unless you are starting to work with a new repository).

Cd into your new checked-out repository. At any time, you can use the svn status command to show whatever changes you have made to your local copy of the repository that you haven't yet committed back to the server. Try running that now -- since you just checked out the repository, it should not show you anything.

Now let's try adding a new file to the repository. Create a text file and store some text in it (remember that anything you put here will be visible to everyone else checking out the same repository!). Run svn status again and it should show that there is a file in your local copy that Subversion isn't tracking yet (indicated by the question mark). First let's mark that file to tell Subversion that we want to add it to the repository:

svn add [filename]

After adding the file, if we run svn status again, the output shows that we have marked the file for addition to the repository. However, we still haven't actually committed our changes to the repository. For that, we need to run the svn commit command.

svn commit -m "added a new file"

This command says to actually push our local changes out to the remote repository. The -m flag is used to pass a 'commit message' (basically like a comment describing the commit). You can use any descriptive text for the commit message. If you don't specify a commit message, svn will dump you into an editor window, which unfortunately (for beginners) is vim by default. If this happens, to quit vim, type :wq and press enter.

In general, before you ever commit, you should update (i.e., pull down other clients' changes to the repository) by calling svn update:

svn update

Try updating now. If anyone else has committed any changes to the repository between the time you checked out the repository (or your most recent svn update) and now, those changes will be downloaded and applied to your local copy. A good rule of thumb is to update frequently, as it minimizes the chance that you will accidentally try to apply conflicting changes (in which case Subversion will require you to make decisions about how to resolve the conflict -- i.e., whose changes should stick).
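One way to see why updating before committing matters is a toy model of a server that refuses commits made against a stale copy of a file. This is a simplified sketch, not Subversion's actual logic: it tracks a single file with a single revision number, and the Server class and OutOfDateError exception are hypothetical names invented here.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on an older revision than the server's."""


class Server:
    def __init__(self, contents):
        self.contents = contents
        self.revision = 0

    def commit(self, base_revision, new_contents):
        # Refuse changes that were made against a stale copy of the file.
        if base_revision != self.revision:
            raise OutOfDateError(
                f"working copy is at r{base_revision}, server is at r{self.revision}"
            )
        self.contents = new_contents
        self.revision += 1
        return self.revision


server = Server("line one")

# Both clients start from revision 0.
base1 = base2 = server.revision

# Client 1 commits first and succeeds.
base1 = server.commit(base1, "line one, edited by Client 1")

# Client 2 edited the same file without updating, so its commit is refused.
try:
    server.commit(base2, "line one, edited by Client 2")
    rejected = False
except OutOfDateError as err:
    rejected = True
    print("commit refused:", err)

# After updating (adopting the server's revision), Client 2 can reconcile its
# edit with Client 1's and commit the combined result.
base2 = server.revision
base2 = server.commit(base2, "line one, edited by Client 1 and Client 2")
print("now at revision", server.revision)
```

The less time that passes between your last update and your commit, the less likely it is that someone else's commit lands in between and forces this kind of reconciliation.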

Now that you've added a new file, let's try changing an existing file. First, do another status to check that you have no outstanding changes. Since you just committed your new file, you should no longer have any changes and the status should output nothing. Now, let's edit the file you just added. Open it up in an editor and change its contents. Now do another svn status and you should see that you have pending modifications to your file. Do another commit to push these changes out to the repository.
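The status codes seen so far (? for an untracked file, A for a file marked for addition, M for a modified file) can be summarized with a small sketch. This is a toy model of the idea, not Subversion's implementation; the status function, its parameters, and the file names are hypothetical.

```python
def status(base, scheduled_adds, working):
    """Return {filename: code} for files that differ from the last update or commit.

    base           -- filename -> contents as of the last update or commit
    scheduled_adds -- filenames that have been marked for addition
    working        -- filename -> contents currently on disk
    """
    report = {}
    for name, contents in working.items():
        if name in scheduled_adds:
            report[name] = "A"      # scheduled for addition
        elif name not in base:
            report[name] = "?"      # not under version control
        elif contents != base[name]:
            report[name] = "M"      # locally modified
        # Unchanged files are omitted, which is why a clean copy prints nothing.
    return report


base = {"README": "hello"}
working = dict(base)
adds = set()
fresh = status(base, adds, working)          # right after checkout

working["notes.txt"] = "first draft"
untracked = status(base, adds, working)      # new file, not yet added

adds.add("notes.txt")
added = status(base, adds, working)          # after marking it for addition

base = dict(working)                         # a commit makes this the new baseline
adds.clear()
committed = status(base, adds, working)

working["notes.txt"] = "second draft"
modified = status(base, adds, working)       # after editing the committed file

print(fresh, untracked, added, committed, modified)
```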

Subversion (and other VCSes) have many more capabilities, such as undoing recent changes and rolling back your files to their state at the time of earlier commits, but the essential components are checking out (one-time operation), updating, and committing.

Note: When you're deleting or moving files that are part of a Subversion repository, don't use the regular rm or mv utilities -- this will confuse Subversion when it sees that the files it was expecting aren't there. Instead, use the Subversion commands svn rm and svn mv, exactly as you would with the regular Unix commands. As with every other type of change you make to a Subversion repository, you'll need to do a commit before these changes are reflected in the master copy.

Finally, one more reason to use Subversion or another VCS is backups! Since multiple copies of your files exist on different machines when you use a VCS, you are much less likely to lose data due to hardware failure, accidental deletion, etc. I keep all of my important files in a Subversion repository, even those on which I am the only user.