Issue #10 August 2005

CVS is out, Subversion is in

Introduction

In case no one happened to tell you, CVS is dead. Bereft of life, it rests in peace. Oh, sure, people still use it, and it is still included in most Linux distributions, including Fedora™ Core, but it is quite dead. It died after a long, drawn-out sickness after years of neglect. Sadly, it died of the incurable disease 'broken architecture.' Nothing could be done besides making its final days (well, years) as comfortable as possible. But now, finally, gone it is, and its replacement is a much younger, much healthier, much better architected, and much more capable version control system: Subversion.

In a world where you can buy hundreds of gigabytes of storage for less than a hundred dollars, is it really necessary to have a complex version control system at all? After all, you can just make copies of the file you're changing and use the diff command to look at old versions, right? Well, you can, but hopefully by the end of this article you will not only see the value of version control in general, but also understand why you absolutely, positively must be using Subversion to manage all of your own files.

Although the name might imply otherwise, Subversion is a version control system that will feel fairly comfortable to anyone with CVS experience. It is not a drastic change to a whole new paradigm of version control, nor is it an avant garde tool that revolutionizes command line version control. No, although it is neither of those, Subversion is most definitely an important version control tool, and, unless you need some of the more specialized features of other modern version control software, it is the one you should reach for by default. Again, CVS is dead.

Billed as a better CVS, Subversion is aimed at centralized, client-server version control much like CVS, Perforce, and Visual SourceSafe. It began with the intention of reaching feature parity with most of CVS (most, meaning that in the few areas where it diverges, it diverges for good reasons) while having a cleaner and more extensible codebase to act as a launching pad for more innovative features in later versions. As we shall see, the Subversion team more than delivered on this promise.

Concepts

Like CVS, Subversion has a concept of a single, central store (often residing on a dedicated server) that holds all data about the projects you are working on. This store is called a repository, and it is best thought of as the ultimate source of truth and history for your work. It knows about every change you have ever committed and can instantly take you back and forth in time to inspect those changes and build further upon them. You never work in the repository directly, though. Instead, you pull subsets of it into working copies that typically reside on other systems such as your desktop computer. In these working copies, you make your changes, and when you are pleased with them, you commit those changes into the central repository where they become once and forever part of history.

Each commit (also called a check-in) to the repository is called a revision, and, in Subversion, revisions are numbered. A commit can be a change to one file or a dozen, to directories, or to metadata (which we'll discuss shortly). The first change you make to your repository is revision 1; predictably, the second is revision 2, and so forth. In addition, we speak of HEAD when we mean the latest version of the repository; so, when you check in revision 17, then HEAD is revision 17, but when you check in revision 18, then HEAD is revision 18. Whether you change one file or a hundred files, if the changes you make are part of a single commit, then they become a single revision. In addition, suppose you are in the middle of a commit and your network switch catches on fire, your desktop is struck by lightning, or you hit Ctrl-C. In Subversion, a commit is an atomic operation, meaning it either succeeds entirely or fails entirely; unlike CVS, you can't end up with half of your files saved to the repository but the other half unchanged.
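To make the "one commit, one revision" rule concrete, here is a minimal sketch that commits two files at once and then asks the repository for its newest revision number. It uses the svnadmin and svn commands introduced in the next section; the temporary directory and the file names a.txt and b.txt are invented for this illustration.

```shell
#!/bin/sh
# Sketch: a single commit that touches two files creates exactly one revision.
# The scratch directory and file names are made up for this example.
set -e
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"
cd "$tmp/wc"

echo 'alpha' > a.txt
echo 'beta' > b.txt
svn add -q a.txt b.txt
svn commit -q -m 'add two files in a single commit'

# svnlook youngest prints the newest revision number in the repository.
youngest=$(svnlook youngest "$tmp/repo")
echo "HEAD is now revision $youngest"
```

Both files arrive together in revision 1; had the commit been interrupted, neither would have been stored.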

You can also undo a change you've made, either manually (say, deleting a line you mistakenly added to a file) or by asking Subversion something akin to 'take the change associated with revision 13117, reverse it, and apply it to my working copy.' When you commit that change, however, the revision number does not go down; to Subversion, it is just another change (even if it undid a previous one), and so the revision number simply increments. So time marches forever, signified by revision numbers, always forward, never backwards. In a way, you can think of the revision numbers like important events on a timeline; while it may be a week between revision 7 and 8, or revisions 100 through 150 may take place in a single minute, you are guaranteed revision 8 came after revision 7 and no change occurred in between. In fact, if you want to undo a change and absolutely must remove it from the repository (say, you accidentally committed a plain text file with a password in it; bad, bad, bad!), you must go to great lengths to banish such a file from the repository. So such a thing is possible, but difficult. (Just asking Subversion to remove it from the latest change in the repository isn't enough. Subversion, after all, lets you time travel, and it is relatively easy to ask for yesterday's copy of the file, even if it has been deleted today.)
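The following sketch shows why deleting a file does not erase it from history. It commits a file, deletes it in a later revision, and then reads the old copy back with a peg revision (the @1 suffix on the URL). The file name secret.txt and its contents are hypothetical, and the repository is a throwaway one in a temporary directory.

```shell
#!/bin/sh
# Sketch: a deleted file is still reachable in an older revision.
# secret.txt and its contents are invented for this example.
set -e
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"
cd "$tmp/wc"

echo 'hunter2' > secret.txt
svn add -q secret.txt
svn commit -q -m 'oops: commit a password'      # becomes revision 1

svn rm -q secret.txt
svn commit -q -m 'remove the password file'     # becomes revision 2

# The file is gone from HEAD, but the peg revision @1 still reaches the old copy.
recovered=$(svn cat "file://$tmp/repo/secret.txt@1")
echo "revision 1 still contains: $recovered"
```

This is exactly the time travel that makes a plain svn rm insufficient for secrets: revision 2 hides the file, but revision 1 keeps it.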

Not only does Subversion offer version control of files and directories, it also offers version control of metadata. In essence, metadata is data about data. In the world of Subversion, such metadata is called a property, and every file and directory can have as many properties as you wish. Changing a property, just like changing a file, requires a commit to the repository. Metadata like this is commonly used for indicating if a file is binary or text (not an easy thing to do in an automated fashion in a world of UTF-8 and other character encodings), whether it has Windows, UNIX, or old-style Mac line endings, etc. In addition, you can define your own metadata for your files to indicate, say, where a file originally came from, what kind of processing it might need, or anything else you can envision. Once you are in the mode of thinking about metadata and file properties, you begin to see a myriad of uses for them. Subversion's versioning of this metadata is especially powerful.
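As a small illustration of properties, the sketch below sets one of Subversion's built-in properties (svn:eol-style) and one user-defined property on a file, commits them, and reads them back. The property name origin, its value, and the file name notes.txt are assumptions made up for this example; Subversion does not assign any meaning to user-defined properties.

```shell
#!/bin/sh
# Sketch: attach versioned metadata (properties) to a file.
# notes.txt and the custom "origin" property are invented for this example.
set -e
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"
cd "$tmp/wc"

echo 'hello' > notes.txt
svn add -q notes.txt

# A built-in property: let Subversion use the platform's native line endings.
svn propset -q svn:eol-style native notes.txt
# A user-defined property: record where the file came from.
svn propset -q origin 'imported from the old wiki' notes.txt

# Property changes are committed just like content changes.
svn commit -q -m 'add notes.txt with properties'

eol=$(svn propget svn:eol-style notes.txt)
origin=$(svn propget origin notes.txt)
echo "svn:eol-style = $eol"
echo "origin        = $origin"
```

Running svn proplist -v notes.txt in the same working copy lists every property on the file along with its value.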

My first repository

Enough theory; let's actually take Subversion for a test drive. Unless you are accessing someone else's repository, the first thing you will want to do is create a repository. For our purposes, we will simply make one in your home directory. If you don't have Subversion installed, run yum install subversion as root.

To create the repository, execute the following command ($HOME/svnrepo will be the server location of the repository, not the location of your working copy):

svnadmin create --fs-type fsfs $HOME/svnrepo

Simple as that—no output means everything went fine. The usage is quite simple—svnadmin create PATH. We add the --fs-type fsfs in case of older versions of Subversion, but as of 1.2, fsfs is the default file system type (don't worry, this doesn't matter for typical use; suffice it to say, as we will see later, a Subversion repository is effectively just a versioned, user-land file system).

Although administration commands are performed with the svnadmin command, the majority of the time, you will simply use the svn command to manage your repository. Now that we have a repository, we need to create a working copy—the server repository directory $HOME/svnrepo is best thought of as an opaque directory that we generally won't need to manipulate. To create your working copy, check out the repository with the following command:

svn checkout file://$HOME/svnrepo $HOME/checkout

If you see the following output, the check out was successful:

Checked out revision 0.

It creates a (seemingly) empty directory called checkout in your home directory. However, if you issue the ls -la command in the checkout directory, you will see:

drwxrwxr-x    3 cturner cturner 4096 Aug  8 19:29 ./
drwxr-xr-x  125 cturner cturner 4096 Aug  8 19:29 ../
drwxrwxr-x    7 cturner cturner 4096 Aug  8 19:29 .svn/

Ah, not quite as empty as first glance might tell us. If you have used CVS, you are no doubt familiar with CVS/ directories inside of every version controlled directory. The .svn/ directory is analogous to that, though since the name begins with a period, it is hidden from ls (and, more practically, from wildcard expansion such as ls *).

Let's create a file. First, create a simple file with the echo command:

echo 'my first repository' > README

Then use the command svn status to check the status of the new file, and you will see the following output:

?      README

The svn status command, in this context, asks Subversion to tell us what it knows about the files in our working copy in comparison to what we last synced from the server. In the first invocation, it is saying it knows absolutely nothing about the file (denoted by a ? in the first column); this means README is not under version control, which is what we expect as this is an empty repository. Once we run svn add README though, the story is different, as svn status shows us:

A         README

In this case, A means the file has been added to our working copy, but not yet checked in. In general, svn status will only show us lines of output for changes in our working copy.

Let's go ahead and commit our single file:

svn commit -m 'my first file!'

Committing the file produces the following output:

Adding         README
Transmitting file data .
Committed revision 1.

Performing an svn update shows:

At revision 1.

Generally, a commit is simply svn commit. Subversion will then pop up your editor of choice (as defined by the EDITOR environment variable) for you to describe your check-in. Here you generally leave a message for posterity, describing the change, why it was needed, and perhaps even referencing a bug tracking number. For the sake of an easily read article, though, we include that message on the command line via the -m option.

Notice that we performed an svn update after our commit. This is necessary for the next step. Generally speaking, even though our commit created revision 1, our working copy was last synced at revision 0. This means we need to ask the server for any changes since our checkout (or the last time we synced our working copy). We do this with a simple svn update command.

Let's view our history with the svn log command:

------------------------------------------------------------------------
r1 | cturner | 2005-08-08 19:55:34 -0700 (Mon, 08 Aug 2005) | 1 line

my first file!
------------------------------------------------------------------------

There it is, our change along with our check-in message. To see what files were changed, though, we add two options:

svn log -v -r 1

which gives the output:

------------------------------------------------------------------------
r1 | cturner | 2005-08-08 19:55:34 -0700 (Mon, 08 Aug 2005) | 1 line
Changed paths:
   A /README

my first file!
------------------------------------------------------------------------

The -v option tells Subversion to be verbose, which, in the case of svn log, means to list the files changed (the leading / in /README indicates our change was at the root of our repository). The -r 1 parameter tells Subversion to give us just the changes for revision 1, not all changes like svn log defaults to. Generally you want to combine -r # with -v so you don't end up with page after page of changes scrolling by. Likewise, you can do svn log -v -r HEAD instead of the numeric revision to see the latest change.

Getting fancy

The above is enough to create files, edit files, and generally be productive at a basic level, but Subversion offers much more. First and foremost, Subversion will version control directories. This means, unlike CVS, adding and removing directories are part of the repository history:

[gandalf@moria checkout]$ mkdir src
[gandalf@moria checkout]$ echo 'first file' > src/file1.txt
[gandalf@moria checkout]$ echo 'second file' > src/file2.txt
[gandalf@moria checkout]$ svn status
?      src
[gandalf@moria checkout]$ svn add src/
A         src
A         src/file1.txt
A         src/file2.txt
[gandalf@moria checkout]$ svn status
A      src
A      src/file2.txt
A      src/file1.txt
[gandalf@moria checkout]$ svn commit -m 'add some source files'
Adding         src
Adding         src/file1.txt
Adding         src/file2.txt
Transmitting file data ..
Committed revision 2.
[gandalf@moria checkout]$ svn update
At revision 2.
[gandalf@moria checkout]$ svn log -r 2 -v
------------------------------------------------------------------------
r2 | cturner | 2005-08-08 20:09:15 -0700 (Mon, 08 Aug 2005) | 1 line
Changed paths:
   A /src
   A /src/file1.txt
   A /src/file2.txt

add some source files
------------------------------------------------------------------------

As simple as that, we've made a directory, added it to our working copy, and committed it. Now let's change a file that we already have created (which, generally, is a more common operation; after all, files are only created once, but edited many times). After changing the contents of the file src/file1.txt, svn stat shows us that it has been modified:

M      src/file1.txt

To commit it:

svn commit -m 'replace file1 with new content'

which produces the output:

Sending        src/file1.txt
Transmitting file data .
Committed revision 3.

and svn up produces:

At revision 3.

Note that this time we have shortened svn status to simply svn stat and svn update to just svn up. svn offers a number of abbreviations, which are visible via svn help, which will list all of the commands svn supports as well as abbreviations in parentheses after each command.

One thing that may differ from other version control systems you've used is that you did not have to explicitly check a file out for editing or otherwise mark it as being modified—you just edit the file. Also notice that, this time, svn stat showed us the M state. This means the file has been locally modified. Let's explore this change further, though. Subversion not only lets you see the reasoning behind each change and the list of changed files, but it also lets you see the actual change with the svn diff command. In our case, we wish to see the changes that occurred in going from revision 2 to revision 3:

svn diff -r 2:3

which produces:

Index: src/file1.txt
===================================================================
--- src/file1.txt       (revision 2)
+++ src/file1.txt       (revision 3)
@@ -1 +1 @@
-first file
+this is the new file1

The output is a unified diff of the files that have changed between revisions 2 and 3; in our case, only one file changed (src/file1.txt) and the change replaced the one and only line in the file. If we omitted the :3 and just executed svn diff -r 2, then svn would perform the diff between revision 2 and whatever revision we had most recently synced in our working copy. We can also view more changes at once if we wish—we just execute svn diff -r M:N where M is less than N. The result, again, is a diff, this time representing all changes between revision M and N. When you are editing your working copy, svn diff (without the -r parameter) will show a diff between your working copy and the version of the repository you last synced to (note, this isn't against the latest version in the repository—for that, just svn up and svn diff again).
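The difference between a plain svn diff and svn diff -r M:N can be seen in a short sketch. It commits one line, appends a second line without committing, and then captures both kinds of diff. The file name notes.txt and the scratch repository are assumptions for this illustration only.

```shell
#!/bin/sh
# Sketch: svn diff with and without -r.
# notes.txt and the temporary repository are invented for this example.
set -e
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"
cd "$tmp/wc"

echo 'first line' > notes.txt
svn add -q notes.txt
svn commit -q -m 'add notes.txt'
svn update -q

# Edit the file but do not commit the edit.
echo 'second line' >> notes.txt

# Without -r: compare the working copy against the last synced revision.
local_diff=$(svn diff)
# With -r 0:1: compare two committed revisions; the local edit is ignored.
committed_diff=$(svn diff -r 0:1)

printf '%s\n' '--- uncommitted edit ---' "$local_diff"
printf '%s\n' '--- revision 0 to 1 ---' "$committed_diff"
```

The first diff contains only the appended line, while the second contains only the line that was committed in revision 1.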

Let's explore our first-ever change with this new tool and see how it looks. svn diff -r 0:1 produces:

Index: README
===================================================================
--- README      (revision 0)
+++ README      (revision 1)
@@ -0,0 +1 @@
+my first repository

This says 'give us the change between revision 0 and 1' which is simply us adding the README file. One limitation of this view of a diff is that it isn't obvious if the file was present before and empty, or if it never existed—the diff simply looks like it added a line to the file. However, svn log shows us the truth.

Suppose we decide, though, that our original README should be named README.txt. If we were using CVS, we would be forced to delete README and create a new file, README.txt, from the previous file's contents, which loses the history of the file. In Subversion, we have full control. The command svn mv README README.txt produces:

A         README.txt
D         README

And, svn stat produces:

A  +   README.txt
D      README

There are two important things here. First, to Subversion, a rename looks almost like an addition (represented by the A change for README.txt) and a deletion (represented by the D change for README). The only difference is the + next to the A which, in this case, makes all the difference. When we do an svn mv or an svn cp, Subversion will actually copy the history and metadata of the file with it.

Also worth noticing is that we did not run a bare mv on the file ourselves. Subversion changed our working copy for us. Likewise, when we use svn cp, Subversion will copy the file for us (preserving history and metadata) so that we don't have to. Committing with the command svn commit -m 'rename README -> README.txt' produces:

Deleting       README
Adding         README.txt

Committed revision 4.

svn up produces:

At revision 4.

And, svn diff -r 3:4 produces:

Index: README
===================================================================
--- README      (revision 3)
+++ README      (revision 4)
@@ -1 +0,0 @@
-my first repository
Index: README.txt
===================================================================
--- README.txt  (revision 0)
+++ README.txt  (revision 4)
@@ -0,0 +1 @@
+my first repository

This is somewhat troubling, though. Notice that according to the message with commit and the diff, it looks like we just completely removed the README file and added a new file called README.txt. svn log -v -r 4 however shows us something different:

------------------------------------------------------------------------
r4 | cturner | 2005-08-08 20:27:02 -0700 (Mon, 08 Aug 2005) | 1 line
Changed paths:
   D /README
   A /README.txt (from /README:3)

rename README -> README.txt
------------------------------------------------------------------------

Notice the (from /README:3) next to the A line. This means Subversion copied the history and metadata of the file, basing the new file on the old. We can also see this with a variant svn log README.txt that shows us the sordid history of a single file:

------------------------------------------------------------------------
r4 | cturner | 2005-08-08 20:27:02 -0700 (Mon, 08 Aug 2005) | 1 line

rename README -> README.txt
------------------------------------------------------------------------
r1 | cturner | 2005-08-08 19:55:34 -0700 (Mon, 08 Aug 2005) | 1 line

my first file!
------------------------------------------------------------------------

Notice that although there was no file called README.txt in revision 1 (r1), log shows it to us as part of the history for README.txt.

This is an example of an important concept to remember. Sometimes, a change is not easily represented for human consumption. Often, we are used to looking at changes in terms of diffs of files. Some changes, though, such as renames or metadata changes do not represent themselves well as diffs. So even though in some ways it looks like Subversion lost the fact that README.txt was once README, this is actually just an artifact of how we are looking at the changes. Rest assured, Subversion is doing the right thing internally.
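The same history-following behavior applies to svn cp. As a rough sketch, the script below copies a file, commits the copy, and then counts how many revisions svn log reports for the copy. The names orig.txt and copy.txt are hypothetical, and the repository lives in a temporary directory.

```shell
#!/bin/sh
# Sketch: a copy made with svn cp carries the history of its source.
# orig.txt and copy.txt are invented for this example.
set -e
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"
cd "$tmp/wc"

echo 'original text' > orig.txt
svn add -q orig.txt
svn commit -q -m 'add orig.txt'                 # becomes revision 1

svn cp -q orig.txt copy.txt
svn commit -q -m 'copy orig.txt to copy.txt'    # becomes revision 2

# -q limits the log to one header line per revision.
history=$(svn log -q copy.txt)
revisions=$(printf '%s\n' "$history" | grep -c '^r[0-9]')
printf '%s\n' "$history"
echo "copy.txt has $revisions revisions in its history"
```

Even though copy.txt did not exist until revision 2, its log reaches back through the copy to revision 1, when orig.txt was first added.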

Let's take renames a bit further, well beyond anything CVS might let us do—let's rename a directory! Using the command svn mv src text-files produces:

A         text-files
D         src/file2.txt
D         src/file1.txt
D         src

which gives the following output for svn stat:

A  +   text-files
D      src
D      src/file2.txt
D      src/file1.txt

Now, we have to commit the directory name change:

svn commit -m 'rename src to text-files'

which produces:

Deleting       src
Adding         text-files

Committed revision 5.

Issuing svn up produces:

At revision 5.

There is one major difference this time: even though we performed svn mv on the directory, the old src/ directory remained in our working copy until the commit took place. This is simply Subversion's record keeping (even though src/ is empty of our files, it still has the .svn/ directory) and not actually a problem.

Now let's make a change, but abort before we commit. Suppose in a moment of anger, we execute the svn rm * command.

Oh no! Our working copy is empty! Remember, though, this is just a working copy; until we perform an svn commit, nothing has changed in the server (though as we will soon see, even if it had, we could undo it). We have two options. One is to blow away our working copy and start anew with a fresh checkout. This works, but there is a more elegant option for a more civilized system such as Subversion:

svn revert -R .

produces:

Reverted 'text-files'
Reverted 'text-files/file2.txt'
Reverted 'text-files/file1.txt'
Reverted 'README.txt'

And now svn up shows us:

At revision 5.

Voila! Not only are our files back, but as you can see from svn up, we didn't change the repository (which is still at revision 5 from our previous change).

Alas! svn revert only works when you have yet to check in a change. If you realize a mistake after a commit, you must do something else. In our case, let us suppose we did make such a mistake: we should never have renamed README to README.txt. We need to undo that change. We have two options. One is we simply svn mv README.txt README and commit. That will work fine, and Subversion will DTRT (do the right thing) and preserve history and metadata. But suppose our change were one over hundreds of files in dozens of directories; that could be tedious to fix by hand. Fortunately, unlike in real life, with Subversion we can easily undo our past sins. First, we find the change we wish to undo with svn log:

------------------------------------------------------------------------
r5 | cturner | 2005-08-08 20:32:29 -0700 (Mon, 08 Aug 2005) | 1 line

rename src to text-files
------------------------------------------------------------------------
r4 | cturner | 2005-08-08 20:27:02 -0700 (Mon, 08 Aug 2005) | 1 line

rename README -> README.txt
------------------------------------------------------------------------
r3 | cturner | 2005-08-08 20:12:39 -0700 (Mon, 08 Aug 2005) | 1 line

replace file1 with new content
------------------------------------------------------------------------
r2 | cturner | 2005-08-08 20:09:15 -0700 (Mon, 08 Aug 2005) | 1 line

add some source files
------------------------------------------------------------------------
r1 | cturner | 2005-08-08 19:55:34 -0700 (Mon, 08 Aug 2005) | 1 line

my first file!
------------------------------------------------------------------------

Ah there it is. The change from revision 3 to revision 4. Now we use the svn merge -r 4:3 command to merge in the change we wish to undo:

D    README.txt
A    README

svn stat shows that the merge is set to take place:

D      README.txt
A  +   README

The last step is to commit the change with svn commit -m 'undo change 3:4' which produces:

Adding         README
Deleting       README.txt

To confirm, svn up shows:

At revision 6.

A few interesting points—first, we specified 4:3, not 3:4. This actually makes sense as it is the change from revision 3 to revision 4 we wish to undo, so we specify them in reverse order. We can also specify 'backwards' revisions like this when viewing diffs, should we find the need. Second, the change looks identical to what we would see with just performing an svn mv. Although internally Subversion is being smart about the file's metadata and contents, in actuality reverting this particular change is simply an svn mv.
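The reverse-merge technique works for content changes as well as renames. Here is a minimal sketch, assuming a throwaway repository and a hypothetical file.txt: it commits a good version, commits a bad one, and then undoes the bad commit with svn merge -r 2:1. The trailing . names the current directory as the merge source; it is spelled out here because some Subversion versions require an explicit source.

```shell
#!/bin/sh
# Sketch: undo an already committed change with a reverse merge.
# file.txt and its contents are invented for this example.
set -e
tmp=$(mktemp -d)
svnadmin create "$tmp/repo"
svn checkout -q "file://$tmp/repo" "$tmp/wc"
cd "$tmp/wc"

echo 'good content' > file.txt
svn add -q file.txt
svn commit -q -m 'add file.txt'                 # becomes revision 1

echo 'bad content' > file.txt
svn commit -q -m 'a change we will regret'      # becomes revision 2

# Bring the whole working copy to revision 2 before merging.
svn update -q

# Apply the change from revision 1 to 2 in reverse.
svn merge -q -r 2:1 .
restored=$(cat file.txt)

# The undo is itself a new change, so the revision number still goes up.
svn commit -q -m 'undo revision 2'
youngest=$(svnlook youngest "$tmp/repo")
echo "file.txt now reads: $restored (repository at revision $youngest)"
```

Note that the repository ends at revision 3, not revision 1: the bad change remains in history, and the undo is recorded on top of it.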

Conclusion

Hopefully our whirlwind tour of Subversion has left you with an understanding of the power of version control in general and of Subversion in particular. If you are a CVS user, you hopefully noticed two key things. One, that the command line usage of svn is very similar to cvs. Two, that you can do far more with Subversion than with CVS and you can work more reliably with clearer behavior and more predictable results.

There is far, far more that Subversion has to offer, however. This is but a quick glance. Fortunately, the resources available online are of very high quality. In particular, there is an entire book freely available online at http://svnbook.red-bean.com/.

If you find the book useful, don't hesitate to order the print copy (published by O'Reilly, no less); it is an indispensable resource both as a tutorial and introduction and as a reference.

About the author

Chip Turner is a Site Reliability Engineer at Google, Inc. Before that, he spent four years working at Red Hat on the Red Hat Network, perl, and several perl packages for Fedora Core and Red Hat® Enterprise Linux®. He also maintains a number of CPAN modules, contributes to other open source projects, and generally abuses Linux personally and professionally on a daily basis. In his spare time he enjoys playing with his dog and arguing for no apparent reason.