

RCS Format

You do not need to know any of the RCS format to use CVS (although there is an excellent writeup included with the source distribution, see doc/RCSFILES). However, a basic understanding of the format can be of immense help in troubleshooting CVS problems, so we'll take a brief peek into one of the files, hello.c,v. Here are its contents:

     head     1.1;
     branch   1.1.1;
     access   ;
     symbols  start:1.1.1.1 jrandom:1.1.1;
     locks    ; strict;
     comment  @ * @;
     
     1.1
     date     99.06.20.17.47.26;  author jrandom;  state Exp;
     branches 1.1.1.1;
     next;
     
     1.1.1.1
     date     99.06.20.17.47.26;  author jrandom;  state Exp;
     branches ;
     next;
     
     desc
     @@
     
     1.1
     log
     @Initial revision
     @
     text
     @#include <stdio.h>
     
     void
     main ()
     {
       printf ("Hello, world!\n");
     }
     @
     
     1.1.1.1
     log
     @initial import into CVS
     @
     text
     @@

Whew! Most of that you can ignore; don't worry about the relationship between 1.1 and 1.1.1.1, for example, or the implied 1.1.1 branch – they aren't really significant from a user's or even an administrator's point of view. What you should try to grok is the overall format. At the top is a collection of header fields:

     head     1.1;
     branch   1.1.1;
     access   ;
     symbols  start:1.1.1.1 jrandom:1.1.1;
     locks    ; strict;
     comment  @ * @;

Farther down in the file are groups of meta-information about each revision (but still not showing the contents of that revision), such as:

     1.1
     date     99.06.20.17.47.26;  author jrandom;  state Exp;
     branches 1.1.1.1;
     next     ;
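The date field in these groups is a dotted timestamp. Here is a minimal sketch of decoding it, assuming the usual RCS convention: year, month, day, hour, minute, second, recorded in UTC, with a two-digit year meaning 19xx. The function name parse_rcs_date is made up for illustration; it is not part of CVS.

```python
from datetime import datetime, timezone


def parse_rcs_date(field):
    """Turn an RCS date such as '99.06.20.17.47.26' into a datetime.

    Assumption: two-digit years are 19xx and the time is UTC.
    """
    parts = field.split(".")
    year = int(parts[0])
    if len(parts[0]) == 2:
        year += 1900
    month, day, hour, minute, second = (int(p) for p in parts[1:])
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)


print(parse_rcs_date("99.06.20.17.47.26").isoformat())
```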

And finally, the log message and text of an actual revision:

     1.1
     log
     @Initial revision
     @
     text
     @#include <stdio.h>
     
     void
     main ()
     {
       printf ("Hello, world!\n");
     }
     @
     
     1.1.1.1
     log
     @initial import into CVS
     @
     text
     @@

If you look closely, you'll see that the first revision's contents are stored under the heading 1.1, but that the log message there is "Initial revision", whereas the log message we actually used at import time was "initial import into CVS", which appears farther down, under Revision 1.1.1.1. You don't need to worry about this discrepancy right now. It happens because imports are a special circumstance: In order to make repeated imports into the same project have a useful effect, import actually places the initial revision on both the main trunk and on a special branch (the reasons for this will become clearer when we look at vendor branches in Advanced CVS). For now, you can treat 1.1 and 1.1.1.1 as the same thing.

The file becomes even more revealing after we commit the first modification to hello.c:

     floss$ cvs -Q co myproj
     floss$ cd myproj
     floss$ emacs hello.c
         (make some changes to the file)
     
     floss$ cvs ci -m "print goodbye too"
     cvs commit: Examining .
     cvs commit: Examining a-subdir
     cvs commit: Examining a-subdir/subsubdir
     cvs commit: Examining b-subdir
     Checking in hello.c;
     /usr/local/newrepos/myproj/hello.c,v  <--  hello.c
     new revision: 1.2; previous revision: 1.1
     done

If you look at hello.c,v in the repository now, you can see the effect of the commit:

     head  1.2;
     access;
     symbols
           start:1.1.1.1 jrandom:1.1.1;
     locks; strict;
     comment   @ * @;
     
     1.2
     date   99.06.21.01.49.40;   author jrandom;   state Exp;
     branches;
     next   1.1;
     
     1.1
     date   99.06.20.17.47.26;   author jrandom;   state Exp;
     branches
            1.1.1.1;
     next   ;
     
     1.1.1.1
     date   99.06.20.17.47.26;   author jrandom;   state Exp;
     branches;
     next   ;
     
     desc
     @@
     
     1.2
     log
     @print goodbye too
     @
     text
     @#include <stdio.h>
     
     void
     main ()
     {
       printf ("Hello, world!\n");
       printf ("Goodbye, world!\n");
     }
     @
     
     1.1
     log
     @Initial revision
     @
     text
     @d7 1
     @
     
     1.1.1.1
     log
     @initial import into CVS
     @
     text
     @@

Now the full contents of Revision 1.2 are stored in the file, and the text for Revision 1.1 has been replaced with the cryptic formula:

     d7 1

The d7 1 is a diff code that means "starting at line 7, delete 1 line". In other words, to derive Revision 1.1, delete line 7 from Revision 1.2! Try working through it yourself. You'll see that it does indeed produce Revision 1.1 – it simply does away with the line we added to the file.
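If you would rather let a program do the working-through, here is a minimal sketch in Python. It assumes the edit script uses only two kinds of command: dN M ("delete M lines starting at line N") and aN M ("add the M lines that follow after line N"), with line numbers always referring to the text being patched. The name apply_rcs_delta is invented for this example and is not something CVS provides.

```python
def apply_rcs_delta(lines, delta):
    """Apply an RCS-style edit script to a list of lines (no newlines)."""
    result = []
    pos = 0  # index in `lines` of the next line not yet copied or skipped
    i = 0
    while i < len(delta):
        command = delta[i]
        i += 1
        start, count = (int(n) for n in command[1:].split())
        if command[0] == "d":
            # Copy everything before line `start`, then skip `count` lines.
            result.extend(lines[pos:start - 1])
            pos = start - 1 + count
        elif command[0] == "a":
            # Copy through line `start`, then take `count` lines from the script.
            result.extend(lines[pos:start])
            pos = max(pos, start)
            result.extend(delta[i:i + count])
            i += count
        else:
            raise ValueError("unknown edit command: %r" % command)
    result.extend(lines[pos:])
    return result


# Revision 1.2 of hello.c, as stored in full in hello.c,v.
REV_1_2 = r'''#include <stdio.h>

void
main ()
{
  printf ("Hello, world!\n");
  printf ("Goodbye, world!\n");
}'''.splitlines()

rev_1_1 = apply_rcs_delta(REV_1_2, ["d7 1"])
print("\n".join(rev_1_1))
```

Running it prints the file without the "Goodbye, world!" line, which is Revision 1.1.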

This demonstrates the basic principle of RCS format: it stores the most recent revision on the trunk in full and every earlier revision only as the difference from its successor, thereby saving a lot of space compared with storing each revision in full. To go backwards from the most recent revision to the previous one, CVS patches the later revision using the stored diff. Of course, this means that the further back you travel in the revision history, the more patch operations must be performed (for example, if the file is on Revision 1.7 and CVS is asked to retrieve Revision 1.4, it has to produce 1.6 by patching backwards from 1.7, then 1.5 by patching 1.6, then 1.4 by patching 1.5). Fortunately, old revisions are also the ones least often retrieved, so the scheme works out pretty well in practice: the more recent the revision, the cheaper it is to obtain.
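The cost of reaching an old revision can be sketched in a few lines of Python. This assumes the simplest case only: a file whose history lives entirely on the trunk, with two-part revision numbers and no branches. The function trunk_patch_chain is a hypothetical helper, not a CVS interface.

```python
def trunk_patch_chain(head, target):
    """List the trunk revisions rebuilt, in order, to get from head to target."""
    head_major, head_minor = (int(p) for p in head.split("."))
    target_major, target_minor = (int(p) for p in target.split("."))
    if head_major != target_major or target_minor > head_minor:
        raise ValueError("target must be an earlier revision on the same trunk")
    # One backwards patch per step: head -> head-1 -> ... -> target.
    return ["%d.%d" % (head_major, n)
            for n in range(head_minor - 1, target_minor - 1, -1)]


chain = trunk_patch_chain("1.7", "1.4")
print(chain)       # ['1.6', '1.5', '1.4']
print(len(chain))  # 3
```

Asking for the head revision itself yields an empty list: no patching at all, which is why recent revisions are the cheap ones.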

As for the header information at the top of the file, you don't need to know what all of it means. However, the effects of certain operations show up very clearly in the headers, and a passing familiarity with them may prove useful.

When you commit a new revision on the trunk, the head label is updated (note how it became 1.2 in the preceding example, when the second revision to hello.c was committed). When you add a file as binary or tag it, those operations are recorded in the headers as well. As an example, we'll add foo.jpg as a binary file and then tag it a couple of times:

     floss$ cvs add -kb foo.jpg
     cvs add: scheduling file 'foo.jpg' for addition
     cvs add: use 'cvs commit' to add this file permanently
     floss$ cvs -q commit -m "added a random image; ask jrandom@red-bean.com why"
     RCS file: /usr/local/newrepos/myproj/foo.jpg,v
     done
     Checking in foo.jpg;
     /usr/local/newrepos/myproj/foo.jpg,v  <--  foo.jpg
     initial revision: 1.1
     done
     floss$ cvs tag some_random_tag foo.jpg
     T foo.jpg
     floss$ cvs tag ANOTHER-TAG foo.jpg
     T foo.jpg
     floss$

Now examine the header section of foo.jpg,v in the repository:

     head   1.1;
     access;
     symbols
           ANOTHER-TAG:1.1
           some_random_tag:1.1;
     locks; strict;
     comment   @# @;
     expand	@b@;

Notice the b in the expand line at the end – it's there because we used the -kb flag when adding the file, and it means the file won't undergo the keyword expansion or line-ending conversion that would normally occur during checkouts and updates if it were a regular text file. The tags appear in the symbols section, one tag per line – both of them are attached to the first revision, since that's what was tagged both times. (This also helps explain why tag names can only contain letters, numbers, hyphens, and underscores. If a tag contained colons or dots, the RCS file's record of it might be ambiguous, because there would be no way to find the textual boundary between the tag and the revision to which it is attached.)
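To see why the restriction on tag names matters, consider how a program might read the symbols section. The following Python sketch assumes it is handed the text between the word symbols and the closing semicolon; parse_symbols is an illustrative name, not part of CVS.

```python
def parse_symbols(field):
    """Map each tag name in a symbols field to its revision number."""
    symbols = {}
    for entry in field.strip().rstrip(";").split():
        # The first colon is the boundary. This is unambiguous only
        # because a tag name can never contain a colon itself.
        tag, _, revision = entry.partition(":")
        symbols[tag] = revision
    return symbols


field = """
      ANOTHER-TAG:1.1
      some_random_tag:1.1;
"""
print(parse_symbols(field))
```

Each whitespace-separated entry splits cleanly at its one colon. If tags were allowed to contain colons, the partition step would have no reliable way to tell where the name ends.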

RCS Format Always Quotes @ Signs

The @ symbol is used as a field delimiter in RCS files, which means that if one appears in the text of a file or in a log message, it must be quoted (otherwise, CVS would incorrectly interpret it as marking the end of that field). It is quoted by doubling – that is, inside a field CVS always interprets @@ as "literal @ sign", never as "end of current field". (An @@ standing on its own, like the one under desc above, is simply an empty field: an opening delimiter followed immediately by a closing one.) When we committed foo.jpg, the log message was

     "added a random image; ask jrandom@red-bean.com why"

which is stored in foo.jpg,v like this:

     1.1
     log
     @added a random image; ask jrandom@@red-bean.com why
     @

The @ sign in jrandom@@red-bean.com will be automatically unquoted whenever CVS retrieves the log message:

     floss$ cvs log foo.jpg
     RCS file: /usr/local/newrepos/myproj/foo.jpg,v
     Working file: foo.jpg
     head: 1.1
     branch:
     locks: strict
     access list:
     symbolic names:
           ANOTHER-TAG: 1.1
           some_random_tag: 1.1
     keyword substitution: b
     total revisions: 1;	selected revisions: 1
     description:
     ----------------------------
     revision 1.1
     date: 1999/06/21 02:56:18;  author: jrandom;  state: Exp;
     added a random image; ask jrandom@red-bean.com why
     ============================================================================
     
     floss$

The only reason you should care is that if you ever find yourself hand-editing RCS files (a rare circumstance, but not unheard of), you must remember to use double @ signs in revision contents and log messages. If you don't, the RCS file will be corrupt and will probably exhibit strange and undesirable behaviors.
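If you do end up hand-editing, the quoting rule is simple enough to capture in two small functions. This is a sketch, with invented names (rcs_quote and rcs_unquote), of the doubling convention described above. It handles a single field and makes no attempt to parse a whole RCS file.

```python
def rcs_quote(text):
    """Wrap text in @ delimiters, doubling every @ inside it."""
    return "@" + text.replace("@", "@@") + "@"


def rcs_unquote(field):
    """Strip the @ delimiters from a field and undouble the @ signs."""
    if len(field) < 2 or field[0] != "@" or field[-1] != "@":
        raise ValueError("not an @-delimited field")
    return field[1:-1].replace("@@", "@")


message = "added a random image; ask jrandom@red-bean.com why\n"
stored = rcs_quote(message)
print(stored)
print(rcs_unquote(stored) == message)  # True
```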

Speaking of hand-editing RCS files, don't be fooled by the permissions in the repository:

     floss$ ls -l
     total 6
     -r--r--r--   1 jrandom   users         410 Jun 20 12:47 README.txt,v
     drwxrwxr-x   3 jrandom   users        1024 Jun 20 21:56 a-subdir/
     drwxrwxr-x   2 jrandom   users        1024 Jun 20 21:56 b-subdir/
     -r--r--r--   1 jrandom   users         937 Jun 20 21:56 foo.jpg,v
     -r--r--r--   1 jrandom   users         564 Jun 20 21:11 hello.c,v
     
     floss$

(For those not fluent in Unix ls output, the -r--r--r-- lines on the left essentially mean that the files can be read but not changed.) Although the files appear to be read-only for everyone, the directory permissions must also be taken into account:

     floss$ ls -ld .
     drwxrwxr-x   4 jrandom   users        1024 Jun 20 22:16 ./
     floss$

The myproj/ directory itself – and its subdirectories – are all writable by the owner (jrandom) and the group (users). This means that CVS (running as jrandom, or as anyone in the users group) can create and delete files in those directories, even if it can't directly edit files already present. CVS edits an RCS file by making a separate copy of it, so you should also make all of your changes in a temporary copy, and then replace the existing RCS file with the new one. (But please don't ask why the files themselves are read-only – there are historical reasons for that, having to do with the way RCS works when run as a standalone program.)
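The copy-then-replace approach might look like this in Python. It is only a sketch under some loud assumptions: a Unix-like system, a directory you can write to, and nobody else using the repository while you work, since it does nothing about CVS's own locking. The function name replace_rcs_file and the ",tmp" prefix are made up for the example.

```python
import os
import tempfile


def replace_rcs_file(path, new_contents):
    """Swap in new contents for a read-only RCS file via a temporary copy."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory, so that the final
    # rename never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=",tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_contents)
        os.chmod(tmp_path, 0o444)    # match the read-only mode shown by ls
        os.replace(tmp_path, path)   # needs write access to the directory only
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The replacement succeeds even though the old file is read-only, because renaming is an operation on the directory, not on the file.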

Incidentally, having the files' group be users is probably not what you want, considering that the top-level directory of the repository was explicitly assigned group cvs. You can correct the problem by running this command inside the repository:

     floss$ cd /usr/local/newrepos
     floss$ chgrp -R cvs myproj

The usual Unix file-creation rules govern which group is assigned to new files that appear in the repository, so once in a while you may need to run chgrp or chmod on certain files or directories there. Setting the SGID bit on directories with chmod g+s is often a good strategy: it makes children of a directory inherit the directory's group ownership, which is usually what you want in the repository. There are no hard and fast rules about how you should structure repository permissions; it just depends on who is working on what projects.
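For a repository with many subdirectories, the g+s step can be scripted. The Python sketch below assumes a Unix-like system and that you own the directories involved. Both function names are invented for illustration, and whether SGID is the right policy for your repository is still your call.

```python
import os
import stat


def mode_with_setgid(mode):
    """Return the permission bits with the SGID bit added (like chmod g+s)."""
    return mode | stat.S_ISGID


def set_setgid_on_dirs(top):
    """Set the SGID bit on `top` and on every directory beneath it."""
    for dirpath, _dirnames, _filenames in os.walk(top):
        current = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, mode_with_setgid(current))


# What ls would show for a drwxrwxr-x directory after g+s:
print(stat.filemode(stat.S_IFDIR | mode_with_setgid(0o775)))  # drwxrwsr-x
```

You would call set_setgid_on_dirs on the project directory (for instance, myproj inside the repository) after fixing its group with chgrp.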

Karl Fogel wrote this book. Buy a printed copy via his homepage at red-bean.com

copyright  ©  December 22 2014 sean dreilinger url: http://durak.org/sean/pubs/software/cvsbook/RCS-Format.html