Finding Out What You (And Others) Did – update And diff

Previously, I've talked about updating as a way of bringing changes down from the repository into your working copy – that is, as a way of getting other people's changes. However, update is really a bit more complex; it compares the overall state of the working copy with the state of the project in the repository. Even if nothing in the repository has changed since checkout, something in the working copy may have, and update will show that, too:

     floss$ cvs update
     cvs update: Updating .
     M hello.c
     cvs update: Updating a-subdir
     cvs update: Updating a-subdir/subsubdir
     cvs update: Updating b-subdir

The M next to hello.c means the file has been modified since it was last checked out, and the modifications have not yet been committed to the repository.
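If you ever want a script to pick the modified files out of that output, a minimal sketch in Python could look like the following. The function name `parse_update_output` is hypothetical (it is not part of CVS), and the sketch assumes each file report is a one-letter status code, a space, and a path, as in the transcript above.

```python
def parse_update_output(text):
    """Return (status, path) pairs found in `cvs update` output.

    Assumption: a file report looks like "M hello.c", i.e. a one-letter
    status code, a single space, then the path.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Progress messages such as "cvs update: Updating ." are not file reports.
        if not line or line.startswith("cvs update:"):
            continue
        status, _, path = line.partition(" ")
        if len(status) == 1 and path:
            entries.append((status, path))
    return entries


# The transcript shown above, minus the shell prompt line.
sample = """\
cvs update: Updating .
M hello.c
cvs update: Updating a-subdir
cvs update: Updating a-subdir/subsubdir
cvs update: Updating b-subdir
"""

# Keep only the files that are locally modified.
modified = [path for status, path in parse_update_output(sample) if status == "M"]
```

Here `modified` ends up holding just `hello.c`, the one file that update flagged with an M.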

Sometimes, merely knowing which files you've edited is all you need. However, if you want a more detailed look at the changes, you can get a full report in diff format. The diff command compares the possibly modified files in the working copy to their counterparts in the repository and displays any differences:

     floss$ cvs diff
     cvs diff: Diffing .
     Index: hello.c
     ===================================================================
     RCS file: /usr/local/cvs/myproj/hello.c,v
     retrieving revision 1.1.1.1
     diff -r1.1.1.1 hello.c
     6a7
     >   printf ("Goodbye, world!\n");
     cvs diff: Diffing a-subdir
     cvs diff: Diffing a-subdir/subsubdir
     cvs diff: Diffing b-subdir

That's helpful, if a bit obscure, but there's still a lot of cruft in the output. For starters, you can ignore most of the first few lines. They just name the repository file and give the number of the last checked-in revision. These are useful pieces of information under other circumstances (we'll look more closely at them later), but you don't need them when you're just trying to get a sense of what changes have been made in the working copy.

A more serious impediment to reading the diff is that CVS is announcing its entry as it goes into each directory while it runs the diff. This can be useful during long operations on large projects, as it gives you a sense of how much longer the command will take, but right now it's just getting in the way of reading the diff. Let's tell CVS to be quiet about where it's working, with the -Q global option:

     floss$ cvs -Q diff
     Index: hello.c
     ===================================================================
     RCS file: /usr/local/cvs/myproj/hello.c,v
     retrieving revision 1.1.1.1
     diff -r1.1.1.1 hello.c
     6a7
     >   printf ("Goodbye, world!\n");

Better – at least some of the cruft is gone. However, the diff is still hard to read. The "6a7" is telling you that after line 6 of the old file, a new line was added (it became line 7 of the new file), whose contents were:

     printf ("Goodbye, world!\n");

The preceding ">" in the diff tells you that this line is present in the newer version of the file but not in the older one.
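To make the "6a7" notation concrete, here is a small sketch in Python that splits such a change command into its parts. The name `parse_change_command` is hypothetical. The output above only uses `a` (add); the sketch also accepts `c` (change) and `d` (delete), the other two commands of the standard diff format.

```python
import re

# A change command looks like "6a7", "3,5c3,4" or "8d7":
# old-file line range, one action letter, new-file line range.
CHANGE_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

ACTIONS = {"a": "added", "c": "changed", "d": "deleted"}


def parse_change_command(line):
    """Return ((old_start, old_end), action, (new_start, new_end)), or None."""
    match = CHANGE_COMMAND.match(line)
    if match is None:
        return None  # not a change command, e.g. a ">" content line
    old_start, old_end, action, new_start, new_end = match.groups()
    # A single number stands for a one-line range.
    old_range = (int(old_start), int(old_end or old_start))
    new_range = (int(new_start), int(new_end or new_start))
    return old_range, action, new_range


old_range, action, new_range = parse_change_command("6a7")
summary = "after old line %d, new line %d was %s" % (
    old_range[1], new_range[0], ACTIONS[action])
```

For "6a7" the summary reads "after old line 6, new line 7 was added", which is exactly what the diff above is saying about hello.c.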

The format could be made even more readable, however. Most people find "context" diff format easier to read because it displays a few lines of context on either side of a change. Context diffs are generated by passing the -c flag to diff:

     floss$ cvs -Q diff -c
     Index: hello.c
     ===================================================================
     RCS file: /usr/local/cvs/myproj/hello.c,v
     retrieving revision 1.1.1.1
     diff -c -r1.1.1.1 hello.c
     *** hello.c     1999/04/18 18:18:22     1.1.1.1
     --- hello.c     1999/04/19 02:17:07
     ***************
     *** 4,7 ****
     --- 4,8 ----
       main ()
       {
         printf ("Hello, world!\n");
     +   printf ("Goodbye, world!\n");
       }

Now that's clarity! Even if you're not used to reading context diffs, a glance at the preceding output will probably make it obvious what happened: a new line was added (the + in the first column signifies an added line) between the line that prints Hello, world! and the final curly brace.

We don't need to be able to read context diffs perfectly (that's patch's job), but it's worth taking the time to acquire at least a passing familiarity with the format. The first two lines (after the introductory cruft) are

     *** hello.c     1999/04/18 18:18:22     1.1.1.1
     --- hello.c     1999/04/19 02:17:07

and they tell you what is being diffed against what. In this case, revision 1.1.1.1 of hello.c is being compared against a modified version of the same file (thus, there's no revision number on the second line, because the working copy's changes haven't been committed to the repository yet). The lines of asterisks and dashes identify sections farther down in the diff. Later on, a line of asterisks, with a line number range embedded, precedes a section from the original file. Then a line of dashes, with a new and potentially different line number range embedded, precedes a section from the modified file. These sections are organized into contrasting pairs (known as "hunks"), one side from the old file and the other side from the new.

Our diff has one hunk:

     ***************
     *** 4,7 ****
     --- 4,8 ----
       main ()
       {
         printf ("Hello, world!\n");
     +   printf ("Goodbye, world!\n");
       }

The first section of the hunk is empty, meaning that no material was removed from the original file. The second section shows that, in the corresponding place in the new file, one line has been added; it's marked with a "+". (When diff quotes excerpts from files, it reserves the first two columns on the left for special codes such as "+", so the entire excerpt appears to be indented by two spaces. This extra indentation is stripped off when the diff is applied, of course.)

The line number ranges show the hunk's coverage, including context lines. In the original file, the hunk was in lines 4 through 7; in the new file, it's lines 4 through 8 (because a line has been added). Note that the diff didn't need to show any material from the original file because nothing was removed; it just showed the range and moved on to the second half of the hunk.
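You can reproduce this hunk without a repository. The sketch below uses Python's standard `difflib.context_diff`, which emits the same context format. The contents of hello.c are an assumption: a seven-line file laid out so that the new printf lands on line 7, matching the ranges shown above. Dates and revision numbers are left out of the header.

```python
import difflib

# Assumed contents of revision 1.1.1.1 of hello.c (7 lines).
old_lines = [
    "#include <stdio.h>",
    "",
    "void",
    "main ()",
    "{",
    '  printf ("Hello, world!\\n");',
    "}",
]

# The working copy: one line added after old line 6.
new_lines = old_lines[:6] + ['  printf ("Goodbye, world!\\n");'] + old_lines[6:]

# Three lines of context is the default, as with `diff -c`.
hunk = list(difflib.context_diff(
    old_lines, new_lines, fromfile="hello.c", tofile="hello.c", lineterm=""))

for line in hunk:
    print(line)
```

In the printed result, the `*** 4,7 ****` line is immediately followed by `--- 4,8 ----`: the old-file section is empty because nothing was removed, and the added line shows up with "+" in the first column of the new-file section.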

Here's another context diff, from an actual project of mine:

     floss$ cvs -Q diff -c
     Index: cvs2cl.pl
     ===================================================================
     RCS file: /usr/local/cvs/kfogel/code/cvs2cl/cvs2cl.pl,v
     retrieving revision 1.76
     diff -c -r1.76 cvs2cl.pl
     *** cvs2cl.pl   1999/04/13 22:29:44     1.76
     --- cvs2cl.pl   1999/04/19 05:41:37
     ***************
     *** 212,218 ****
               # can contain uppercase and lowercase letters, digits, '-',
               # and '_'. However, it's not our place to enforce that, so
               # we'll allow anything CVS hands us to be a tag:
     !         /^\s([^:]+): ([0-9.]+)$/;
               push (@{$symbolic_names{$2}}, $1);
             }
           }
     --- 212,218 ----
               # can contain uppercase and lowercase letters, digits, '-',
               # and '_'. However, it's not our place to enforce that, so
               # we'll allow anything CVS hands us to be a tag:
     !         /^\s([^:]+): ([\d.]+)$/;
               push (@{$symbolic_names{$2}}, $1);
             }
           }

The exclamation point shows that the marked line differs between the old and new files. Since there are no "+" or "-" signs, and both sections cover the same range (lines 212 through 218), we know that the total number of lines in the file has remained the same.
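The same behavior is easy to see in a small experiment. This sketch feeds three lines from the excerpt above to Python's `difflib.context_diff`; only the regular expression line differs, so it should come back marked with "!" in both sections. The three-line lists are a cut-down stand-in for the real file, so the ranges are 1,3 rather than 212,218.

```python
import difflib

old = [
    "# we'll allow anything CVS hands us to be a tag:",
    r"/^\s([^:]+): ([0-9.]+)$/;",
    "push (@{$symbolic_names{$2}}, $1);",
]
new = [
    "# we'll allow anything CVS hands us to be a tag:",
    r"/^\s([^:]+): ([\d.]+)$/;",
    "push (@{$symbolic_names{$2}}, $1);",
]

diff = list(difflib.context_diff(
    old, new, fromfile="cvs2cl.pl", tofile="cvs2cl.pl", lineterm=""))

# A changed line is reported twice: once as it was, once as it is now.
changed = [line for line in diff if line.startswith("! ")]
```

`changed` holds two entries, the old and new versions of the regular expression, and both range lines read 1,3 because no lines were added or removed.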

Here's one more context diff from the same project, slightly more complex this time:

     floss$ cvs -Q diff -c
     Index: cvs2cl.pl
     ===================================================================
     RCS file: /usr/local/cvs/kfogel/code/cvs2cl/cvs2cl.pl,v
     retrieving revision 1.76
     diff -c -r1.76 cvs2cl.pl
     *** cvs2cl.pl   1999/04/13 22:29:44     1.76
     --- cvs2cl.pl   1999/04/19 05:58:51
     ***************
     *** 207,217 ****
     }
             else    # we're looking at a tag name, so parse & store it
             {
     -         # According to the Cederqvist manual, in node "Tags", "Tag
     -         # names must start with an uppercase or lowercase letter and
     -         # can contain uppercase and lowercase letters, digits, '-',
     -         # and '_'. However, it's not our place to enforce that, so
     -         # we'll allow anything CVS hands us to be a tag:
               /^\s([^:]+): ([0-9.]+)$/;
               push (@{$symbolic_names{$2}}, $1);
             }
     --- 207,212 ----
     ***************
     *** 223,228 ****
     --- 218,225 ----
           if (/^revision (\d\.[0-9.]+)$/) {
             $revision = "$1";
           }
     +
     +     # This line was added, I admit, solely for the sake of a diff example.
     
           # If have file name but not time and author, and see date or
           # author, then grab them:

This diff has two hunks. In the first, five lines were removed (these lines are only shown in the first section of the hunk, and the second section's line count shows that it has five fewer lines). An unbroken line of asterisks forms the boundary between hunks, and in the second hunk we see that two lines have been added: a blank line and a pointless comment. Note how the line numbers compensate for the effect of the previous hunk. In the original file, the second hunk covered lines 223 through 228; in the new file, because the first hunk deleted five lines, it starts at line 218 and, with its two added lines, runs through line 225.
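The line number arithmetic can be checked mechanically. The following Python sketch parses the range lines from the diff above and works out how far the first hunk shifts the second one. The helper names `parse_range` and `line_count` are hypothetical, and the sketch ignores the special case of an empty range.

```python
import re

# Matches range lines such as "*** 207,217 ****" and "--- 207,212 ----".
RANGE = re.compile(r"^(?:\*\*\*|---) (\d+)(?:,(\d+))? (?:\*\*\*\*|----)$")


def parse_range(header):
    """Return (start, end) from a context-diff range line."""
    match = RANGE.match(header.strip())
    if match is None:
        raise ValueError("not a range line: %r" % header)
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return start, end


def line_count(line_range):
    """Number of lines covered by an inclusive (start, end) range."""
    start, end = line_range
    return end - start + 1


# First hunk: 11 lines in the old file, 6 in the new one.
first_old = parse_range("*** 207,217 ****")
first_new = parse_range("--- 207,212 ----")
delta = line_count(first_new) - line_count(first_old)

# Second hunk: its old starting line, shifted by the first hunk's net change.
second_old = parse_range("*** 223,228 ****")
second_new = parse_range("--- 218,225 ----")
expected_new_start = second_old[0] + delta
```

The first hunk's net change is -5, so old line 223 becomes new line 218, which matches the start of the second hunk's new-file range. The second hunk itself grows from six lines to eight, accounting for the two added lines.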

Congratulations, you are probably now as expert as you'll ever need to be at reading diffs.

Karl Fogel wrote this book. Buy a printed copy via his homepage at red-bean.com

copyright  ©  November 12 2019 sean dreilinger url: https://durak.org/sean/pubs/software/cvsbook/Finding-Out-What-You-_0028And-Others_0029-Did-_002d_002d-update-And-diff.html