This directory contains the GNU DIFF and DIFF3 utilities, version 1.15.
See file COPYING for copying conditions.  To compile and install on
System V, you must edit the makefile according to comments therein.

Report bugs to bug-gnu-utils@prep.ai.mit.edu

Version 1.15 has the following new features; please see below for details.

   -L (+file-label) option
   -u (+unified) option
   -a and -m options for diff3
   Most output styles can represent incomplete input lines.
   `Text' is defined by ISO 8859.
   diff3 exit status 0 means success, 1 means overlaps, 2 means trouble.


This version of diff provides all the features of BSD's diff.
It has these additional features:

   An input file may end in a non-newline character.  If so, its last
   line is called an incomplete line and is distinguished on output
   from a full line.  In the default, -c, and -u output styles, an
   incomplete output line is followed by a diagnostic line that starts
   with \.  With -n, an incomplete line is output without a trailing
   newline.  Other output styles (-D, -e, -f) cannot represent an
   incomplete line, so they pretend that there was a newline, and -e and -f
   also print an error message.  For example, suppose F and G are one-byte
   files that contain just ``f'' and ``g'', respectively.

   Then ``diff F G'' outputs

        1c1
        < f
        \ No newline at end of file
        ---
        > g
        \ No newline at end of file

   (The exact diagnostic message may differ, e.g. for non-English locales.)
   ``diff -n F G'' outputs the following without a trailing newline:

        d1 1
        a1 1
        g

   ``diff -e F G'' sends two diagnostics to stderr and the following to stdout:

        1c
        g
        .
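
   The incomplete-line rule above can be sketched in a few lines of Python.
   This is only an illustration, not code from GNU DIFF: the names
   split_lines and show_side are hypothetical, and the marker text is the
   English wording from the ``diff F G'' example.

```python
# Illustrative sketch only; split_lines and show_side are hypothetical
# names, not part of GNU DIFF.
NO_NEWLINE_MARKER = "\\ No newline at end of file"


def split_lines(data: bytes):
    """Split file contents into lines and report whether the last is incomplete."""
    if not data:
        return [], False
    incomplete = not data.endswith(b"\n")
    lines = data.split(b"\n")
    if not incomplete:
        lines.pop()  # drop the empty piece after the final newline
    return lines, incomplete


def show_side(prefix: str, data: bytes):
    """Render one side of a normal-format change, adding the diagnostic line."""
    lines, incomplete = split_lines(data)
    out = [f"{prefix} {line.decode('latin-1')}" for line in lines]
    if incomplete:
        out.append(NO_NEWLINE_MARKER)
    return out


if __name__ == "__main__":
    # One-byte files F and G from the example: "f" and "g", no newline.
    print("\n".join(show_side("<", b"f") + ["---"] + show_side(">", b"g")))
```

   Run on the contents of F and G, the sketch prints the lines that follow
   ``1c1'' in the example; a file that ends in a newline gets no diagnostic.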

   A file is considered to be text if its first characters are all in the
   ISO 8859 character set; BSD's diff uses ASCII.
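
   One way to picture the text test is the Python sketch below.  It is a
   guess at the idea, not GNU DIFF's actual check: the sample size, the set
   of control characters accepted as text, and all function names are
   assumptions made for illustration.

```python
# Assumptions (not taken from GNU DIFF's source): the sample size and the
# set of control characters accepted as text are illustrative guesses.
SAMPLE_SIZE = 512
TEXT_CONTROLS = frozenset(b"\a\b\t\n\v\f\r")


def is_iso8859_text_byte(byte: int) -> bool:
    """True for bytes in the ISO 8859 graphic ranges or a common control."""
    return 0x20 <= byte <= 0x7E or 0xA0 <= byte <= 0xFF or byte in TEXT_CONTROLS


def looks_like_text(data: bytes, sample_size: int = SAMPLE_SIZE) -> bool:
    """Check only the first sample_size bytes, as 'first characters' suggests."""
    return all(is_iso8859_text_byte(b) for b in data[:sample_size])


def looks_like_ascii_text(data: bytes, sample_size: int = SAMPLE_SIZE) -> bool:
    """Stricter ASCII-only variant, for contrast with the ISO 8859 test."""
    return all(b < 0x80 and is_iso8859_text_byte(b) for b in data[:sample_size])
```

   Under these assumptions a Latin-1 word such as b"caf\xe9\n" passes the
   ISO 8859 test but fails the ASCII-only one, while data containing a NUL
   byte fails both.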

   GNU DIFF has the following additional options:

   -a   Always treat files as text and compare them line-by-line,
        even if they do not appear to be text.

   -B   Ignore changes that just insert or delete blank lines.

   -C #
        Request -c format and specify the number of context lines.

   -F regexp
        In context format, for each unit of differences, show some of
        the last preceding line that matches the specified regexp.

   -H   Use heuristics to speed handling of large files that
        have numerous scattered small changes.  The algorithm becomes
        asymptotically linear for such files.

   -I regexp
        Ignore changes that just insert or delete lines that
        match the specified regexp.

   -L label
        Use the specified label in file header lines output by the -c option.
        This option may be given zero, one, or two times,
        to affect neither label, just the first file's label, or both labels.
        A file's default label is its name, a tab, and its modification date.

   -N   In directory comparison, if a file is found in only one directory,
        treat it as present but empty in the other directory.

   -p   Equivalent to -c -F'^[_a-zA-Z]'.  This is useful for C code
        because it shows which function each change is in.

   -T   Print a tab rather than a space before the text of a line
        in normal or context format.  This causes the alignment
        of tabs in the line to look normal.

   -u[#]
        Produce unified style output with # context lines (default 3).
        This style is like -c, but it is more compact because context
        lines are printed only once.  Lines from just the first file
        are marked '-'; lines from just the second file are marked '+'.
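
   The lookup that -F and -p describe, finding the last preceding line that
   matches a regexp, can be sketched as follows.  The function name
   last_matching_line and the choice of where the backward search starts are
   assumptions for illustration; only the pattern comes from the -p entry.

```python
import re

# Pattern quoted in the -p description: a line starting with a letter or '_'.
C_FUNCTION_LINE = re.compile(r"^[_a-zA-Z]")


def last_matching_line(lines, before, pattern=C_FUNCTION_LINE):
    """Return the last line with index < before that matches pattern, or None.

    `before` is assumed to be the index of the first line of a unit of
    differences; the real program may start its search elsewhere.
    """
    for index in range(before - 1, -1, -1):
        if pattern.search(lines[index]):
            return lines[index]
    return None


if __name__ == "__main__":
    source = [
        "#include <stdio.h>",
        "",
        "int",
        "main (void)",
        "{",
        '  puts ("hi");',
        "  return 0;",
        "}",
    ]
    # A change at index 5 is reported under the nearest preceding match.
    print(last_matching_line(source, 5))
```

   With C code laid out so that function names start in column one, the
   nearest match above a change is usually the function's name line, which
   is why the pattern is useful for C.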

This version of diff3 has all of BSD diff3's features, with the following
additional features.

   An input file may end in a non-newline character.  With the -m option,
   an incomplete last line stays incomplete.  Other output styles treat
   incomplete lines as diff does.

   The file name '-' denotes the standard input.  It can appear at most once.

   diff3 has the following additional options:

   -a   Always treat files as text and compare them line-by-line,
        even if they do not appear to be text.

   -i   Include 'w' and 'q' commands at the end of the output, to write out
        the changed file, thus emulating System V behavior.  One of the edit
        script options -e, -E, -x, -X, -3 must also be specified.

   -m   Apply the edit script to the first file and send the result to
        standard output.  Unlike piping diff3's output to ed(1), this works
        even for binary files and incomplete lines.  -E is assumed if no edit
        script option is specified.  This option is incompatible with -i.

   -L label
        Use the specified label for lines output by the -E and -X options,
        one of which must also be specified.  This option may be given zero,
        one, or two times; the first label marks <<<<<<< lines and the second
        marks >>>>>>> lines.  The default labels are the names of the first and
        third files on the command line.  Thus ``diff3 -L X -L Z -E A B C''
        acts like ``diff3 -E A B C'', except that the output looks like it
        came from files named X and Z rather than from files named A and C.
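
   How the two labels land on the marker lines can be pictured with a small
   Python sketch.  The text above only says which label goes on which
   marker; the ======= separator, the overall layout, and the function name
   format_overlap are assumptions based on the conventional marker format.

```python
def format_overlap(first_lines, third_lines, first_label, third_label):
    """Bracket an overlapping region with labeled marker lines.

    Layout is assumed: the first file's lines, a separator, then the third
    file's lines.  The first label marks the <<<<<<< line and the second
    label marks the >>>>>>> line, as the -L description states.
    """
    return (
        [f"<<<<<<< {first_label}"]
        + list(first_lines)
        + ["======="]
        + list(third_lines)
        + [f">>>>>>> {third_label}"]
    )


if __name__ == "__main__":
    # Labels X and Z stand in for files A and C, as in the -L example.
    print("\n".join(format_overlap(["from A"], ["from C"], "X", "Z")))
```

   Passing the file names A and C as labels instead would give the default
   appearance described for ``diff3 -E A B C''.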

   Exit status 0 means success, 1 means overlaps were found and -E or -X was
   specified, and 2 means trouble.



GNU DIFF was written by Mike Haertel, David Hayes, Richard Stallman
and Len Tower.  The basic algorithm is described in: "An O(ND)
Difference Algorithm and its Variations", Eugene Myers, Algorithmica
Vol. 1 No. 2, 1986, p. 251.

Many bugs were fixed by Paul Eggert.  The unified diff idea and format
are from Wayne Davison.

Suggested projects for improving GNU DIFF:

* Handle very large files by not keeping the entire text in core.

One way to do this is to scan the files sequentially to compute hash
codes of the lines and put the lines in equivalence classes based only
on hash code.  Then compare the files normally.  This will produce
some false matches.

Then scan the two files sequentially again, checking each match to see
whether it is real.  When a match is not real, mark both the
"matching" lines as changed.  Then build an edit script as usual.

The output routines would have to be changed to scan the files
sequentially looking for the text to print.
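
The two-pass idea above might look roughly like the following Python
sketch.  It is not a proposed patch: all function names are hypothetical,
zlib.crc32 is just one possible hash, and Python's difflib.SequenceMatcher
stands in for "compare the files normally" rather than the O(ND) algorithm.
Only the per-line hash codes are held in memory, never the text.

```python
import difflib
import zlib


def line_hashes(path, hash_line=zlib.crc32):
    """Pass 1: stream the file and keep only one hash code per line."""
    with open(path, "rb") as stream:
        return [hash_line(line) for line in stream]


def candidate_matches(hashes_a, hashes_b):
    """Compare the hash sequences; equal hashes are only presumed matches."""
    matcher = difflib.SequenceMatcher(None, hashes_a, hashes_b, autojunk=False)
    pairs = []
    for block in matcher.get_matching_blocks():
        pairs.extend((block.a + k, block.b + k) for k in range(block.size))
    return pairs


def verify_matches(path_a, path_b, pairs):
    """Pass 2: rescan both files in order and keep only the real matches."""
    real = []
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        index_a = index_b = -1
        line_a = line_b = b""
        for i, j in pairs:  # pairs increase in both coordinates
            while index_a < i:
                line_a = file_a.readline()
                index_a += 1
            while index_b < j:
                line_b = file_b.readline()
                index_b += 1
            if line_a == line_b:
                real.append((i, j))
    return real


def matched_line_pairs(path_a, path_b, hash_line=zlib.crc32):
    """Lines absent from the returned pairs are the ones marked as changed."""
    pairs = candidate_matches(line_hashes(path_a, hash_line),
                              line_hashes(path_b, hash_line))
    return verify_matches(path_a, path_b, pairs)
```

Passing a deliberately weak hash such as len makes false matches easy to
provoke: lines of equal length pair up in the first pass and are then
rejected by the second pass.  Building the edit script and printing the
text would need their own sequential scans, as noted above.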