src/oldsed/doc/sed.info-2

   1 This is ../../doc/sed.info, produced by makeinfo version 4.5 from
   2 ../../doc/sed.texi.
   3
   4 INFO-DIR-SECTION Text creation and manipulation
   5 START-INFO-DIR-ENTRY
   6 * sed: (sed).                   Stream EDitor.
   7
   8 END-INFO-DIR-ENTRY
   9
  10 This file documents version 4.1.5 of GNU `sed', a stream editor.
  11
  12    Copyright (C) 1998, 1999, 2001, 2002, 2003, 2004 Free Software
  13 Foundation, Inc.
  14
  15    This document is released under the terms of the GNU Free
  16 Documentation License as published by the Free Software Foundation;
  17 either version 1.1, or (at your option) any later version.
  18
  19    You should have received a copy of the GNU Free Documentation
  20 License along with GNU `sed'; see the file `COPYING.DOC'.  If not,
  21 write to the Free Software Foundation, 59 Temple Place - Suite 330,
  22 Boston, MA 02110-1301, USA.
  23
  24    There are no Cover Texts and no Invariant Sections; this text, along
  25 with its equivalent in the printed manual, constitutes the Title Page.
  26 \x1f
  27 File: sed.info,  Node: Print bash environment,  Next: Reverse chars of lines,  Prev: Rename files to lower case,  Up: Examples
  28
  29 Print `bash' Environment
  30 ========================
  31
  32    This script strips the definition of the shell functions from the
  33 output of the `set' Bourne-shell command.
  34
  35      #!/bin/sh
  36
  37      set | sed -n '
  38      :x
  39
  40      # if no occurrence of "=()" print and load next line
  41      /=()/! { p; b; }
  42      / () $/! { p; b; }
  43
  44      # possible start of functions section
  45      # save the line in case this is a var like FOO="() "
  46      h
  47
  48      # if the next line has a brace, we quit because
  49      # nothing comes after functions
  50      n
  51      /^{/ q
  52
  53      # print the old line
  54      x; p
  55
  56      # work on the new line now
  57      x; bx
  58      '
  59
  60 \x1f
  61 File: sed.info,  Node: Reverse chars of lines,  Next: tac,  Prev: Print bash environment,  Up: Examples
  62
  63 Reverse Characters of Lines
  64 ===========================
  65
  66    This script can be used to reverse the position of characters in
  67 lines.  The technique moves two characters at a time, hence it is
  68 faster than more intuitive implementations.
  69
  70    Note the `tx' command before the definition of the label.  This is
  71 often needed to reset the flag that is tested by the `t' command.
  72
  73    Imaginative readers will find uses for this script.  An example is
  74 reversing the output of `banner'.(1)
  75
  76      #!/usr/bin/sed -f
  77
  78      /../! b
  79
  80      # Reverse a line.  Begin by embedding the line between two newlines
  81      s/^.*$/\
  82      &\
  83      /
  84
  85      # Move the first character to the end.  The regexp matches until
  86      # there are zero or one characters between the markers
  87      tx
  88      :x
  89      s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
  90      tx
  91
  92      # Remove the newline markers
  93      s/\n//g
  94
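   As a quick check, the script above can be wrapped in a shell function
and fed a couple of lines.  This is only a sketch: the function name
`reverse_chars' is an illustrative choice, and it assumes GNU `sed',
where `.' also matches the newline markers.

```shell
#!/bin/sh
# reverse_chars is a hypothetical wrapper name; the sed script is the
# one shown above, with the comments removed.
reverse_chars() {
    sed '
/../! b
s/^.*$/\
&\
/
tx
:x
s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
tx
s/\n//g
'
}

printf 'abcd\nhello\n' | reverse_chars
# prints:
# dcba
# olleh
```

   Lines with fewer than two characters are passed through unchanged by
the initial `/../! b'.
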
  95    ---------- Footnotes ----------
  96
  97    (1) This requires another script to pad the output of banner; for
  98 example
  99
 100      #! /bin/sh
 101
 102      banner -w $1 $2 $3 $4 |
 103        sed -e :a -e '/^.\{0,'$1'\}$/ { s/$/ /; ba; }' |
 104        ~/sedscripts/reverseline.sed
 105
 106 \x1f
 107 File: sed.info,  Node: tac,  Next: cat -n,  Prev: Reverse chars of lines,  Up: Examples
 108
 109 Reverse Lines of Files
 110 ======================
 111
 112    This one begins a series of totally useless (yet interesting)
 113 scripts emulating various Unix commands.  This, in particular, is a
 114 `tac' workalike.
 115
 116    Note that on implementations other than GNU `sed' this script might
 117 easily overflow internal buffers.
 118
 119      #!/usr/bin/sed -nf
 120
 121      # reverse all lines of input, i.e. the first line becomes the last, ...
 122
 123      # from the second line, the buffer (which contains all previous lines)
 124      # is *appended* to current line, so, the order will be reversed
 125      1! G
 126
 127      # on the last line we're done -- print everything
 128      $ p
 129
 130      # store everything on the buffer again
 131      h
 132
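   A minimal way to try the script from a shell is sketched here; the
function name `tac_sed' is hypothetical and only used for the demo.

```shell
#!/bin/sh
# tac_sed is a hypothetical wrapper around the script above.
tac_sed() {
    sed -n '
1! G
$ p
h
'
}

printf 'one\ntwo\nthree\n' | tac_sed
# prints:
# three
# two
# one
```

   The whole input ends up in the hold space, which is why memory use
grows with the size of the input.
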
 133 \x1f
 134 File: sed.info,  Node: cat -n,  Next: cat -b,  Prev: tac,  Up: Examples
 135
 136 Numbering Lines
 137 ===============
 138
 139    This script replaces `cat -n'; in fact it formats its output exactly
 140 like GNU `cat' does.
 141
 142    Of course this is completely useless, for two reasons: first,
 143 because somebody else did it in C; second, because the following
 144 Bourne-shell script could be used for the same purpose and would be
 145 much faster:
 146
 147      #! /bin/sh
 148      sed -e "=" "$@" | sed -e '
 149        s/^/      /
 150        N
 151        s/^ *\(......\)\n/\1  /
 152      '
 153
 154    It uses `sed' to print the line number, then groups lines two by two
 155 using `N'.  Of course, this script does not teach as much as the one
 156 presented below.
 157
 158    The algorithm used for incrementing uses both buffers, so the line
 159 is printed as soon as possible and then discarded.  The number is split
 160 so that changing digits go in a buffer and unchanged ones go in the
 161 other; the changed digits are modified in a single step (using a `y'
 162 command).  The line number for the next line is then composed and
 163 stored in the hold space, to be used in the next iteration.
 164
 165      #!/usr/bin/sed -nf
 166
 167      # Prime the pump on the first line
 168      x
 169      /^$/ s/^.*$/1/
 170
 171      # Add the correct line number before the pattern
 172      G
 173      h
 174
 175      # Format it and print it
 176      s/^/      /
 177      s/^ *\(......\)\n/\1  /p
 178
 179      # Get the line number from hold space; add a zero
 180      # if we're going to add a digit on the next line
 181      g
 182      s/\n.*$//
 183      /^9*$/ s/^/0/
 184
 185      # separate changing/unchanged digits with an x
 186      s/.9*$/x&/
 187
 188      # keep changing digits in hold space
 189      h
 190      s/^.*x//
 191      y/0123456789/1234567890/
 192      x
 193
 194      # keep unchanged digits in pattern space
 195      s/x.*$//
 196
 197      # compose the new number, remove the newline implicitly added by G
 198      G
 199      s/\n//
 200      h
 201
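   To watch the increment logic cross a carry (9 to 10), the script can
be run on ten short lines.  This is a sketch only; `number_lines' is a
hypothetical name, and the script body is the one above without its
comments.

```shell
#!/bin/sh
# number_lines is a hypothetical wrapper around the script above.
number_lines() {
    sed -n '
x
/^$/ s/^.*$/1/
G
h
s/^/      /
s/^ *\(......\)\n/\1  /p
g
s/\n.*$//
/^9*$/ s/^/0/
s/.9*$/x&/
h
s/^.*x//
y/0123456789/1234567890/
x
s/x.*$//
G
s/\n//
h
'
}

printf '%s\n' a b c d e f g h i j | number_lines
# The first output line is "     1  a" and the last is "    10  j":
# the number is right-aligned in six columns, followed by two spaces.
```

   On the ninth line the counter consists only of nines, so a `0' is
prepended and the `y' command turns `09' into `10' in one step.
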
 202 \x1f
 203 File: sed.info,  Node: cat -b,  Next: wc -c,  Prev: cat -n,  Up: Examples
 204
 205 Numbering Non-blank Lines
 206 =========================
 207
 208    Emulating `cat -b' is almost the same as `cat -n'--we only have to
 209 select which lines are to be numbered and which are not.
 210
 211    The part that is common to this script and the previous one is not
 212 commented to show how important it is to comment `sed' scripts
 213 properly...
 214
 215      #!/usr/bin/sed -nf
 216
 217      /^$/ {
 218        p
 219        b
 220      }
 221
 222      # Same as cat -n from now
 223      x
 224      /^$/ s/^.*$/1/
 225      G
 226      h
 227      s/^/      /
 228      s/^ *\(......\)\n/\1  /p
 229      x
 230      s/\n.*$//
 231      /^9*$/ s/^/0/
 232      s/.9*$/x&/
 233      h
 234      s/^.*x//
 235      y/0123456789/1234567890/
 236      x
 237      s/x.*$//
 238      G
 239      s/\n//
 240      h
 241
 242 \x1f
 243 File: sed.info,  Node: wc -c,  Next: wc -w,  Prev: cat -b,  Up: Examples
 244
 245 Counting Characters
 246 ===================
 247
 248    This script shows another way to do arithmetic with `sed'.  In this
 249 case we have to add possibly large numbers, so implementing this by
 250 successive increments would not be feasible (and possibly even more
 251 complicated to contrive than this script).
 252
 253    The approach is to map numbers to letters, kind of an abacus
 254 implemented with `sed'.  `a's are units, `b's are tens and so on: we
 255 simply add the number of characters on the current line as units, and
 256 then propagate the carry to tens, hundreds, and so on.
 257
 258    As usual, running totals are kept in hold space.
 259
 260    On the last line, we convert the abacus form back to decimal.  For
 261 the sake of variety, this is done with a loop rather than with some 80
 262 `s' commands(1): first we convert units, removing `a's from the number;
 263 then we rotate letters so that tens become `a's, and so on until no
 264 more letters remain.
 265
 266      #!/usr/bin/sed -nf
 267
 268      # Add n+1 a's to hold space (+1 is for the newline)
 269      s/./a/g
 270      H
 271      x
 272      s/\n/a/
 273
 274      # Do the carry.  The t's and b's are not necessary,
 275      # but they do speed up the thing
 276      t a
 277      : a;  s/aaaaaaaaaa/b/g; t b; b done
 278      : b;  s/bbbbbbbbbb/c/g; t c; b done
 279      : c;  s/cccccccccc/d/g; t d; b done
 280      : d;  s/dddddddddd/e/g; t e; b done
 281      : e;  s/eeeeeeeeee/f/g; t f; b done
 282      : f;  s/ffffffffff/g/g; t g; b done
 283      : g;  s/gggggggggg/h/g; t h; b done
 284      : h;  s/hhhhhhhhhh//g
 285
 286      : done
 287      $! {
 288        h
 289        b
 290      }
 291
 292      # On the last line, convert back to decimal
 293
 294      : loop
 295      /a/! s/[b-h]*/&0/
 296      s/aaaaaaaaa/9/
 297      s/aaaaaaaa/8/
 298      s/aaaaaaa/7/
 299      s/aaaaaa/6/
 300      s/aaaaa/5/
 301      s/aaaa/4/
 302      s/aaa/3/
 303      s/aa/2/
 304      s/a/1/
 305
 306      : next
 307      y/bcdefgh/abcdefg/
 308      /[a-h]/ b loop
 309      p
 310
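   The abacus can be exercised from a shell as sketched below.  The
function name `count_chars' is hypothetical; the script is the one
above, with the labels moved to their own lines, which does not change
its behavior.  Plain ASCII input is assumed, so that characters and
bytes coincide.

```shell
#!/bin/sh
# count_chars is a hypothetical wrapper around the script above.
count_chars() {
    sed -n '
s/./a/g
H
x
s/\n/a/
ta
:a
s/aaaaaaaaaa/b/g; tb; b done
:b
s/bbbbbbbbbb/c/g; tc; b done
:c
s/cccccccccc/d/g; td; b done
:d
s/dddddddddd/e/g; te; b done
:e
s/eeeeeeeeee/f/g; tf; b done
:f
s/ffffffffff/g/g; tg; b done
:g
s/gggggggggg/h/g; th; b done
:h
s/hhhhhhhhhh//g
:done
$! {
  h
  b
}
:loop
/a/! s/[b-h]*/&0/
s/aaaaaaaaa/9/
s/aaaaaaaa/8/
s/aaaaaaa/7/
s/aaaaaa/6/
s/aaaaa/5/
s/aaaa/4/
s/aaa/3/
s/aa/2/
s/a/1/
y/bcdefgh/abcdefg/
/[a-h]/ b loop
p
'
}

printf 'hello world\n' | count_chars   # prints 12 (11 characters + newline)
printf 'hello\nhi\n'   | count_chars   # prints 9
```

   For `hello world' the hold space becomes twelve `a's, the carry
turns them into `baa', and the final loop converts that to `12'.
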
 311    ---------- Footnotes ----------
 312
 313    (1) Some implementations have a limit of 199 commands per script
 314
 315 \x1f
 316 File: sed.info,  Node: wc -w,  Next: wc -l,  Prev: wc -c,  Up: Examples
 317
 318 Counting Words
 319 ==============
 320
 321    This script is almost the same as the previous one, once each of the
 322 words on the line is converted to a single `a' (in the previous script
 323 each letter was changed to an `a').
 324
 325    It is interesting that real `wc' programs have optimized loops for
 326 `wc -c', so they are much slower at counting words than at counting
 327 characters.  This script's bottleneck, instead, is arithmetic, and
 328 hence the word-counting one is faster (it has to manage smaller
 329 numbers).
 330
 331    Again, the common parts are not commented to show the importance of
 332 commenting `sed' scripts.
 333
 334      #!/usr/bin/sed -nf
 335
 336      # Convert words to a's (`tab' stands for a literal tab character)
 337      s/[ tab][ tab]*/ /g
 338      s/^/ /
 339      s/ [^ ][^ ]*/a /g
 340      s/ //g
 341
 342      # Append them to hold space
 343      H
 344      x
 345      s/\n//
 346
 347      # From here on it is the same as in wc -c.
 348      /aaaaaaaaaa/! bx;   s/aaaaaaaaaa/b/g
 349      /bbbbbbbbbb/! bx;   s/bbbbbbbbbb/c/g
 350      /cccccccccc/! bx;   s/cccccccccc/d/g
 351      /dddddddddd/! bx;   s/dddddddddd/e/g
 352      /eeeeeeeeee/! bx;   s/eeeeeeeeee/f/g
 353      /ffffffffff/! bx;   s/ffffffffff/g/g
 354      /gggggggggg/! bx;   s/gggggggggg/h/g
 355      s/hhhhhhhhhh//g
 356      :x
 357      $! { h; b; }
 358      :y
 359      /a/! s/[b-h]*/&0/
 360      s/aaaaaaaaa/9/
 361      s/aaaaaaaa/8/
 362      s/aaaaaaa/7/
 363      s/aaaaaa/6/
 364      s/aaaaa/5/
 365      s/aaaa/4/
 366      s/aaa/3/
 367      s/aa/2/
 368      s/a/1/
 369      y/bcdefgh/abcdefg/
 370      /[a-h]/ by
 371      p
 372
 373 \x1f
 374 File: sed.info,  Node: wc -l,  Next: head,  Prev: wc -w,  Up: Examples
 375
 376 Counting Lines
 377 ==============
 378
 379    No strange things are done now, because `sed' gives us `wc -l'
 380 functionality for free!!! Look:
 381
 382      #!/usr/bin/sed -nf
 383      $=
 384
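   From a shell, the same one-liner can be tried as follows (the
function name `count_lines' is only an illustration):

```shell
#!/bin/sh
# count_lines is a hypothetical wrapper around the one-line script above.
count_lines() {
    sed -n '$='
}

printf 'a\nb\nc\n' | count_lines   # prints 3
```

   Note that on empty input there is no last line, so nothing at all is
printed, whereas `wc -l' would print 0.
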
 385 \x1f
 386 File: sed.info,  Node: head,  Next: tail,  Prev: wc -l,  Up: Examples
 387
 388 Printing the First Lines
 389 ========================
 390
 391    This script is probably the simplest useful `sed' script.  It
 392 displays the first 10 lines of input; the number of displayed lines is
 393 right before the `q' command.
 394
 395      #!/usr/bin/sed -f
 396      10q
 397
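   Since the line count is just the address in front of `q', it is easy
to make it a parameter.  The sketch below does so from a shell; the
function name `first_lines' and its default of 10 are assumptions made
for the example.

```shell
#!/bin/sh
# first_lines N: print the first N lines of standard input (default 10).
# Hypothetical helper built on the script above.
first_lines() {
    sed "${1:-10}q"
}

printf '%s\n' 1 2 3 4 5 | first_lines 3
# prints:
# 1
# 2
# 3
```

   Because `q' stops reading as soon as the requested line is printed,
the rest of the input is never processed.
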
 398 \x1f
 399 File: sed.info,  Node: tail,  Next: uniq,  Prev: head,  Up: Examples
 400
 401 Printing the Last Lines
 402 =======================
 403
 404    Printing the last N lines rather than the first is more complex but
 405 indeed possible.  N is encoded in the second line, before the bang
 406 character.
 407
 408    This script is similar to the `tac' script in that it keeps the
 409 final output in the hold space and prints it at the end:
 410
 411      #!/usr/bin/sed -nf
 412
 413      1! {; H; g; }
 414      1,10 !s/[^\n]*\n//
 415      $p
 416      h
 417
 418    Mainly, the script keeps a window of 10 lines and slides it by
 419 adding a line and deleting the oldest (the substitution command on the
 420 second line works like a `D' command but does not restart the loop).
 421
 422    The "sliding window" technique is a very powerful way to write
 423 efficient and complex `sed' scripts, because commands like `P' would
 424 require a lot of work if implemented manually.
 425
 426    To introduce the technique, which is fully demonstrated in the rest
 427 of this chapter and is based on the `N', `P' and `D' commands, here is
 428 an implementation of `tail' using a simple "sliding window."
 429
 430    This looks complicated but in fact the working is the same as the
 431 last script: after we have kicked in the appropriate number of lines,
 432 however, we stop using the hold space to keep inter-line state, and
 433 instead use `N' and `D' to slide pattern space by one line:
 434
 435      #!/usr/bin/sed -f
 436
 437      1h
 438      2,10 {; H; g; }
 439      $q
 440      1,9d
 441      N
 442      D
 443
 444    Note how the first, second and fourth lines are inactive after the
 445 first ten lines of input.  After that, all the script does is exit
 446 on the last line of input, append the next input line to pattern
 447 space, and remove the first line.
 448
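   The two `tail' scripts above can be compared side by side.  This is
a sketch assuming GNU `sed' (the first script relies on `\n' inside a
bracket expression); the function names are hypothetical.

```shell
#!/bin/sh
# Hold-space version: keeps the last 10 lines in the hold space.
tail_hold() {
    sed -n '
1! { H; g; }
1,10 !s/[^\n]*\n//
$p
h
'
}

# Sliding-window version: after 10 lines, slides pattern space with N and D.
tail_window() {
    sed '
1h
2,10 { H; g; }
$q
1,9d
N
D
'
}

# Hypothetical helper producing the numbers 1..15, one per line.
numbers() {
    printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
}

numbers | tail_hold     # prints 6 through 15, one per line
numbers | tail_window   # prints the same ten lines
```

   Both versions print the whole input when it has ten lines or fewer.
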
 449 \x1f
 450 File: sed.info,  Node: uniq,  Next: uniq -d,  Prev: tail,  Up: Examples
 451
 452 Make Duplicate Lines Unique
 453 ===========================
 454
 455    This is an example of the art of using the `N', `P' and `D'
 456 commands, probably the most difficult to master.
 457
 458      #!/usr/bin/sed -f
 459      h
 460
 461      :b
 462      # On the last line, print and exit
 463      $b
 464      N
 465      /^\(.*\)\n\1$/ {
 466          # The two lines are identical.  Undo the effect of
 467          # the `N' command.
 468          g
 469          bb
 470      }
 471
 472      # If the `N' command had added the last line, print and exit
 473      $b
 474
 475      # The lines are different; print the first and go
 476      # back working on the second.
 477      P
 478      D
 479
 480    As you can see, we maintain a 2-line window using `P' and `D'.  This
 481 technique is often used in advanced `sed' scripts.
 482
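   A small run makes the 2-line window easier to follow.  The sketch
below wraps the script in a shell function (the name `uniq_sed' is
hypothetical) and assumes GNU `sed'.

```shell
#!/bin/sh
# uniq_sed is a hypothetical wrapper around the script above.
uniq_sed() {
    sed '
h
:b
$b
N
/^\(.*\)\n\1$/ {
    g
    bb
}
$b
P
D
'
}

printf 'a\na\nb\nb\nb\nc\n' | uniq_sed
# prints:
# a
# b
# c
```

   Each time `N' appends a duplicate, `g' restores the single copy saved
by `h'; when the lines differ, `P' prints the first and `D' restarts
the script on the second.
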
 483 \x1f
 484 File: sed.info,  Node: uniq -d,  Next: uniq -u,  Prev: uniq,  Up: Examples
 485
 486 Print Duplicated Lines of Input
 487 ===============================
 488
 489    This script prints only duplicated lines, like `uniq -d'.
 490
 491      #!/usr/bin/sed -nf
 492
 493      $b
 494      N
 495      /^\(.*\)\n\1$/ {
 496          # Print the first of the duplicated lines
 497          s/.*\n//
 498          p
 499
 500          # Loop until we get a different line
 501          :b
 502          $b
 503          N
 504          /^\(.*\)\n\1$/ {
 505              s/.*\n//
 506              bb
 507          }
 508      }
 509
 510      # The last line cannot be followed by duplicates
 511      $b
 512
 513      # Found a different one.  Leave it alone in the pattern space
 514      # and go back to the top, hunting its duplicates
 515      D
 516
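   The script can be tried as sketched below, assuming GNU `sed'; the
function name `uniq_d_sed' is only illustrative.

```shell
#!/bin/sh
# uniq_d_sed is a hypothetical wrapper around the script above.
uniq_d_sed() {
    sed -n '
$b
N
/^\(.*\)\n\1$/ {
    s/.*\n//
    p
    :b
    $b
    N
    /^\(.*\)\n\1$/ {
        s/.*\n//
        bb
    }
}
$b
D
'
}

printf 'a\na\nb\nc\nc\n' | uniq_d_sed
# prints:
# a
# c
```
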
 517 \x1f
 518 File: sed.info,  Node: uniq -u,  Next: cat -s,  Prev: uniq -d,  Up: Examples
 519
 520 Remove All Duplicated Lines
 521 ===========================
 522
 523    This script prints only unique lines, like `uniq -u'.
 524
 525      #!/usr/bin/sed -f
 526
 527      # Search for a duplicate line --- until then, print what you find.
 528      $b
 529      N
 530      /^\(.*\)\n\1$/ ! {
 531          P
 532          D
 533      }
 534
 535      :c
 536      # Got two equal lines in pattern space.  At the
 537      # end of the file we simply exit
 538      $d
 539
 540      # Else, we keep reading lines with `N' until we
 541      # find a different one
 542      s/.*\n//
 543      N
 544      /^\(.*\)\n\1$/ {
 545          bc
 546      }
 547
 548      # Remove the last instance of the duplicate line
 549      # and go back to the top
 550      D
 551
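   A short run shows which lines survive.  This sketch assumes GNU
`sed'; the function name `uniq_u_sed' is hypothetical.

```shell
#!/bin/sh
# uniq_u_sed is a hypothetical wrapper around the script above.
uniq_u_sed() {
    sed '
$b
N
/^\(.*\)\n\1$/!{
    P
    D
}
:c
$d
s/.*\n//
N
/^\(.*\)\n\1$/ {
    bc
}
D
'
}

printf 'a\na\nb\nc\nc\nd\n' | uniq_u_sed
# prints:
# b
# d
```
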
 552 \x1f
 553 File: sed.info,  Node: cat -s,  Prev: uniq -u,  Up: Examples
 554
 555 Squeezing Blank Lines
 556 =====================
 557
 558    As a final example, here are three scripts, of increasing complexity
 559 and speed, that implement the same function as `cat -s', that is
 560 squeezing blank lines.
 561
 562    The first leaves a blank line at the beginning and end if there are
 563 some already.
 564
 565      #!/usr/bin/sed -f
 566
 567      # on empty lines, join with next
 568      # Note there is a star in the regexp
 569      :x
 570      /^\n*$/ {
 571      N
 572      bx
 573      }
 574
 575      # now, squeeze all '\n'; this can also be done by:
 576      # s/^\(\n\)*/\1/
 577      s/\n\n*/\
 578      /
 579
 580    This one is a bit more complex and removes all empty lines at the
 581 beginning.  It does leave a single blank line at end if one was there.
 582
 583      #!/usr/bin/sed -f
 584
 585      # delete all leading empty lines (`0,/RE/' is a GNU extension)
 586      0,/^./{
 587      /./!d
 588      }
 589
 590      # on an empty line we remove it and all the following
 591      # empty lines, but one
 592      :x
 593      /./!{
 594      N
 595      s/^\n$//
 596      tx
 597      }
 598
 599    This removes leading and trailing blank lines.  It is also the
 600 fastest.  Note that loops are completely done with `n' and `b', without
 601 relying on `sed' to restart the script automatically at the end of
 602 a line.
 603
 604      #!/usr/bin/sed -nf
 605
 606      # delete all (leading) blanks
 607      /./!d
 608
 609      # get here: so there is a non empty
 610      :x
 611      # print it
 612      p
 613      # get next
 614      n
 615      # got chars? print it again, etc...
 616      /./bx
 617
 618      # no, don't have chars: got an empty line
 619      :z
 620      # get next, if last line we finish here so no trailing
 621      # empty lines are written
 622      n
 623      # also empty? then ignore it, and get next... this will
 624      # remove ALL empty lines
 625      /./!bz
 626
 627      # all empty lines were deleted/ignored, but we have a non empty.  As
 628      # what we want to do is to squeeze, insert a blank line artificially
 629      i\
 630
 631      bx
 632
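   The last script can be checked on an input that has blank lines at
the beginning, in the middle and at the end.  This is a sketch assuming
GNU `sed'; the function name `squeeze_blank' is hypothetical.

```shell
#!/bin/sh
# squeeze_blank is a hypothetical wrapper around the third script above.
# The empty line after "i\" is the text that gets inserted.
squeeze_blank() {
    sed -n '
/./!d
:x
p
n
/./bx
:z
n
/./!bz
i\

bx
'
}

printf '\n\na\n\n\nb\n\n' | squeeze_blank
# prints "a", one empty line, then "b"
```

   The trailing blank line disappears because the `n' at label `z'
finds no more input and `sed' exits without printing anything.
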
 633 \x1f
 634 File: sed.info,  Node: Limitations,  Next: Other Resources,  Prev: Examples,  Up: Top
 635
 636 GNU `sed''s Limitations and Non-limitations
 637 *******************************************
 638
 639    For those who want to write portable `sed' scripts, be aware that
 640 some implementations have been known to limit line lengths (for the
 641 pattern and hold spaces) to be no more than 4000 bytes.  The POSIX
 642 standard specifies that conforming `sed' implementations shall support
 643 at least 8192 byte line lengths.  GNU `sed' has no built-in limit on
 644 line length; as long as it can `malloc()' more (virtual) memory, you
 645 can feed or construct lines as long as you like.
 646
 647    However, recursion is used to handle subpatterns and indefinite
 648 repetition.  This means that the available stack space may limit the
 649 size of the buffer that can be processed by certain patterns.
 650
 651 \x1f
 652 File: sed.info,  Node: Other Resources,  Next: Reporting Bugs,  Prev: Limitations,  Up: Top
 653
 654 Other Resources for Learning About `sed'
 655 ****************************************
 656
 657    In addition to several books that have been written about `sed'
 658 (either specifically or as chapters in books which discuss shell
 659 programming), one can find out more about `sed' (including suggestions
 660 of a few books) from the FAQ for the `sed-users' mailing list,
 661 available from any of:
 662       `http://www.student.northpark.edu/pemente/sed/sedfaq.html'
 663       `http://sed.sf.net/grabbag/tutorials/sedfaq.html'
 664
 665    Also of interest are
 666 `http://www.student.northpark.edu/pemente/sed/index.htm' and
 667 `http://sed.sf.net/grabbag', which include `sed' tutorials and other
 668 `sed'-related goodies.
 669
 670    The `sed-users' mailing list itself is maintained by Sven Guckes.  To
 671 subscribe, visit `http://groups.yahoo.com' and search for the
 672 `sed-users' mailing list.
 673
 674 \x1f
 675 File: sed.info,  Node: Reporting Bugs,  Next: Extended regexps,  Prev: Other Resources,  Up: Top
 676
 677 Reporting Bugs
 678 **************
 679
 680    Email bug reports to <bonzini@gnu.org>.  Be sure to include the word
 681 "sed" somewhere in the `Subject:' field.  Also, please include the
 682 output of `sed --version' in the body of your report if at all possible.
 683
 684    Please do not send a bug report like this:
 685
 686      while building frobme-1.3.4
 687      $ configure
 688      error--> sed: file sedscr line 1: Unknown option to 's'
 689
 690    If GNU `sed' doesn't configure your favorite package, take a few
 691 extra minutes to identify the specific problem and make a stand-alone
 692 test case.  Unlike other programs such as C compilers, making such test
 693 cases for `sed' is quite simple.
 694
 695    A stand-alone test case includes all the data necessary to perform
 696 the test, and the specific invocation of `sed' that causes the problem.
 697 The smaller a stand-alone test case is, the better.  A test case should
 698 not involve something as far removed from `sed' as "try to configure
 699 frobme-1.3.4".  Yes, that is in principle enough information to look
 700 for the bug, but that is not a very practical prospect.
 701
 702    Here are a few commonly reported bugs that are not bugs.
 703
 704 `N' command on the last line
 705      Most versions of `sed' exit without printing anything when the `N'
 706      command is issued on the last line of a file.  GNU `sed' prints
 707      pattern space before exiting unless of course the `-n' command
 708      switch has been specified.  This choice is by design.
 709
 710      For example, the behavior of
 711           sed N foo bar
 712
 713      would depend on whether foo has an even or an odd number of
 714      lines(1).  Or, when writing a script to read the next few lines
 715      following a pattern match, traditional implementations of `sed'
 716      would force you to write something like
 717           /foo/{ $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N }
 718
 719      instead of just
 720           /foo/{ N;N;N;N;N;N;N;N;N; }
 721
 722      In any case, the simplest workaround is to use `$d;N' in scripts
 723      that rely on the traditional behavior, or to set the
 724      `POSIXLY_CORRECT' variable to a non-empty value.
 725
 726 Regex syntax clashes (problems with backslashes)
 727      `sed' uses the POSIX basic regular expression syntax.  According to
 728      the standard, the meaning of some escape sequences is undefined in
 729      this syntax;  notable in the case of `sed' are `\|', `\+', `\?',
 730      `\`', `\'', `\<', `\>', `\b', `\B', `\w', and `\W'.
 731
 732      As in all GNU programs that use POSIX basic regular expressions,
 733      `sed' interprets these escape sequences as special characters.
 734      So, `x\+' matches one or more occurrences of `x'.  `abc\|def'
 735      matches either `abc' or `def'.
 736
 737      This syntax may cause problems when running scripts written for
 738      other `sed's.  Some `sed' programs have been written with the
 739      assumption that `\|' and `\+' match the literal characters `|' and
 740      `+'.  Such scripts must be modified by removing the spurious
 741      backslashes if they are to be used with modern implementations of
 742      `sed', like GNU `sed'.
 743
 744      On the other hand, some scripts use `s|abc\|def||g' to remove
 745      occurrences of _either_ `abc' or `def'.  While this worked until
 746      `sed' 4.0.x, newer versions interpret this as removing the string
 747      `abc|def'.  This is again undefined behavior according to POSIX,
 748      and this interpretation is arguably more robust: older `sed's, for
 749      example, required that the regex matcher parsed `\/' as `/' in the
 750      common case of escaping a slash, which is again undefined
 751      behavior; the new behavior avoids this, and this is good because
 752      the regex matcher is only partially under our control.
 753
 754      In addition, this version of `sed' supports several escape
 755      characters (some of which are multi-character) to insert
 756      non-printable characters in scripts (`\a', `\c', `\d', `\o', `\r',
 757      `\t', `\v', `\x').  These can cause similar problems with scripts
 758      written for other `sed's.
 759
 760 `-i' clobbers read-only files
 761      In short, `sed -i' will let you delete the contents of a read-only
 762      file, and in general the `-i' option (*note Invocation: Invoking
 763      sed.) lets you clobber protected files.  This is not a bug, but
 764      rather a consequence of how the Unix filesystem works.
 765
 766      The permissions on a file say what can happen to the data in that
 767      file, while the permissions on a directory say what can happen to
 768      the list of files in that directory.  `sed -i' will never open
 769      for writing a file that is already on disk.  Rather, it will work
 770      on a temporary file that is finally renamed to the original name:
 771      if you rename or delete files, you're actually modifying the
 772      contents of the directory, so the operation depends on the
 773      permissions of the directory, not of the file.  For this same
 774      reason, `sed' does not let you use `-i' on a writeable file in a
 775      read-only directory (but unbelievably nobody reports that as a
 776      bug...).
 777
 778 `0a' does not work (gives an error)
 779      There is no line 0.  0 is a special address that is only used to
 780      treat addresses like `0,/RE/' as active when the script starts: if
 781      you write `1,/abc/d' and the first line includes the word `abc',
 782      then that match would be ignored because address ranges must span
 783      at least two lines (barring the end of the file); but what you
 784      probably wanted is to delete every line up to the first one
 785      including `abc', and this is obtained with `0,/abc/d'.
 786
 787 `[a-z]' is case insensitive
 788      You are encountering problems with locales.  POSIX mandates that
 789      `[a-z]' uses the current locale's collation order - in C parlance,
 790      that means using `strcoll(3)' instead of `strcmp(3)'.  Some
 791      locales have a case-insensitive collation order, others don't: one
 792      of those that have problems is Estonian.
 793
 794      Another problem is that `[a-z]' tries to use collation symbols.
 795      This only happens if you are on the GNU system, using GNU libc's
 796      regular expression matcher instead of compiling the one supplied
 797      with GNU sed.  In a Danish locale, for example, the regular
 798      expression `^[a-z]$' matches the string `aa', because this is a
 799      single collating symbol that comes after `a' and before `b'; `ll'
 800      behaves similarly in Spanish locales, or `ij' in Dutch locales.
 801
 802      To work around these problems, which may cause bugs in shell
 803      scripts, set the `LC_COLLATE' and `LC_CTYPE' environment variables
 804      to `C'.
 805
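   Several of the non-bugs described above are easy to reproduce from a
shell.  The following is a sketch, not a test suite: the function names
are hypothetical, and the commands marked as GNU-specific rely on the
extensions discussed in the corresponding entries.

```shell
#!/bin/sh
# N on the last line: guarding N with $! behaves the same everywhere.
join_pairs() {
    sed '$!N;s/\n/ /'
}
printf 'a\nb\nc\n' | join_pairs              # prints "a b" then "c"

# Emulating the traditional behavior with $d;N: the odd last line is dropped.
join_pairs_traditional() {
    sed '$d;N;s/\n/ /'
}
printf 'a\nb\nc\n' | join_pairs_traditional  # prints "a b"

# GNU-specific: \+ and \| are special in basic regular expressions.
echo 'xxx abc def' | sed 's/x\+/X/; s/abc\|def/-/g'   # prints "X - -"

# GNU-specific: 0,/RE/ can end on the first line, 1,/RE/ cannot.
printf 'abc\nx\nabc\ny\n' | sed '1,/abc/d'   # prints "y"
printf 'abc\nx\nabc\ny\n' | sed '0,/abc/d'   # prints "x", "abc", "y"

# Locale problems with [a-z]: LC_ALL=C overrides LC_COLLATE and LC_CTYPE.
echo 'aB' | LC_ALL=C sed 's/[a-z]//g'        # prints "B"
```

   `LC_ALL' is used in the last command because, when it is set, it
takes precedence over the two variables named above.
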
 806    ---------- Footnotes ----------
 807
 808    (1) which is the actual "bug" that prompted the change in behavior
 809
 810 \x1f
 811 File: sed.info,  Node: Extended regexps,  Next: Concept Index,  Prev: Reporting Bugs,  Up: Top
 812
 813 Extended regular expressions
 814 ****************************
 815
 816    The only difference between basic and extended regular expressions
 817 is in the behavior of a few characters: `?', `+', parentheses, and
 818 braces (`{}').  While basic regular expressions require these to be
 819 escaped if you want them to behave as special characters, when using
 820 extended regular expressions you must escape them if you want them _to
 821 match a literal character_.
 822
 823 Examples:
 824 `abc?'
 825      becomes `abc\?' when using extended regular expressions.  It
 826      matches the literal string `abc?'.
 827
 828 `c\+'
 829      becomes `c+' when using extended regular expressions.  It matches
 830      one or more `c's.
 831
 832 `a\{3,\}'
 833      becomes `a{3,}' when using extended regular expressions.  It
 834      matches three or more `a's.
 835
 836 `\(abc\)\{2,3\}'
 837      becomes `(abc){2,3}' when using extended regular expressions.  It
 838      matches either `abcabc' or `abcabcabc'.
 839
 840 `\(abc*\)\1'
 841      becomes `(abc*)\1' when using extended regular expressions.
 842      Backreferences must still be escaped when using extended regular
 843      expressions.
 844
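   The pairs listed above can be compared directly from a shell.  This
sketch assumes GNU `sed', where extended regular expressions are
selected with `-r' and `\+' is accepted in basic ones; every command
prints `X' because the whole input matches.

```shell
#!/bin/sh
# Each pair runs the basic form first, then the extended form with -r.
echo 'abc?'     | sed    's/abc?/X/'              # ? is literal in basic syntax
echo 'abc?'     | sed -r 's/abc\?/X/'             # escaped to stay literal

echo 'ccc'      | sed    's/c\+/X/'               # GNU extension in basic syntax
echo 'ccc'      | sed -r 's/c+/X/'

echo 'aaaa'     | sed    's/a\{3,\}/X/'
echo 'aaaa'     | sed -r 's/a{3,}/X/'

echo 'abcabc'   | sed    's/\(abc\)\{2,3\}/X/'
echo 'abcabc'   | sed -r 's/(abc){2,3}/X/'

echo 'abccabcc' | sed    's/\(abc*\)\1/X/'
echo 'abccabcc' | sed -r 's/(abc*)\1/X/'          # \1 is still escaped
```
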
 845 \x1f
 846 File: sed.info,  Node: Concept Index,  Next: Command and Option Index,  Prev: Extended regexps,  Up: Top
 847
 848 Concept Index
 849 *************
 850
 851    This is a general index of all issues discussed in this manual, with
 852 the exception of the `sed' commands and command-line options.
 853
 854 * Menu:
 855
 856 * Additional reading about sed:          Other Resources.
 857 * ADDR1,+N:                              Addresses.
 858 * ADDR1,~N:                              Addresses.
 859 * Address, as a regular expression:      Addresses.
 860 * Address, last line:                    Addresses.
 861 * Address, numeric:                      Addresses.
 862 * Addresses, in sed scripts:             Addresses.
 863 * Append hold space to pattern space:    Other Commands.
 864 * Append next input line to pattern space: Other Commands.
 865 * Append pattern space to hold space:    Other Commands.
 866 * Appending text after a line:           Other Commands.
 867 * Backreferences, in regular expressions: The "s" Command.
 868 * Branch to a label, if s/// failed:     Extended Commands.
 869 * Branch to a label, if s/// succeeded:  Programming Commands.
 870 * Branch to a label, unconditionally:    Programming Commands.
 871 * Buffer spaces, pattern and hold:       Execution Cycle.
 872 * Bugs, reporting:                       Reporting Bugs.
 873 * Case-insensitive matching:             The "s" Command.
 874 * Caveat -- #n on first line:            Common Commands.
 875 * Command groups:                        Common Commands.
 876 * Comments, in scripts:                  Common Commands.
 877 * Conditional branch <1>:                Extended Commands.
 878 * Conditional branch:                    Programming Commands.
 879 * Copy hold space into pattern space:    Other Commands.
 880 * Copy pattern space into hold space:    Other Commands.
 881 * Delete first line from pattern space:  Other Commands.
 882 * Disabling autoprint, from command line: Invoking sed.
 883 * empty regular expression:              Addresses.
 884 * Evaluate Bourne-shell commands:        Extended Commands.
 885 * Evaluate Bourne-shell commands, after substitution: The "s" Command.
 886 * Exchange hold space with pattern space: Other Commands.
 887 * Excluding lines:                       Addresses.
 888 * Extended regular expressions, choosing: Invoking sed.
 889 * Extended regular expressions, syntax:  Extended regexps.
 890 * Files to be processed as input:        Invoking sed.
 891 * Flow of control in scripts:            Programming Commands.
 892 * Global substitution:                   The "s" Command.
 893 * GNU extensions, /dev/stderr file <1>:  The "s" Command.
 894 * GNU extensions, /dev/stderr file:      Other Commands.
 895 * GNU extensions, /dev/stdin file <1>:   Other Commands.
 896 * GNU extensions, /dev/stdin file:       Extended Commands.
 897 * GNU extensions, /dev/stdout file <1>:  Invoking sed.
 898 * GNU extensions, /dev/stdout file <2>:  The "s" Command.
 899 * GNU extensions, /dev/stdout file:      Other Commands.
 900 * GNU extensions, 0 address:             Addresses.
 901 * GNU extensions, 0,ADDR2 addressing:    Addresses.
 902 * GNU extensions, ADDR1,+N addressing:   Addresses.
 903 * GNU extensions, ADDR1,~N addressing:   Addresses.
 904 * GNU extensions, branch if s/// failed: Extended Commands.
 905 * GNU extensions, case modifiers in s commands: The "s" Command.
 906 * GNU extensions, checking for their presence: Extended Commands.
 907 * GNU extensions, disabling:             Invoking sed.
 908 * GNU extensions, evaluating Bourne-shell commands <1>: Extended Commands.
 909 * GNU extensions, evaluating Bourne-shell commands: The "s" Command.
 910 * GNU extensions, extended regular expressions: Invoking sed.
 911 * GNU extensions, g and NUMBER modifier interaction in s command: The "s" Command.
 912 * GNU extensions, I modifier <1>:        Addresses.
 913 * GNU extensions, I modifier:            The "s" Command.
 914 * GNU extensions, in-place editing <1>:  Reporting Bugs.
 915 * GNU extensions, in-place editing:      Invoking sed.
 916 * GNU extensions, L command:             Extended Commands.
 917 * GNU extensions, M modifier:            The "s" Command.
 918 * GNU extensions, modifiers and the empty regular expression: Addresses.
 919 * GNU extensions, N~M addresses:         Addresses.
 920 * GNU extensions, quitting silently:     Extended Commands.
 921 * GNU extensions, R command:             Extended Commands.
 922 * GNU extensions, reading a file a line at a time: Extended Commands.
 923 * GNU extensions, reformatting paragraphs: Extended Commands.
 924 * GNU extensions, returning an exit code <1>: Common Commands.
 925 * GNU extensions, returning an exit code: Extended Commands.
 926 * GNU extensions, setting line length:   Other Commands.
 927 * GNU extensions, special escapes <1>:   Reporting Bugs.
 928 * GNU extensions, special escapes:       Escapes.
 929 * GNU extensions, special two-address forms: Addresses.
 930 * GNU extensions, subprocesses <1>:      The "s" Command.
 931 * GNU extensions, subprocesses:          Extended Commands.
 932 * GNU extensions, to basic regular expressions <1>: Reporting Bugs.
 933 * GNU extensions, to basic regular expressions: Regular Expressions.
 934 * GNU extensions, two addresses supported by most commands: Other Commands.
 935 * GNU extensions, unlimited line length: Limitations.
 936 * GNU extensions, writing first line to a file: Extended Commands.
 937 * Goto, in scripts:                      Programming Commands.
 938 * Greedy regular expression matching:    Regular Expressions.
 939 * Grouping commands:                     Common Commands.
 940 * Hold space, appending from pattern space: Other Commands.
 941 * Hold space, appending to pattern space: Other Commands.
 942 * Hold space, copy into pattern space:   Other Commands.
 943 * Hold space, copying pattern space into: Other Commands.
 944 * Hold space, definition:                Execution Cycle.
 945 * Hold space, exchange with pattern space: Other Commands.
 946 * In-place editing:                      Reporting Bugs.
 947 * In-place editing, activating:          Invoking sed.
 948 * In-place editing, Perl-style backup file names: Invoking sed.
 949 * Inserting text before a line:          Other Commands.
 950 * Labels, in scripts:                    Programming Commands.
 951 * Last line, selecting:                  Addresses.
 952 * Line length, setting <1>:              Invoking sed.
 953 * Line length, setting:                  Other Commands.
 954 * Line number, printing:                 Other Commands.
 955 * Line selection:                        Addresses.
 956 * Line, selecting by number:             Addresses.
 957 * Line, selecting by regular expression match: Addresses.
 958 * Line, selecting last:                  Addresses.
 959 * List pattern space:                    Other Commands.
 960 * Mixing g and NUMBER modifiers in the s command: The "s" Command.
 961 * Next input line, append to pattern space: Other Commands.
 962 * Next input line, replace pattern space with: Common Commands.
 963 * Non-bugs, in-place editing:            Reporting Bugs.
 964 * Non-bugs, N command on the last line:  Reporting Bugs.
 965 * Non-bugs, regex syntax clashes:        Reporting Bugs.
 966 * Parenthesized substrings:              The "s" Command.
 967 * Pattern space, definition:             Execution Cycle.
 968 * Perl-style regular expressions, multiline: Addresses.
 969 * Portability, comments:                 Common Commands.
 970 * Portability, line length limitations:  Limitations.
 971 * Portability, N command on the last line: Reporting Bugs.
 972 * POSIXLY_CORRECT behavior, bracket expressions: Regular Expressions.
 973 * POSIXLY_CORRECT behavior, enabling:    Invoking sed.
 974 * POSIXLY_CORRECT behavior, escapes:     Escapes.
 975 * POSIXLY_CORRECT behavior, N command:   Reporting Bugs.
 976 * Print first line from pattern space:   Other Commands.
 977 * Printing line number:                  Other Commands.
 978 * Printing text unambiguously:           Other Commands.
 979 * Quitting <1>:                          Extended Commands.
 980 * Quitting:                              Common Commands.
 981 * Range of lines:                        Addresses.
 982 * Range with start address of zero:      Addresses.
 983 * Read next input line:                  Common Commands.
 984 * Read text from a file <1>:             Extended Commands.
 985 * Read text from a file:                 Other Commands.
 986 * Reformat pattern space:                Extended Commands.
 987 * Reformatting paragraphs:               Extended Commands.
 988 * Replace hold space with copy of pattern space: Other Commands.
 989 * Replace pattern space with copy of hold space: Other Commands.
 990 * Replacing all text matching regexp in a line: The "s" Command.
 991 * Replacing only Nth match of regexp in a line: The "s" Command.
 992 * Replacing selected lines with other text: Other Commands.
 993 * Requiring GNU sed:                     Extended Commands.
 994 * Script structure:                      sed Programs.
 995 * Script, from a file:                   Invoking sed.
 996 * Script, from command line:             Invoking sed.
 997 * sed program structure:                 sed Programs.
 998 * Selecting lines to process:            Addresses.
 999 * Selecting non-matching lines:          Addresses.
1000 * Several lines, selecting:              Addresses.
1001 * Slash character, in regular expressions: Addresses.
1002 * Spaces, pattern and hold:              Execution Cycle.
1003 * Special addressing forms:              Addresses.
1004 * Standard input, processing as input:   Invoking sed.
1005 * Stream editor:                         Introduction.
1006 * Subprocesses <1>:                      Extended Commands.
1007 * Subprocesses:                          The "s" Command.
1008 * Substitution of text, options:         The "s" Command.
1009 * Text, appending:                       Other Commands.
1010 * Text, deleting:                        Common Commands.
1011 * Text, insertion:                       Other Commands.
1012 * Text, printing:                        Common Commands.
1013 * Text, printing after substitution:     The "s" Command.
1014 * Text, writing to a file after substitution: The "s" Command.
1015 * Transliteration:                       Other Commands.
1016 * Unbuffered I/O, choosing:              Invoking sed.
1017 * Usage summary, printing:               Invoking sed.
1018 * Version, printing:                     Invoking sed.
1019 * Working on separate files:             Invoking sed.
1020 * Write first line to a file:            Extended Commands.
1021 * Write to a file:                       Other Commands.
1022 * Zero, as range start address:          Addresses.
1023
1024 \x1f
1025 File: sed.info,  Node: Command and Option Index,  Prev: Concept Index,  Up: Top
1026
1027 Command and Option Index
1028 ************************
1029
1030    This is an alphabetical list of all `sed' commands and command-line
1031 options.
1032
1033 * Menu:
1034
1035 * # (comments):                          Common Commands.
1036 * --expression:                          Invoking sed.
1037 * --file:                                Invoking sed.
1038 * --help:                                Invoking sed.
1039 * --in-place:                            Invoking sed.
1040 * --line-length:                         Invoking sed.
1041 * --quiet:                               Invoking sed.
1042 * --regexp-extended:                     Invoking sed.
1043 * --silent:                              Invoking sed.
1044 * --unbuffered:                          Invoking sed.
1045 * --version:                             Invoking sed.
1046 * -e:                                    Invoking sed.
1047 * -f:                                    Invoking sed.
1048 * -i:                                    Invoking sed.
1049 * -l:                                    Invoking sed.
1050 * -n:                                    Invoking sed.
1051 * -n, forcing from within a script:      Common Commands.
1052 * -r:                                    Invoking sed.
1053 * -u:                                    Invoking sed.
1054 * : (label) command:                     Programming Commands.
1055 * = (print line number) command:         Other Commands.
1056 * a (append text lines) command:         Other Commands.
1057 * b (branch) command:                    Programming Commands.
1058 * c (change to text lines) command:      Other Commands.
1059 * D (delete first line) command:         Other Commands.
1060 * d (delete) command:                    Common Commands.
1061 * e (evaluate) command:                  Extended Commands.
1062 * G (appending Get) command:             Other Commands.
1063 * g (get) command:                       Other Commands.
1064 * H (append Hold) command:               Other Commands.
1065 * h (hold) command:                      Other Commands.
1066 * i (insert text lines) command:         Other Commands.
1067 * L (fLow paragraphs) command:           Extended Commands.
1068 * l (list unambiguously) command:        Other Commands.
1069 * N (append Next line) command:          Other Commands.
1070 * n (next-line) command:                 Common Commands.
1071 * P (print first line) command:          Other Commands.
1072 * p (print) command:                     Common Commands.
1073 * q (quit) command:                      Common Commands.
1074 * Q (silent Quit) command:               Extended Commands.
1075 * r (read file) command:                 Other Commands.
1076 * R (read line) command:                 Extended Commands.
1077 * s command, option flags:               The "s" Command.
1078 * T (test and branch if failed) command: Extended Commands.
1079 * t (test and branch if successful) command: Programming Commands.
1080 * v (version) command:                   Extended Commands.
1081 * w (write file) command:                Other Commands.
1082 * W (write first line) command:          Extended Commands.
1083 * x (eXchange) command:                  Other Commands.
1084 * y (transliterate) command:             Other Commands.
1085 * {} command grouping:                   Common Commands.
1086
1087