1 .\ Format this file with:
2 .\ pic file | tbl | troff -ms
4 .\ PS and PE center pic diagrams. (The corresponding ms-macros may not.)
6 .nr pE (\\n(.lu-\\$2u)/2u
29 RCS \- A System for Version Control
34 Department of Computer Sciences
36 West Lafayette, Indiana 47907
39 An important problem in program development and maintenance is version control,
40 i.e., the task of keeping a software system consisting of many versions and
41 configurations well organized.
42 The Revision Control System (RCS)
43 is a software tool that assists with that task.
44 RCS manages revisions of text documents, in particular source programs,
45 documentation, and test data.
46 It automates the storing, retrieval, logging and identification of revisions,
47 and it provides selection mechanisms for composing configurations.
48 This paper introduces basic version control concepts and
49 discusses the practice of version control
51 For conserving space, RCS stores deltas, i.e., differences between
52 successive revisions. Several delta storage methods are discusses.
53 Usage statistics show that RCS's delta storage method is
54 space and time efficient.
55 The paper concludes with a detailed survey of version control tools.
57 \fBKeywords\fR: configuration management, history management,
58 version control, revisions, deltas.
61 A version of this paper was published in Software--Practice and Experience,
62 Vol. 15(7), 637-654 (July 1985).
72 Version control is the task of keeping software
73 systems consisting of many versions and configurations well organized.
74 The Revision Control System (RCS) is a set of UNIX
75 commands that assist with that task.
77 RCS' primary function is to manage \fIrevision groups\fR.
78 A revision group is a set of text documents, called \fIrevisions\fR,
79 that evolved from each other. A new revision is
80 created by manually editing an existing one.
81 RCS organizes the revisions into an ancestral tree. The initial revision
82 is the root of the tree, and the tree edges indicate
83 from which revision a given one evolved.
84 Besides managing individual revision groups, RCS provides
85 flexible selection functions for composing configurations.
86 RCS may be combined with MAKE\u1\d,
87 resulting in a powerful package for version control.
89 RCS also offers facilities for
90 merging updates with customer modifications,
91 for distributed software development, and
92 for automatic identification.
93 Identification is the `stamping'
94 of revisions and configurations with unique markers.
95 These markers are akin to serial numbers,
96 telling software maintainers unambiguously which configuration
99 RCS is designed for both production and experimental
101 In production environments,
102 access controls detect update conflicts and prevent overlapping changes.
103 In experimental environments, where strong controls are
104 counterproductive, it is possible to loosen the controls.
106 Although RCS was originally intended for programs, it is useful for any
107 text that is revised frequently and whose previous revisions must be
108 preserved. RCS has been applied successfully to store the source
109 text for drawings, VLSI layouts, documentation, specifications,
110 test data, form letters and articles.
112 This paper discusses the practice of
113 version control using RCS.
114 It also introduces basic version control concepts,
115 useful for clarifying current practice and designing similar systems.
116 Revision groups of individual components are treated in the next three sections,
117 and the extensions to configurations follow.
118 Because of its size, a survey of version control tools
119 appears at the end of the paper.
121 Getting started with RCS
123 Suppose a text file \fIf.c\fR is to be placed under control of RCS.
124 Invoking the check-in command
128 creates a new revision group with the contents of
129 \fIf.c\fR as the initial
130 revision (numbered 1.1)
131 and stores the group into the file \fIf.c,v\fR.
132 Unless told otherwise, the command deletes \fIf.c\fR.
133 It also asks for a description of the group.
134 The description should state the common purpose of all revisions in the group,
135 and becomes part of the group's documentation.
136 All later check-in commands will ask for a log entry,
137 which should summarize the changes made.
138 (The first revision is assigned a default log message,
139 which just records the fact that it is the initial revision.)
141 Files ending in \fI,v\fR
142 are called \fIRCS files\fR (\fIv\fR stands for \fIv\fRersions);
143 the others are called working files.
144 To get back the working file \fIf.c\fR in the previous example,
145 execute the check-out command:
150 This command extracts the latest revision from
151 the revision group \fIf.c,v\fR and writes
153 The file \fIf.c\fR can now be edited and, when finished,
154 checked back in with \fIci\fR:
158 \fICi\fR assigns number 1.2 to
160 If \fIci\fR complains with the message
162 ci error: no lock set by <login>
164 then the system administrator has decided to configure RCS for a
165 production environment by enabling the `strict locking feature'.
166 If this feature is enabled, all RCS files are initialized
167 such that check-in operations require a lock on the previous revision
168 (the one from which the current one evolved).
169 Locking prevents overlapping modifications if several people work on the same file.
170 If locking is required, the revision should
171 have been locked during the check-out by using
176 Of course it is too late now for the check-out with locking, because
177 \fIf.c\fR has already been changed; checking out the file again
178 would overwrite the modifications.
179 (To prevent accidental overwrites, \fIco\fR senses the presence
180 of a working file and asks whether the user really intended to overwrite it.
181 The overwriting check-out is sometimes useful for
182 backing up to the previous revision.)
183 To be able to proceed with the check-in in the present case, first execute
187 This command retroactively locks the latest revision, unless someone
188 else locked it in the meantime. In this case, the two programmers
189 involved have to negotiate whose
190 modifications should take precedence.
192 If an RCS file is private, i.e., if only the owner of the file is expected
193 to deposit revisions into it, the strict locking feature is unnecessary and
195 If strict locking is disabled,
196 the owner of the RCS file need not have a lock for check-in.
197 For safety reasons, all others
198 still do. Turning strict locking off and on is done with the commands:
200 rcs -U f.c \fRand\fI rcs -L f.c
202 These commands enable or disable the strict locking feature for each RCS file
204 The system administrator only decides whether strict locking is
207 To reduce the clutter in a working directory, all RCS files can be moved
208 to a subdirectory with the name \fIRCS\fR.
209 RCS commands look first into that directory for RCS files.
210 All the commands presented above work
211 with the \fIRCS\fR subdirectory without change.\(dg
213 \(dg Pairs of RCS and working files can actually be specified in 3 ways:
214 a) both are given, b) only the working file is given, c) only the
216 If a pair is given, both files may have arbitrary path prefixes;
217 RCS commands pair them up intelligently.
220 It may be undesirable that \fIci\fR deletes the working file.
221 For instance, sometimes one would like to save the current revision,
222 but continue editing.
227 checks in \fIf.c\fR as usual, but performs an additional
228 check-out with locking afterwards. Thus, the working file does
229 not disappear after the check-in.
230 Similarly, the option
231 \fI-u\fR does a check-in followed by a check-out without
232 locking. This option is useful if the file is needed for compilation after the check-in.
233 Both options update the identification markers in the working file
236 Besides the operations \fIci\fR and \fIco\fR, RCS provides the following
238 \fIident\fR (extract identification markers),
239 \fIrcs\fR (change RCS file attributes),
240 \fIrcsclean\fR (remove unchanged working files),
241 \fIrcsdiff\fR (compare revisions),
242 \fIrcsfreeze\fR (record a configuration),
243 \fIrcsmerge\fR (merge revisions),
244 and \fIrlog\fR (read log messages and other information in RCS files).
245 A synopsis of these commands appears in the Appendix.
247 Automatic Identification
249 RCS can stamp source and object code with special identification strings,
250 similar to product and serial numbers.
251 To obtain such identification, place the marker
253 $Header: /pub/NetBSD/misc/repositories/cvsroot/src/usr.bin/rcs/doc/Attic/rcs.ms,v 1.1 1993/03/21 09:58:02 cgd Exp $
255 into the text of a revision, for instance inside a comment.
256 The check-out operation will replace this marker with a string of the form
258 $Header: /pub/NetBSD/misc/repositories/cvsroot/src/usr.bin/rcs/doc/Attic/rcs.ms,v 1.1 1993/03/21 09:58:02 cgd Exp $
260 This string need never be touched, because \fIco\fR keeps it
261 up to date automatically.
262 To propagate the marker into object code, simply put
263 it into a literal character string. In C, this is done as follows:
265 static char rcsid[] = "$Header: /pub/NetBSD/misc/repositories/cvsroot/src/usr.bin/rcs/doc/Attic/rcs.ms,v 1.1 1993/03/21 09:58:02 cgd Exp $";
267 The command \fIident\fR extracts such markers from any file, in particular from
269 \fIIdent\fR helps to find out
270 which revisions of which modules were used in a given program.
271 It returns a complete and unambiguous component list,
272 from which a copy of the program can be reconstructed.
273 This facility is invaluable for program maintenance.
275 There are several additional identification markers, one for each component
276 of $Header: /pub/NetBSD/misc/repositories/cvsroot/src/usr.bin/rcs/doc/Attic/rcs.ms,v 1.1 1993/03/21 09:58:02 cgd Exp $.
281 has a similar function. It accumulates
282 the log messages that are requested during check-in.
283 Thus, one can maintain the complete history of a revision directly inside it,
284 by enclosing it in a comment.
285 Figure 1 is a partial reproduction of a log contained in revision 4.1 of
286 the file \fIci.c\fR. The log appears at the beginning of the file,
287 and makes it easy to determine what the recent modifications were.
295 * Revision 4.1 83/05/10 17:03:06 wft
296 * Added option -d and -w, and updated assignment of date, etc. to new delta.
297 * Added handling of default branches.
299 * Revision 3.9 83/02/15 15:25:44 wft
300 * Added call to fastcopy() to copy remainder of RCS file.
302 * Revision 3.8 83/01/14 15:34:05 wft
303 * Added ignoring of interrupts while new RCS file is renamed;
304 * avoids deletion of RCS files by interrupts.
306 * Revision 3.7 82/12/10 16:09:20 wft
307 * Corrected checking of return code from diff.
308 * An RCS file now inherits its mode during the first ci from the working file,
309 * except that write permission is removed.
313 Figure 1. Log entries produced by the marker $Log$.
319 Since revisions are stored in the form of differences,
321 physically stored once,
322 independent of the number of revisions present.
323 Thus, the $Log$ marker incurs negligible space overhead.
325 The RCS Revision Tree
327 RCS arranges revisions in an ancestral tree.
328 The \fIci\fR command builds this tree; the auxiliary command \fIrcs\fR
330 The tree has a root revision, normally numbered 1.1, and successive revisions
331 are numbered 1.2, 1.3, etc. The first field of a revision number
332 is called the \fIrelease number\fR and the second one
333 the \fIlevel number\fR. Unless given explicitly,
334 the \fIci\fR command assigns a new revision number
335 by incrementing the level number of the previous revision.
336 The release number must be incremented explicitly, using the
337 \fI-r\fR option of \fIci\fR.
338 Assuming there are revisions 1.1, 1.2, and 1.3 in the RCS file f.c,v, the command
340 ci -r2.1 f.c \fRor\fI ci -r2 f.c
342 assigns the number 2.1 to the new revision.
343 Later check-ins without the \fI-r\fR option will assign the numbers 2.2, 2.3,
345 The release number should be incremented only at major transition points
346 in the development, for instance when a new release of a software product has
349 When are branches needed?
351 A young revision tree is slender:
352 It consists of only one branch, called the trunk.
353 As the tree ages, side branches may form.
354 Branches are needed in the following 4 situations.
355 .IP "\fITemporary fixes\fR"
357 Suppose a tree has 5 revisions grouped in 2 releases,
358 as illustrated in Figure 2.
359 Revision 1.3, the last one of release 1, is in operation at customer sites,
360 while release 2 is in active development.
377 Figure 2. A slender revision tree.
379 Now imagine a customer requesting a fix of
380 a problem in revision 1.3, although actual development has moved on
381 to release 2. RCS does not permit an extra
382 revision to be spliced in between 1.3 and 2.1, since that would not reflect
383 the actual development history. Instead, create a branch
384 at revision 1.3, and check in the fix on that branch.
385 The first branch starting at 1.3 has number 1.3.1, and
386 the revisions on that branch are numbered 1.3.1.1, 1.3.1.2, etc.
387 The double numbering is needed to allow for another
388 branch at 1.3, say 1.3.2.
389 Revisions on the second branch would be numbered
390 1.3.2.1, 1.3.2.2, and so on.
391 The following steps create
392 branch 1.3.1 and add revision 1.3.1.1:
400 %co -r1.3 f.c % -- check out revision 1.3
401 %edit f.c % -- change it
402 %ci -r1.3.1 f.c % -- check it in on branch 1.3.1
407 This sequence of commands transforms the tree of Figure 2 into
409 Note that it may be necessary to incorporate the differences
410 between 1.3 and 1.3.1.1
411 into a revision at level 2. The operation \fIrcsmerge\fR automates this
412 process (see the Appendix).
426 line invis down from R21.s
428 arrow dashed right from RB1.e
429 arrow from R13.s to RB1.w
433 Figure 3. A revision tree with one side branch
435 .IP "\fIDistributed development and customer modifications\fR"
437 Assume a situation as in Figure 2, where revision 1.3 is in operation
438 at several customer sites,
439 while release 2 is in development.
440 Customer sites should use RCS to store the distributed software.
441 However, customer modifications should not be placed on the same branch
442 as the distributed source; instead, they should be placed on a side branch.
443 When the next software distribution arrives,
444 it should be appended to the trunk of
445 the customer's RCS file, and the customer
446 can then merge the local modifications back into the new release.
447 In the above example, a
448 customer's RCS file would contain the following tree, assuming
449 that the customer has received revision 1.3, added his local modifications
450 as revision 1.3.1.1, then received revision 2.4, and merged
451 2.4 and 1.3.1.1, resulting in 2.4.1.1.
465 arrow from R13.e to R24.w
466 line invis down from R21.s
468 arrow from R13.s to RB1.w
470 line invis down from R25.s
472 arrow from R24.s to RB2.w
476 Figure 4. A customer's revision tree with local modifications.
478 This approach is actually practiced in the CSNET project,
479 where several universities and a company cooperate
480 in developing a national computer network.
481 .IP "\fIParallel development\fR"
483 Sometimes it is desirable to explore an alternate design or
484 a different implementation technique in parallel with the
485 main line development. Such development
486 should be carried out on a side branch.
487 The experimental changes may later be moved into the main line, or abandoned.
488 .IP "\fIConflicting updates\fR"
490 A common occurrence is that one programmer
491 has checked out a revision, but cannot complete the assignment
492 for some reason. In the meantime, another person
493 must perform another modification
494 immediately. In that case, the second person should check-out the same revision,
495 modify it, and check it in on a side branch, for later merging.
497 Every node in a revision tree consists of the following attributes:
498 a revision number, a check-in date and time, the author's identification,
499 a log entry, a state and the actual text. All these attributes
500 are determined at the time the revision is checked in.
501 The state attribute indicates the status of a revision.
502 It is set automatically to `experimental' during check-in.
503 A revision can later be promoted to a higher status, for example
504 `stable' or `released'. The set of states is user-defined.
506 Revisions are represented as deltas
508 For conserving space, RCS stores revisions in the form
509 of deltas, i.e., as differences between revisions.
510 The user interface completely hides this fact.
512 A delta is a sequence of edit commands that transforms one string
513 into another. The deltas employed by RCS are line-based, which means
514 that the only edit commands allowed are insertion and deletion of lines.
515 If a single character in a line is changed, the
516 edit scripts consider the entire line changed.
517 The program \fIdiff\fR\u2\d
518 produces a small, line-based delta between pairs of text files.
519 A character-based edit script would take much longer to compute,
520 and would not be significantly shorter.
522 Using deltas is a classical space-time tradeoff: deltas reduce the
523 space consumed, but increase access time.
524 However, a version control tool should impose as little delay
525 as possible on programmers.
526 Excessive delays discourage the use of version controls,
527 or induce programmers to take shortcuts that compromise system integrity.
528 To gain reasonably fast access time for both editing and compiling,
529 RCS arranges deltas in the following way.
530 The most recent revision on the trunk is stored intact.
531 All other revisions on the trunk are stored as reverse deltas.
532 A reverse delta describes how to go backward in the development history:
533 it produces the desired revision if applied to the successor of that revision.
534 This implementation has the advantage
535 that extraction of the latest revision is a simple and fast copy
537 Adding a new revision to the trunk is also fast: \fIci\fR simply
538 adds the new revision intact, replaces the previous
539 revision with a reverse delta, and keeps the rest of the old deltas.
540 Thus, \fIci\fR requires the computation
541 of only one new delta.
543 Branches need special treatment. The naive solution would be to
544 store complete copies for the tips of all branches.
545 Clearly, this approach would cost too much space. Instead,
546 RCS uses \fIforward\fR deltas for branches. Regenerating a revision
547 on a side branch proceeds as follows. First, extract the latest revision
548 on the trunk; secondly, apply reverse deltas until the fork revision for
549 the branch is obtained; thirdly, apply forward deltas until the desired
550 branch revision is reached. Figure 5 illustrates a tree with
551 one side branch. Triangles pointing to the left and right represent
552 reverse and forward deltas, respectively.
556 define BD X [line invis $1 right .5;
557 line up .3 then left .5 down .3 then right .5 down .3 then up .3] X
559 define FD X [line invis $1 right .5;
560 line left .5 down .3 then up .6 then right .5 down .3;] X
564 arrow right from D11.e
566 arrow right from D12.e
568 arrow right from D13.e
570 arrow right from D21.e
572 line invis down from D21.s
574 arrow from D13.se to F1.w
575 arrow from F1.e right
581 Figure 5. A revision tree with reverse and forward deltas.
584 Although implementing fast check-out for the latest trunk revision,
585 this arrangement has the disadvantage that generation of other revisions
586 takes time proportional to the number of deltas applied. For example,
587 regenerating the branch tip in Figure 5 requires application of five
588 deltas (including the initial one). Since usage statistics show that
589 the latest trunk revision is the one that is retrieved in 95 per cent
590 of all cases (see the section on usage statistics), biasing check-out time
591 in favor of that revision results in significant savings.
592 However, careful implementation of the delta application process is
593 necessary to provide low retrieval overhead for other revisions, in
594 particular for branch tips.
596 There are several techniques for delta application.
597 The naive one is to pass each delta to a general-purpose text editor.
598 A prototype of RCS invoked the UNIX editor \fIed\fR both
599 for applying deltas and for expanding the identification markers.
600 Although easy to implement, performance was poor, owing to the
601 high start-up costs and excess generality of \fIed\fR. An intermediate
602 version of RCS used a special-purpose, stream-oriented editor.
603 This technique reduced the cost of applying a delta to the cost of
604 checking out the latest trunk revision. The reason for this behavior
605 is that each delta application involves a complete pass over
606 the preceding revision.
608 However, there is a much better algorithm. Note that the deltas are
609 line oriented and that most of the work of a stream editor involves
610 copying unchanged lines from one revision to the next. A faster
611 algorithm avoids unnecessary copying of character strings by using
613 A piece table is a one-dimensional array, specifying how a given
614 revision is `pieced together' from lines in the RCS file.
615 Suppose piece table \fIPT\dr\u\fR represents revision \fIr\fR.
616 Then \fIPT\dr\u[i]\fR contains the starting position of line \fIi\fR
618 Application of the next delta transforms piece table \fIPT\dr\u\fR
619 into \fIPT\dr+1\u\fR. For instance, a delete command removes a
620 series of entries from the piece table. An insertion command inserts
621 new entries, moving the entries following the insertion point further down the
622 array. The inserted entries point to the text lines in the delta.
623 Thus, no I/O is involved except for reading the delta itself. When all
624 deltas have been applied to the piece table, a sequential pass
625 through the table looks up each line in the RCS file and copies it to
626 the output file, updating identification markers at the same time.
627 Of course, the RCS file must permit random access, since the copied
628 lines are scattered throughout that file. Figure 6 illustrates an
629 RCS file with two revisions and the corresponding piece tables.
633 Figure 6. An RCS file and its piece tables
636 The piece table approach has the property that the time for applying a single
637 delta is roughly determined by the size of the delta, and not by the
638 size of the revision. For example, if a delta is
639 10 per cent of the size of a revision, then applying it takes only
640 10 per cent of the time to generate the latest trunk revision. (The stream
641 editor would take 100 per cent.)
643 There is an important alternative for representing deltas that affects
644 performance. SCCS\u3\d,
645 a precursor of RCS, uses \fIinterleaved\fR deltas.
646 A file containing interleaved deltas is partitioned into blocks of lines.
647 Each block has a header that specifies to which revision(s) the block
648 belongs. The blocks are sorted out in such a way that a single
649 pass over the file can pick up all the lines belonging to a given
650 revision. Thus, the regeneration time for all revisions is the same:
651 all headers must be inspected, and the associated blocks either copied
652 or skipped. As the number of revisions increases, the cost of retrieving
653 any revision is much higher than the cost of checking out the
654 latest trunk revision with reverse deltas. A detailed comparison
655 of SCCS's interleaved deltas and RCS's reverse deltas can be found
657 This reference considers the version of RCS with the
658 stream editor only. The piece table method improves performance
659 further, so that RCS is always faster than SCCS, except if 10
660 or more deltas are applied.
662 Additional speed-up for both delta methods can be obtained by caching
663 the most recently generated revision, as has been implemented in DSEE.\u5\d
664 With caching, access time to frequently used revisions can approach normal file
665 access time, at the cost of some additional space.
667 Locking: A Controversial Issue
669 The locking mechanism for RCS was difficult to design.
670 The problem and its solution are first presented in their `pure' form,
671 followed by a discussion of the complications
672 caused by `real-world' considerations.
674 RCS must prevent two or more persons from depositing competing changes of the
676 Suppose two programmers check out revision 2.4 and
677 modify it. Programmer A checks in a revision before programmer B.
678 Unfortunately, programmer B has not seen A's
679 changes, so the effect is that A's changes are covered up by B's deposit.
680 A's changes are not lost since all revisions
681 are saved, but they are confined to a single revision\(dd.
683 \(dd Note that this problem is entirely different from the atomicity problem.
685 concurrent update operations on the same RCS file cannot be permitted,
686 because that may result in inconsistent data.
687 Atomic updates are essential (and implemented in RCS),
688 but do not solve the conflict discussed here.
691 This conflict is prevented in RCS by locking.
692 Whenever someone intends to edit a revision (as opposed
693 to reading or compiling it), the revision should be checked out
695 using the \fI-l\fR option on \fIco\fR. On subsequent check-in,
696 \fIci\fR tests the lock and then removes it.
697 At most one programmer at a time may
698 lock a particular revision, and only this programmer may check in
699 the succeeding revision.
700 Thus, while a revision is locked, it is the exclusive responsibility
703 An important maxim for software tools like RCS is that they must
704 not stand in the way of making progress with a project.
705 This consideration leads to several weakenings of the locking mechanism.
706 First of all, even if a revision is locked, it can
707 still be checked out. This is necessary if other people
708 wish to compile or inspect the locked revision
709 while the next one is in preparation. The only operations they
710 cannot do are to lock the revision or to check in the succeeding one. Secondly,
711 check-in operations on other branches in the RCS file are still possible; the
712 locking of one revision does not affect any other revision.
713 Thirdly, revisions are occasionally locked for a long period of time
714 because a programmer is absent or otherwise unable to complete
715 the assignment. If another programmer has to make a pressing change,
716 there are the following three alternatives for making progress:
717 a) find out who is holding the lock and ask that person to release it;
718 b) check out the locked revision, modify it, check it
719 in on a branch, and merge the changes later;
720 c) break the lock. Breaking a lock leaves a highly visible
721 trace, namely an electronic mail message that is sent automatically to the
722 holder of the lock, recording the breaker and a commentary requested from him.
723 Thus, breaking locks is tolerated under certain circumstances,
724 but will not go unnoticed.
725 Experience has shown that the automatic mail message attaches a high enough
726 stigma to lock breaking,
727 such that programmers break locks only in real emergencies,
728 or when a co-worker resigns and leaves locked revisions behind.
730 If an RCS file is private, i.e., when a programmer owns an RCS file
731 and does not expect anyone else to perform check-in operations,
732 locking is an unnecessary nuisance.
734 the `strict locking feature' discussed earlier may be disabled,
735 provided that file protection
736 is set such that only the owner may write the RCS file.
737 This has the effect that only the owner can check-in revisions,
738 and that no lock is needed for doing so.
741 each RCS file contains an access list that specifies the users
742 who may execute update operations. If an access list is empty,
743 only normal UNIX file protection applies. Thus, the access list is
744 useful for restricting the set of people who would otherwise have update
745 permission. Just as with locking, the access list
746 has no effect on read-only operations such as \fIco\fR. This approach
747 is consistent with the UNIX philosophy of openness, which contributes
748 to a productive software development environment.
750 Configuration Management
752 The preceding sections described how RCS deals with revisions of individual
753 components; this section discusses how to handle configurations.
754 A configuration is a set of revisions, where each revision comes
755 from a different revision group, and the revisions are selected
756 according to a certain criterion.
758 in order to build a functioning compiler, the `right'
759 revisions from the scanner, the parser, the optimizer
760 and the code generator must be combined.
761 RCS, in conjunction with MAKE,
762 provides a number of facilities to effect a smooth selection.
764 RCS Selection Functions
766 .IP "\fIDefault selection\fR"
768 During development, the usual selection criterion is to choose
769 the latest revision of all components. The \fIco\fR command
770 makes this selection by default. For example, the command
774 retrieves the latest revision on the default branch of each RCS file
775 in the current directory.
776 The default branch is usually the trunk, but may be
777 set to be a side branch.
778 Side branches as defaults are needed in distributed software development,
779 as discussed in the section on the RCS revision tree.
781 .IP "\fIRelease based selection\fR"
783 Specifying a release or branch number selects the latest revision in
784 that release or branch.
789 retrieves the latest revision with release number 2 from each RCS file.
790 This selection is convenient if a release has been completed and
791 development has moved on to the next release.
793 .IP "\fIState and author based selection\fR"
795 If the highest level number within a given release number
796 is not the desired one,
797 the state attribute can help. For example,
799 co -r2 -sReleased *,v
801 retrieves the latest revision with release number 2 whose state attribute
803 Of course, the state attribute has to be set appropriately, using the
804 \fIci\fR or \fIrcs\fR commands.
805 Another alternative is to select a revision by its author,
806 using the \fI-w\fR option.
808 .IP "\fIDate based selection\fR"
810 Revisions may also be selected by date.
811 Suppose a release of an entire system was
812 completed and current on March 4, at 1:00 p.m. Then the command
814 co -d"March 4, 1:00 pm" *,v
816 checks out all the components of that release, independent of the numbering.
817 The \fI-d\fR option specifies a `cutoff date', i.e.,
818 the revision selected has a check-in date that
819 is closest to, but not after the date given.
821 .IP "\fIName based selection\fR"
823 The most powerful selection function is based on assigning symbolic
824 names to revisions and branches.
825 In large systems, a single release number or date is not sufficient
826 to collect the appropriate revisions from all groups.
827 For example, suppose one wishes to combine release 2
828 of one subsystem and release 15 of another.
829 Most likely, the creation dates of those releases differ also.
830 Thus, a single revision number or date passed to the \fIco\fR command
831 will not suffice to select the right revisions.
832 Symbolic revision numbers solve this problem.
833 Each RCS file may contain a set of symbolic names that are mapped
834 to numeric revision numbers. For example, assume
835 the symbol \fIV3\fR is bound to release number 2 in file \fIs,v\fR, and to
836 revision number 15.9 in \fIt,v\fR.
837 Then the single command
841 retrieves the latest revision of release 2 from \fIs,v\fR,
842 and revision 15.9 from \fIt,v\fR.
843 In a large system with many modules, checking out all
844 revisions with one command greatly simplifies configuration management.
846 Judicious use of symbolic revision numbers helps with organizing
847 large configurations.
848 A special command, \fIrcsfreeze\fR,
849 assigns a symbolic revision number to a selected revision
851 \fIRcsfreeze\fR effectively freezes a configuration.
852 The assigned symbolic revision number selects all components
853 of the configuration.
854 If necessary, symbolic numbers
855 may even be intermixed with numeric ones. Thus, \fIV3.5\fR in the
857 would select revision 2.5 in \fIs,v\fR and branch 15.9.5 in \fIt,v\fR.
859 The options \fI-r\fR, \fI-s\fR, \fI-w\fR and \fI-d\fR
860 may be combined. If a branch is given, the latest revision
861 on that branch satisfying all conditions is retrieved;
862 otherwise, the default branch is used.
864 Combining MAKE and RCS
867 is a program that processes configurations.
868 It is driven by configuration specifications
869 recorded in a special file, called a `Makefile'.
870 MAKE avoids redundant processing steps
871 by comparing creation dates of source and processed objects.
872 For example, when instructed to compile all
873 modules of a given system, it only recompiles
874 those source modules that were changed
875 since they were processed last.
877 MAKE has been extended with an auto-checkout feature for RCS.
878 When a certain file to be processed is not present,
879 MAKE attempts a check-out operation.
880 If successful, MAKE performs the required processing, and then deletes
881 the checked out file to conserve space.
882 The selection parameters discussed above can be passed to MAKE
883 either as parameters, or directly embedded in the Makefile.
884 MAKE has also been extended to search the subdirectory named \fIRCS\fR
885 for needed files, rather than just the current working directory.
886 However, if a working file is present, MAKE totally ignores the corresponding
887 RCS file and uses the working file.
888 (In newer versions of MAKE distributed by AT&T and others,
890 achieved with the rule DEFAULT, instead of a special extension of MAKE.
891 However, a file checked out by the rule DEFAULT
892 will not be deleted after processing. \fIRcsclean\fR can be
893 used for that purpose.)
895 With auto-checkout, RCS/MAKE can effect a selection rule
896 especially tuned for multi-person software development and maintenance.
898 programmers should obtain configurations that consist of
899 the revisions they have personally checked out plus the latest
900 checked in revision of all other revision groups.
901 This schema can be set up as follows.
903 Each programmer chooses a working directory
904 and places into it a symbolic link, named \fIRCS\fR,
905 to the directory containing the relevant RCS files.
906 The symbolic link makes sure that \fIco\fR and \fIci\fR
907 operations need only specify the working files, and that
908 the Makefile need not be changed.
909 The programmer then checks out the needed files and modifies them.
911 it composes configurations by selecting those
912 revisions that are checked out, and the rest from the
913 subdirectory \fIRCS\fR.
914 The latter selection may be controlled by a symbolic
915 revision number or any of the other selection criteria.
916 If there are several programmers editing in separate working directories,
917 they are insulated from each other's changes until checking in their
920 Similarly, a maintainer can recreate an older configuration
921 by starting to work in an empty working directory.
922 During the initial MAKE invocation, all revisions are selected from RCS files.
923 As the maintainer checks out files and modifies them,
924 a new configuration is gradually built up.
925 Every time MAKE is invoked, it substitutes the modified revisions
926 into the configuration being manipulated.
928 A final application of RCS is to use it for storing Makefiles.
929 Revision groups of Makefiles represent
930 multiple versions of configurations.
931 Whenever a configuration is baselined or distributed,
932 the best approach is to unambiguously fix
933 the configuration with a symbolic revision number by calling
935 to embed that symbol into the Makefile, and to
936 check in the Makefile (using the same symbolic revision number).
937 With this approach, old configurations
938 can be regenerated easily and reliably.
942 The following usage statistics were collected on two DEC VAX-11/780
943 computers of the Purdue Computer Science Department. Both machines
944 are mainly used for research purposes. Thus, the data
945 reflect an environment in which the majority of projects
946 involve prototyping and advanced software development,
947 but relatively little long-term maintenance.
949 For the first experiment,
950 the \fIci\fR and \fIco\fR operations were instrumented
951 to log the number of backward and forward deltas applied.
952 The data were collected during a 13 month period
953 from Dec. 1982 to Dec. 1983.
954 Table I summarizes the results.
963 Operation#Total#Total deltas#Mean deltas#Operations#Branch
964 #operations #applied#applied#with >1 delta#operations
966 co # 7867# 9320#1.18#509#(6%)#203#(3%)
967 ci # 3468# 2207#0.64# 85#(2%)# 75#(2%)
968 ci & co#11335#11527#1.02#594#(5%)#278#(2%)
971 Table I. Statistics for \fIco\fR and \fIci\fR operations.
975 The first two lines show statistics for check-out and check-in;
976 the third line shows the combination.
977 Recall that \fIci\fR performs an implicit check-out to obtain
978 a revision for computing the delta.
979 In all measures presented, the most recent revision (stored intact)
980 counts as one delta. The number of deltas applied represents
981 the number of passes necessary, where the first `pass' is a copying step.
983 Note that the check-out operation is executed more than
984 twice as frequently as the check-in operation.
985 The fourth column gives the mean number of deltas
986 applied in all three cases.
987 For \fIci\fR, the mean number of deltas applied is less
989 The reasons are that the initial check-in requires no delta at all, and that
990 the only time \fIci\fR requires more than one delta is for branches.
991 Column 5 shows the actual number of operations that applied more than one
993 The last column indicates that branches were not used often.
995 The last three columns demonstrate that the most recent trunk revision
996 is by far the most frequently accessed.
997 For RCS, check-out of
998 this revision is a simple copy operation, which is the absolute minimum
999 given the copy-semantics of \fIco\fR.
1000 Access to older revisions and branches
1001 is more common in non-academic environments,
1002 yet even if access to older deltas were an order
1003 of magnitude more frequent,
1004 the combined average number of deltas applied would still be below 1.2.
1005 Since RCS is faster than SCCS until up to 10 delta applications,
1006 reverse deltas are clearly the method of choice.
1008 The second experiment, conducted in March of 1984,
1009 involved surveying the existing RCS files
1010 on our two machines. The goal was to determine the mean number of
1011 revisions per RCS file, as well as the space consumed by them.
1012 Table II shows the results. (Tables I and II were produced at different
1013 times and are unrelated.)
1019 c | c | c | c | c | c | c
1020 c | c | c | c | c | c | c
1021 l | n | n | n | n | n | n
1023 #Total RCS#Total#Mean#Mean size of#Mean size of#Overhead
1024 #files#revisions#revisions#RCS files#revisions
1026 All files #8033#11133#1.39#6156#5585#1.10
1027 Files with#1477# 4578#3.10#8074#6041#1.34
1031 Table II. Statistics for RCS files.
1035 The mean number of revisions per RCS file is 1.39.
1036 Columns 5 and 6 show the mean sizes (in bytes) of an RCS file
1037 and of the latest revision of each RCS file, respectively.
1038 The `overhead' column contains the ratio of the mean sizes.
1039 Assuming that all revisions in an RCS file are approximately the same size,
1040 this ratio gives a measure of the space consumed by the extra revisions.
1042 In our sample, over 80 per cent of the RCS files contained only a single revision.
1043 The reason is that our
1044 systems programmers routinely check in all source files
1045 on the distribution tapes, even though they may never touch them again.
1046 To get a better indication of how much space savings are possible
1047 with deltas, all measures with those files
1048 that contained 2 or more revisions were recomputed. Only for those files
1050 As shown in the second line, the average number of revisions for those files is
1051 3.10, with an overhead of 1.34. This means that the extra 2.10 deltas
1052 require 34 per cent extra space, or
1053 16 per cent per extra revision.
1055 measured the space consumed by SCCS, and
1056 reported an average of 5 revisions per group
1057 and an overhead of 1.37 (or about 9 per cent per extra revision).
1058 In a later paper, Glasser\u6\d
1059 observed an average of 7 revisions per group in a single, large project,
1060 but provided no overhead figure.
1061 In his paper on DSEE\u5\d,
1062 Leblang reported that delta storage combined with blank compression
1063 results in an overhead of a mere 1\-2 per cent per revision.
1064 Since leading blanks accounted for about 20 per cent of the surveyed Pascal
1065 programs, a revision group with 5\-10 members was smaller
1066 than a single cleartext copy.
1068 The above observations demonstrate clearly that the space needed
1069 for extra revisions is small. With delta storage, the luxury of
1070 keeping multiple revisions online is certainly affordable.
1071 In fact, introducing a system with delta storage may reduce
1072 storage requirements, because programmers often save back-up copies
1073 anyway. Since back-up copies are stored much more efficiently with deltas,
1074 introducing a system such as RCS may
1075 actually free a considerable amount of space.
1077 Survey of Version Control Tools
1079 The need to keep back-up copies of software arose when
1080 programs and data were no longer stored on paper media, but were entered
1081 from terminals and stored on disk.
1082 Back-up copies are desirable for reliability, and many modern editors
1083 automatically save a back-up copy for every file touched.
1085 is valuable for short-term back-ups, but not suitable for long-term
1086 version control, since an existing back-up copy is overwritten whenever the
1087 corresponding file is edited.
1089 Tape archives are suitable for long-term, offline storage.
1090 If all changed files are dumped on a back-up tape once per day, old revisions
1091 remain accessible. However, tape archives are unsatisfactory
1092 for version control in several ways. First, backing up the file
1093 system every 24 hours does not capture intermediate revisions.
1094 Secondly, the old revisions are not online,
1095 and accessing them is tedious and time-consuming.
1096 In particular, it is impractical to
1097 compare several old revisions of a group,
1098 because that may require mounting and searching several tapes.
1099 Tape archives are important fail-safe tools in the
1100 event of catastrophic disk failures or accidental deletions,
1101 but they are ill-suited for version control.
1102 Conversely, version control tools do not obviate the
1103 need for tape archives.
1105 A natural technique for keeping several old revisions online is
1106 to never delete a file.
1108 simply creates a new file with the same
1109 name, but with a different sequence number.
1110 This technique, available as an option in DEC's VMS operating system,
1111 turns out to be inadequate for version control.
1112 First, it is prohibitively expensive in terms of storage costs,
1113 especially since no data compression techniques are employed.
1114 Secondly, indiscriminately storing every change produces too many
1115 revisions, and programmers have difficulties distinguishing them.
1116 The proliferation of revisions forces programmers to spend much time on
1117 finding and deleting useless files.
1118 Thirdly, most of the support functions like locking, logging,
1120 and identification described in this paper are not available.
1122 An alternative approach is to separate editing from revision control.
1123 The user may repeatedly edit a given revision,
1124 until freezing it with an explicit command.
1125 Once a revision is frozen, it is stored permanently and can no longer be modified.
1126 (In RCS, freezing a revisions is done with \fIci\fR.)
1127 Editing a frozen revision implicitly creates a new one, which
1128 can again be changed repeatedly until it is frozen itself.
1129 This approach saves exactly those revisions that the user
1130 considers important, and keeps the number of revisions manageable.
1131 IBM's CLEAR/CASTER\u7\d,
1135 are examples of version control systems using this approach.
1136 CLEAR/CASTER maintains a data base of programs, specifications,
1137 documentation and messages, using deltas.
1138 Its goal is to provide control over the development process from a
1139 management viewpoint.
1140 SCCS stores multiple revisions of source text in an ancestral tree,
1141 records a log entry for each revision,
1142 provides access control, and has facilities
1143 for uniquely identifying each revision.
1144 An efficient delta technique
1145 reduces the space consumed by each revision group.
1146 SDC is much simpler than SCCS because it stores not more than
1147 two revisions. However, it maintains a complete log for all old
1148 revisions, some of which may be on back-up tape.
1149 CMS, like SCCS, manages tree-structured revision groups,
1150 but offers no identification mechanism.
1152 Tools for dealing with configurations are still in a state of flux.
1153 SCCS, SDC and CMS can be combined with MAKE or MAKE-like programs.
1154 Since flexible selection rules are missing from all these tools,
1155 it is sometimes difficult
1156 to specify precisely which revision of each group
1157 should be passed to MAKE for building a desired configuration.
1158 The Xerox Cedar system\u10\d
1159 provides a `System Modeller' that can rebuild
1160 a configuration from an arbitrary set of module revisions.
1161 The revisions of a module are only distinguished by creation time,
1162 and there is no tool for managing groups.
1163 Since the selection rules are primitive,
1164 the System Modeller appears to be somewhat tedious to use.
1166 is a sophisticated software engineering environment.
1167 It manages revision groups in a way similar to SCCS and CMS. Configurations
1168 are built using `configuration threads'.
1169 A configuration thread states which revision of each group
1170 named in a configuration should be chosen.
1171 A configuration thread may contain dynamic specifiers
1172 (e.g., `choose the revisions I am currently working on,
1173 and the most recent revisions otherwise'), which are bound
1174 automatically at build time.
1175 It also provides a notification mechanism for alerting
1176 maintainers about the need to rebuild a system after a change.
1178 RCS is based on a general model for describing
1179 multi-version/multi-configuration systems\u11\d.
1180 The model describes systems using AND/OR graphs, where AND nodes represent
1181 configurations, and OR nodes represent version groups.
1182 The model gives rise to a suit of selection rules for
1183 composing configurations, almost all of which are implemented in RCS.
1184 The revisions selected by RCS are passed to MAKE for configuration building.
1185 Revision group management is modelled after SCCS.
1186 RCS retains SCCS's best features,
1187 but offers a significantly simpler user interface,
1188 flexible selection rules, adequate integration with MAKE
1189 and improved identification.
1190 A detailed comparison of RCS and SCCS appears in Reference 4.
1192 An important component of all revision control systems
1193 is a program for computing deltas.
1194 SCCS and RCS use the program \fIdiff\fR\u2\d,
1195 which first computes the longest common substring of two
1196 revisions, and then produces the delta from that substring.
1197 The delta is simply an edit script consisting of deletion and
1198 insertion commands that generate one revision from the other.
1200 A delta based on a longest common substring is not necessarily minimal,
1201 because it does not take advantage of crossing block moves.
1202 Crossing block moves arise if two or more blocks of lines
1204 appear in a different order in two revisions.
1205 An edit script derived from a longest common substring
1206 first deletes the shorter of the two blocks, and then reinserts it.
1208 proposed an algorithm for detecting block moves, but
1209 since the algorithm is based on heuristics,
1210 there are conditions
1211 under which the generated delta is far from minimal.
1212 DSEE uses this algorithm combined with blank compression,
1213 apparently with satisfactory overall results.
1214 A new algorithm that is guaranteed to produce a minimal delta based on
1215 block moves appears in Reference 13.
1216 A future release of RCS will use this algorithm.
1218 \fIAcknowledgements\fR:
1219 Many people have helped make RCS a success by contributed criticisms, suggestions,
1220 corrections, and even whole new commands (including manual pages).
1221 The list of people is too long to be
1222 reproduced here, but my sincere thanks for their help and
1223 goodwill goes to all of them.
1228 Appendix: Synopsis of RCS Operations
1230 .IP "\fIci \fB\- check in revisions\fR"
1232 \fICi\fR stores the contents of a working file into the
1233 corresponding RCS file as a new revision.
1234 If the RCS file doesn't exist, \fIci\fR creates it.
1235 \fICi\fR removes the working file, unless one of the options
1236 \fI-u\fR or \fI-l\fR is present.
1237 For each check-in, \fIci\fR asks for a commentary
1238 describing the changes relative to the previous revision.
1240 \fICi\fR assigns the revision number given by the \fI-r\fR option;
1241 if that option is missing, it derives the number from the
1242 lock held by the user; if there is no lock and locking is not strict,
1243 \fIci\fR increments the number of the latest revision on the trunk.
1244 A side branch can only be started by explicitly specifying its
1245 number with the \fI-r\fR option during check-in.
1247 \fICi\fR also determines
1248 whether the revision to be checked in is different from the
1249 previous one, and asks whether to proceed if not.
1250 This facility simplifies check-in operations for large systems,
1251 because one need not remember which files were changed.
1253 The option \fI-k\fR searches the checked in file for identification
1256 revision number, check-in date, author and state, and assigns these
1257 to the new revision rather than computing them. This option is
1258 useful for software distribution: Recipients of distributed software
1259 using RCS should check in updates with the \fI-k\fR option.
1260 This convention guarantees that revision numbers, check-in dates,
1261 etc., are the same at all sites.
1262 .IP "\fIco \fB\- check out revisions\fR"
1264 \fICo\fR retrieves revisions according to revision number,
1265 date, author and state attributes. It either places the revision
1266 into the working file, or prints it on the std. output.
1267 \fICo\fR always expands the identification markers.
1268 .IP "\fIident \fB\- extract identification markers\fR"
1270 \fIIdent\fR extracts the identification markers expanded by \fIco\fR
1271 from any file and prints them.
1272 .IP "\fIrcs \fB\- change RCS file attributes\fR"
1274 \fIRcs\fR is an administrative operation that changes access lists,
1275 locks, unlocks, breaks locks, toggles the strict-locking feature,
1276 sets state attributes and symbolic revision numbers, changes the
1277 description, and deletes revisions. A revision can
1278 only be deleted if it is not the fork of a side branch.
1279 .IP "\fIrcsclean \fB\- clean working directory\fR"
1281 \fIRcsclean\fR removes working files that were checked out but never changed.
1282 .IP "\fIrcsdiff \fB\- compare revisions\fR"
1284 \fIRcsdiff\fR compares two revisions and prints their
1285 difference, using the UNIX tool \fIdiff\fR.
1286 One of the revisions compared may be checked out.
1287 This command is useful for finding out about changes.
1288 .IP "\fIrcsfreeze \fB\- freeze a configuration\fR"
1290 \fIRcsfreeze\fR assigns the same symbolic revision number
1291 to a given revision in all RCS files.
1292 This command is useful for accurately recording a configuration.
1293 .IP "\fIrcsmerge \fB\- merge revisions\fR"
1295 \fIRcsmerge\fR merges two revisions, \fIrev1\fR and \fIrev2\fR,
1296 with respect to a common ancestor.
1297 A 3-way file comparison determines the segments of lines that
1298 are (a) the same in all three revisions, or (b) the same in 2 revisions,
1299 or (c) different in all three. For all segments of type (b) where
1300 \fIrev1\fR is the differing revision,
1301 the segment in \fIrev1\fR replaces the corresponding segment of \fIrev2\fR.
1302 Type (c) indicates an overlapping change, is flagged as an error, and requires user
1303 intervention to select the correct alternative.
1304 .IP "\fIrlog \fB\- read log messages\fR"
1306 \fIRlog\fR prints the log messages and other information in an RCS file.
1316 .ds [A Feldman, Stuart I.
1318 .ds [T Make - A Program for Maintaining Computer Programs
1319 .ds [J Software -- Practice and Experience
1327 .][ 1 journal-article
1331 .ds [T An Algorithm for Differential File Comparison
1332 .ds [A Hunt, James W.
1333 .as [A " and McIlroy, M. D.
1334 .ds [I Computing Science Technical Report, Bell Laboratories
1344 .ds [A Rochkind, Marc J.
1346 .ds [T The Source Code Control System
1347 .ds [J IEEE Transactions on Software Engineering
1355 .][ 1 journal-article
1359 .ds [T Design, Implementation, and Evaluation of a Revision Control System
1360 .ds [A Tichy, Walter F.
1361 .ds [B Proceedings of the 6th International Conference on Software Engineering
1362 .ds [I ACM, IEEE, IPS, NBS
1363 .ds [D September 1982
1369 .][ 3 article-in-book
1373 .ds [A Leblang, David B.
1374 .as [A " and Chase, Robert P.
1375 .ds [T Computer-Aided Software Engineering in a Distributed Workstation Environment
1376 .ds [O Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium
1377 .as [O " on Practical Software Development Environments.
1378 .ds [J SIGPLAN Notices
1387 .][ 1 journal-article
1393 .ds [A Glasser, Alan L.
1395 .ds [T The Evolution of a Source Code Control System
1396 .ds [J Software Engineering Notes
1401 .ds [O Proceedings of the Software Quality and Assurance Workshop.
1405 .][ 1 journal-article
1409 .ds [K IBMClearCaster
1412 .ds [T The Clear/Caster System
1413 .ds [J Nato Conference on Software Engineering, Rome
1417 .][ 1 journal-article
1422 .ds [A Habermann, A. Nico
1424 .ds [T A Software Development Control System
1425 .ds [I Technical Report, Carnegie-Mellon University, Department of Computer Science
1434 .ds [T Code Management System
1435 .ds [I Digital Equipment Corporation
1436 .ds [O Document No. EA-23134-82
1445 .ds [A Lampson, Butler W.
1446 .as [A " and Schmidt, Eric E.
1447 .ds [T Practical Use of a Polymorphic Applicative Language
1448 .ds [B Proceedings of the 10th Symposium on Principles of Programming Languages
1456 .][ 3 article-in-book
1461 .ds [T A Data Model for Programming Support Environments and its Application
1462 .ds [A Tichy, Walter F.
1463 .ds [B Automated Tools for Information System Design and Development
1464 .ds [E Hans-Jochen Schneider and Anthony I. Wasserman
1466 .ds [I North-Holland Publishing Company
1471 .][ 3 article-in-book
1477 .ds [T A Technique for Isolating Differences Between Files
1479 .ds [J Communications of the ACM
1488 .][ 1 journal-article
1492 .ds [T The String-to-String Correction Problem with Block Moves
1493 .ds [A Tichy, Walter F.
1495 .ds [J ACM Transactions on Computer Systems
1503 .][ 1 journal-article