1 <chapter id="misc-docs-directory_versioning">
2 <title>Directory Versioning</title>
<para><quote>The three cardinal virtues of a master technologist
are: laziness, impatience, and hubris.</quote> —Larry
<para>This chapter describes some of the theoretical pitfalls around the
13 (possibly arrogant) notion that one can simply version
14 directories just as one versions files.</para>
18 <!-- ================================================================= -->
19 <!-- ======================== SECTION 1 ============================== -->
20 <!-- ================================================================= -->
21 <sect1 id="misc-docs-directory_versioning-sect-1">
22 <title>Directory Revisions</title>
24 <para>To begin, recall that the Subversion repository is an array
25 of trees. Each tree represents the application of a new atomic
26 commit, and is called a <firstterm>revision</firstterm>. This
27 is very different from a CVS repository, which stores file
histories in a collection of RCS files (and doesn't track
tree structure).</para>
31 <para>So when we refer to <quote>revision 4 of
32 <filename>foo.c</filename></quote> (written
33 <filename>foo.c:4</filename>) in CVS, this means the fourth
34 distinct version of <filename>foo.c</filename>—but in
35 Subversion this means <quote>the version of
36 <filename>foo.c</filename> in the fourth revision
37 (tree)</quote>. It's quite possible that
38 <filename>foo.c</filename> has never changed at all since
39 revision 1! In other words, in Subversion, different revision
40 numbers of the same versioned item do <emphasis>not</emphasis>
41 imply different contents.</para>
43 <para>Nevertheless, the content of <filename>foo.c:4</filename>
44 is still well-defined. The file <filename>foo.c</filename> in
45 revision 4 has specific text and properties.</para>
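<para>The distinction can be made concrete with a small sketch.
It is only an illustration, not Subversion's storage format: the
<literal>repository</literal> list, the <literal>lookup</literal>
helper, and the file texts are all hypothetical, and properties are
left out for brevity.</para>

<programlisting>
```python
# Hypothetical model: the repository is a list of trees, indexed by revision.
# Each tree maps a path to the text of that file in that revision.
FOO_TEXT = "int main(void) { return 0; }"

repository = [
    {},                                                   # revision 0: empty tree
    {"foo.c": FOO_TEXT},                                  # revision 1 adds foo.c
    {"foo.c": FOO_TEXT, "bar.c": "/* first draft */"},    # revision 2 adds bar.c
    {"foo.c": FOO_TEXT, "bar.c": "/* second draft */"},   # revision 3 edits bar.c
    {"foo.c": FOO_TEXT, "bar.c": "/* third draft */"},    # revision 4 edits bar.c
]


def lookup(path, revision):
    """Return the text of `path` as it appears in the tree for `revision`."""
    return repository[revision][path]


# foo.c:4 is well-defined, yet identical to foo.c:1.
unchanged = lookup("foo.c", 4) == lookup("foo.c", 1)
```
</programlisting>

<para>Here <literal>lookup("foo.c", 4)</literal> always has an
answer, even though <filename>foo.c</filename> was last touched in
revision 1.</para>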
47 <para>Suppose, now, that we extend this concept to directories.
If we have a directory <filename>DIR</filename>, define
<literal>DIR:N</literal> to be <quote>the directory DIR in the
Nth revision.</quote> The contents are defined to be a
51 particular set of directory entries (<literal>dirents</literal>)
52 and properties.</para>
54 <para>So far, so good. The concept of versioning directories
55 seems fine in the repository—the repository is very
56 theoretically pure anyway. However, because working copies
57 allow mixed revisions, it's easy to create problematic
62 <!-- ================================================================= -->
63 <!-- ======================== SECTION 2 ============================== -->
64 <!-- ================================================================= -->
65 <sect1 id="misc-docs-directory_versioning-sect-2">
66 <title>The Lagging Directory</title>
68 <sect2 id="misc-docs-directory_versioning-sect-2.1">
69 <title>The Problem</title>
71 <para><emphasis>This is the first part of the <quote>Greg
72 Hudson</quote> problem, so named because he was the first
73 one to bring it up and define it well.</emphasis> :-)</para>
75 <para>Suppose our working copy has directory
76 <filename>DIR:1</filename> containing file
77 <filename>foo:1</filename>, along with some other files. We
78 remove <filename>foo</filename> and commit.</para>
80 <para>Already, we have a problem: our working copy still claims
81 to have <filename>DIR:1</filename>. But on the repository,
82 revision 1 of <filename>DIR</filename> is
83 <emphasis>defined</emphasis> to contain
84 <filename>foo</filename>—and our working copy
85 <filename>DIR</filename> clearly does not have it anymore.
86 How can we truthfully say that we still have
87 <filename>DIR:1</filename>?</para>
89 <para>One answer is to force <filename>DIR</filename> to be
90 updated when we commit <filename>foo</filename>'s deletion.
91 Assuming that our commit created revision 2, we would
92 immediately update our working copy to
93 <filename>DIR:2</filename>. Then the client and server would
94 both agree that <filename>DIR:2</filename> does not contain
<filename>foo</filename>, and that <filename>DIR:2</filename> is indeed exactly
96 what is in the working copy.</para>
98 <para>This solution has nasty, un-user-friendly side effects,
though. It's likely that other people have committed
100 before us, possibly adding new properties to
101 <filename>DIR</filename>, or adding a new file
102 <filename>bar</filename>. Now pretend our committed deletion
103 creates revision 5 in the repository. If we instantly update
104 our local <filename>DIR</filename> to 5, that means
105 unexpectedly receiving a copy of <filename>bar</filename> and
some new propchanges. This clearly violates a UI principle:
<quote>the client will never change your working copy until you ask
it to.</quote> Committing changes to the repository is a
109 server-write operation only; it should
110 <emphasis>not</emphasis> modify your working data!</para>
112 <para>Another solution is to do the naive thing: after
113 committing the deletion of <filename>foo</filename>, simply
114 stop tracking the file in the <filename>.svn</filename>
115 administrative directory. The client then loses all knowledge
118 <para>But this doesn't work either: if we now update our working
119 copy, the communication between client and server is
120 incorrect. The client still believes that it has
121 <filename>DIR:1</filename>—which is false, since a
122 <quote>true</quote> <filename>DIR:1</filename> contains
123 <filename>foo</filename>. The client gives this incorrect
124 report to the repository, and the repository decides that in
125 order to update to revision 2, <filename>foo</filename> must
126 be deleted. Thus the repository sends a bogus (or at least
127 unnecessary) deletion command.</para>
131 <sect2 id="misc-docs-directory_versioning-sect-2.2">
132 <title>The Solution</title>
134 <para>After deleting <filename>foo</filename> and committing,
135 the file is <emphasis>not</emphasis> totally forgotten by the
136 <filename>.svn</filename> directory. While the file is no
137 longer considered to be under version control, it is still
138 secretly remembered as having been
139 <quote>deleted</quote>.</para>
141 <para>When the user updates the working copy, the client
142 correctly informs the server that the file is already missing
143 from its local <filename>DIR:1</filename>; therefore the
144 repository doesn't try to re-delete it when patching the
145 client up to revision 2.</para>
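<para>The difference between the naive client and the one that
remembers the deletion can be sketched as follows. This is a toy
model under stated assumptions, not the real update protocol:
<literal>REPO_DIR</literal>, <literal>client_report</literal>,
<literal>server_update</literal>, and the sibling file
<filename>baz</filename> are hypothetical names introduced for the
illustration.</para>

<programlisting>
```python
# Hypothetical repository history for DIR: revision number -> set of entry names.
REPO_DIR = {
    1: {"foo", "baz"},   # DIR:1 contains foo
    2: {"baz"},          # DIR:2 is the result of our committed deletion
}

# Working-copy entries for DIR after the commit; foo keeps a "deleted" marker.
wc_entries = {"baz": {"deleted": False}, "foo": {"deleted": True}}


def client_report(base_revision, entries, use_deleted_flag):
    """Describe the working copy to the server before an update."""
    missing = set()
    if use_deleted_flag:
        missing = {name for name, entry in entries.items() if entry["deleted"]}
    return {"revision": base_revision, "missing": missing}


def server_update(report, target_revision):
    """Compute the commands that bring the reported state to the target."""
    assumed = REPO_DIR[report["revision"]] - report["missing"]
    target = REPO_DIR[target_revision]
    deletes = [("delete", name) for name in sorted(assumed - target)]
    adds = [("add", name) for name in sorted(target - assumed)]
    return deletes + adds


# The naive client claims a pristine DIR:1; the other one admits foo is gone.
naive = server_update(client_report(1, wc_entries, use_deleted_flag=False), 2)
flagged = server_update(client_report(1, wc_entries, use_deleted_flag=True), 2)
```
</programlisting>

<para>In this model the naive report yields a redundant
<literal>("delete", "foo")</literal> command, while the report that
carries the <quote>deleted</quote> marker yields no commands at
all.</para>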
148 <title>Note to developers</title>
150 <para>How the <quote>deleted</quote>
151 flag works under the hood.</para>
156 <para>The <command>svn status</command> command won't
157 display a deleted item, unless you make the deleted item
158 the specific target of status.</para>
162 <para>When a deleted item's parent is updated, one of two
163 things will happen:</para>
167 <para>The repository will re-add the item, thereby
168 overwriting the entire entry. (no more
169 <quote>deleted</quote> flag)</para>
172 <para>The repository will say nothing about the item,
173 which means that it's fully aware that your item is
174 gone, and this is the correct state to be in. In
175 this case, the entire entry is removed. (no more
176 <quote>deleted</quote> flag)</para>
182 <para>If a user schedules an item for addition that has
the same name as a <quote>deleted</quote> entry, then
the entry will have both flags simultaneously. This is
185 perfectly fine:</para>
189 <para>The commit-crawler will notice both flags and
190 do a <function>delete()</function> and then an
191 <function>add()</function>. This ensures that the
192 transaction is built correctly. (without the
193 <function>delete()</function>, the
194 <function>add()</function> would be on top of an
195 already-existing item.)</para>
198 <para>When the commit completes, the client rewrites
199 the entry as normal. (no more
200 <quote>deleted</quote> flag)</para>
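<para>The ordering of the two calls can be sketched like this. The
<literal>Transaction</literal> class and the
<literal>crawl_entry</literal> helper are hypothetical stand-ins,
not the actual commit-crawler code, and the sketch assumes the
transaction starts from the working copy's notion of
<filename>DIR:1</filename>, which still contains
<filename>foo</filename>.</para>

<programlisting>
```python
class Transaction:
    """Hypothetical stand-in for a repository transaction on one directory."""

    def __init__(self, base_entries):
        self.entries = set(base_entries)
        self.operations = []

    def delete(self, name):
        self.entries.remove(name)
        self.operations.append(("delete", name))

    def add(self, name):
        # Adding on top of an existing item would build a broken transaction.
        if name in self.entries:
            raise ValueError(name + " already exists in the transaction")
        self.entries.add(name)
        self.operations.append(("add", name))


def crawl_entry(txn, name, flags):
    """Drive the transaction for one working-copy entry."""
    if "deleted" in flags and "added" in flags:
        txn.delete(name)   # clear the stale item first
        txn.add(name)      # then add the replacement
    elif "added" in flags:
        txn.add(name)


txn = Transaction({"foo", "baz"})
crawl_entry(txn, "foo", {"deleted", "added"})
```
</programlisting>

<para>Calling <literal>txn.add("foo")</literal> directly on a fresh
transaction with the same base would raise an error in this model,
which mirrors the reason the <function>delete()</function> must come
first.</para>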
212 <!-- ================================================================= -->
213 <!-- ======================== SECTION 3 ============================== -->
214 <!-- ================================================================= -->
215 <sect1 id="misc-docs-directory_versioning-sect-3">
216 <title>The Overeager Directory</title>
<para><emphasis>This is the second part of the <quote>Greg
219 Hudson</quote> problem.</emphasis></para>
221 <sect2 id="misc-docs-directory_versioning-sect-3.1">
222 <title>The Problem</title>
224 <para>Again, suppose our working copy has directory
225 <filename>DIR:1</filename> containing file
226 <filename>foo:1</filename>, along with some other files.
229 <para>Now, unbeknownst to us, somebody else adds a new file
230 <filename>bar</filename> to this directory, creating revision
231 2 (and <filename>DIR:2</filename>).</para>
233 <para>Now we add a property to <filename>DIR</filename> and
234 commit, which creates revision 3. Our working-copy
235 <filename>DIR</filename> is now marked as being at revision
238 <para>Of course, this is false; our working copy does
239 <emphasis>not</emphasis> have <filename>DIR:3</filename>,
240 because the <quote>true</quote> <filename>DIR:3</filename> on
241 the repository contains the new file <filename>bar</filename>.
242 Our working copy has no knowledge of <filename>bar</filename>
245 <para>Again, we can't follow our commit of
246 <filename>DIR</filename> with an automatic update (and
247 addition of <filename>bar</filename>). As mentioned
248 previously, commits are a one-way write operation; they must
249 not change working copy data.</para>
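<para>The scenario above can be replayed with a small model. It is
a sketch only: the <literal>repo</literal> and
<literal>wc</literal> dictionaries, the <literal>owner</literal>
property, and the <literal>matches_repository</literal> helper are
hypothetical and exist purely to show the mismatch.</para>

<programlisting>
```python
# Hypothetical history of DIR: revision -> entries and properties.
repo = {1: {"entries": {"foo"}, "props": {}}}

# Revision 2: somebody else adds bar.
repo[2] = {"entries": {"foo", "bar"}, "props": {}}

# Our working copy still holds DIR:1.
wc = {"revision": 1, "entries": {"foo"}, "props": {}}

# We set a property on DIR and commit, creating revision 3.
wc["props"] = {"owner": "me"}
repo[3] = {"entries": set(repo[2]["entries"]), "props": dict(wc["props"])}

# The overeager step: the working copy now claims to hold DIR:3.
wc["revision"] = 3


def matches_repository(working_copy):
    """Check whether the working copy really holds the revision it claims."""
    node = repo[working_copy["revision"]]
    return (working_copy["entries"] == node["entries"]
            and working_copy["props"] == node["props"])


consistent = matches_repository(wc)
unknown_entries = repo[3]["entries"] - wc["entries"]
```
</programlisting>

<para>In this model <literal>consistent</literal> is false and
<literal>unknown_entries</literal> holds <filename>bar</filename>,
the file the working copy has never heard of.</para>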
253 <sect2 id="misc-docs-directory_versioning-sect-3.2">
254 <title>The Solution</title>
256 <para>Let's enumerate exactly those times when a directory's
257 local revision number changes:</para>
262 <term>When a directory is updated:</term>
264 <para>If the directory is either the direct target of an
265 update command, or is a child of an updated directory,
266 it will be bumped (along with many other siblings and
267 children) to a uniform revision number.</para>
272 <term>When a directory is committed:</term>
274 <para>A directory can only be considered a
275 <quote>committed object</quote> if it has a new property
276 change. (Otherwise, to <quote>commit a
277 directory</quote> really implies that its modified
278 children are being committed, and only such children
279 will have local revisions bumped.)</para>
285 <para>In this light, it's clear that our <quote>overeager
286 directory</quote> problem only happens in the second
287 situation—those times when we're committing directory
290 <para>Thus the answer is simply not to allow property-commits on
291 directories that are out-of-date. It sounds a bit
292 restrictive, but there's no other way to keep directory
293 revisions accurate.</para>
296 <title>Note to developers</title>
298 <para>This restriction is enforced by the filesystem merge()
301 <para>Once <function>merge()</function> has established that
302 {ancestor, source, target} are all different node-rev-ids,
303 it examines the property-keys of ancestor and target. If
304 they're <emphasis>different</emphasis>, it returns a
305 conflict error.</para>
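<para>The rule can be sketched as a small predicate. This is not the
filesystem's actual <function>merge()</function> code: the
<literal>NodeRev</literal> record, its field names, and
<literal>property_commit_conflicts</literal> are hypothetical, and
the sketch assumes that the ancestor is the revision the working
copy was based on, the source is the latest committed directory, and
the target is the directory in the transaction being
committed.</para>

<programlisting>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRev:
    """Hypothetical node revision: an id plus a key naming its property list."""
    node_rev_id: str
    prop_key: str


def property_commit_conflicts(ancestor, source, target):
    """Return True when a directory property change must be rejected."""
    ids = {ancestor.node_rev_id, source.node_rev_id, target.node_rev_id}
    all_different = len(ids) == 3
    # Properties changed in the transaction while the directory was out of date.
    return all_different and ancestor.prop_key != target.prop_key


base = NodeRev("dir.r1", "props-a")
newer = NodeRev("dir.r2", "props-a")        # somebody else changed DIR
our_txn = NodeRev("dir.txn", "props-b")     # we changed DIR's properties

out_of_date = property_commit_conflicts(base, newer, our_txn)
up_to_date = property_commit_conflicts(base, base, our_txn)
```
</programlisting>

<para>With these inputs the out-of-date commit is flagged as a
conflict, while the same property change against an up-to-date
directory is not.</para>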
312 <!-- ================================================================= -->
313 <!-- ======================== SECTION 4 ============================== -->
314 <!-- ================================================================= -->
315 <sect1 id="misc-docs-directory_versioning-sect-4">
316 <title>User Impact</title>
318 <para>Really, the Subversion client seems to have two
319 difficult—almost contradictory—goals.</para>
321 <para>First, it needs to make the user experience friendly, which
322 generally means being a bit <quote>sloppy</quote> about deciding
323 what a user can or cannot do. This is why it allows
324 mixed-revision working copies, and why it tries to let users
325 execute local tree-changing operations (delete, add, move, copy)
326 in situations that aren't always perfectly, theoretically
327 <quote>safe</quote> or pure.
<para>Second, the client tries to keep the working copy
correctly in sync with the repository using as little
332 communication as possible. Of course, this is made much harder
333 by the first goal!</para>
335 <para>So in the end, there's a tension here, and the resolutions
336 to problems can vary. In one case (the <quote>lagging
337 directory</quote>), the problem can be solved through a bit of
338 clever entry tracking in the client. In the other case
339 (<quote>the overeager directory</quote>), the only solution is
340 to restrict some of the theoretical laxness allowed by the