Implementing Sparse Directory Support in SVN

#########################################################################
### Note: This feature used to be called "incomplete directories"; ###
### It is now called "sparse directories", because "incomplete" ###
### made it sound like something was wrong with your directories. ###
#########################################################################

4. Implementation Strategy
5. Compatibility Matters

This design document started out as a post by Eric Gillespie:

http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=117053
From: Eric Gillespie <epg@pretzelnet.org>
To: dev@subversion.tigris.org
Subject: [PROPOSAL] Incomplete working copies (issue #695)
Date: Thu, 22 Jun 2006 22:35:06 -0700
Message-ID: <25668.1151040906@gould.diplodocus.org>

[The design has evolved since then; the text below is not exactly
the same as what Eric posted, but has the same general ideas.]

I'd like to propose a new solution to this issue, and hopefully get
it into 1.5. What I'm really looking for is the kind of
flexibility Perforce has with its client specs in which parts of a

I don't think Ben Reser's proposal
(http://svn.haxx.se/dev/archive-2005-07/0398.shtml) covers this.
Using his first example, there is no way to avoid pulling in
trunk/foo/images/another-big-dir when it is added.

This is based on an idea from Karl Fogel.

Implementing Incomplete Directory Support in SVN
==================================================

Many users have very large trees of which they only want to
check out certain parts. Today, checkout -N is not up to this task.
This proposal introduces the --depth option to the checkout,
switch, and update subcommands as a replacement for -N. The new
option allows working copies to have very specific contents,
leaving out everything the user does not want.

This is similar to Perforce's client specs, but without the ability
to have a repository entry have a different name in the working
copy. We actually already have this capability in switch.

We have a new "depth" field in .svn/entries, which has (currently)
four possible values: depth-empty, depth-files, depth-immediates,
and depth-infinity. Only this_dir entries may have depths other

depth-empty ------> Updates will not pull in any files or
                    subdirectories not already present.

depth-files ------> Updates will pull in any files not already
                    present, but not subdirectories.

depth-immediates -> Updates will pull in any files or
                    subdirectories not already present; those
                    subdirectories' this_dir entries will

depth-infinity ---> Updates will pull in any files or
                    subdirectories not already present; those
                    subdirectories' this_dir entries will
                    have depth-infinity. Equivalent to
                    today's default update behavior.

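As a rough illustration of the four depth values, the sketch below models which not-yet-present entries an update would pull into a directory. It is a toy model, not Subversion code: the `Depth` enum and the `additions_on_update` function are hypothetical names invented here, and the rule that new subdirectories of a depth-immediates directory arrive at depth-empty is taken from the examples later in this document.

```python
from enum import Enum


class Depth(Enum):
    # Hypothetical stand-ins for the four values of the "depth" field.
    EMPTY = "depth-empty"
    FILES = "depth-files"
    IMMEDIATES = "depth-immediates"
    INFINITY = "depth-infinity"


def additions_on_update(dir_depth, new_files, new_subdirs):
    """Return (files, {subdir: depth}) that an update would add.

    new_files and new_subdirs are names that exist in the repository
    but are not yet present in the working copy directory.
    """
    if dir_depth is Depth.EMPTY:
        # Nothing that is not already present gets pulled in.
        return [], {}
    if dir_depth is Depth.FILES:
        # New files arrive, new subdirectories do not.
        return list(new_files), {}
    if dir_depth is Depth.IMMEDIATES:
        # New subdirectories arrive as empty skeletons.
        return list(new_files), {d: Depth.EMPTY for d in new_subdirs}
    # Depth.INFINITY: everything arrives, fully recursive.
    return list(new_files), {d: Depth.INFINITY for d in new_subdirs}


files, subdirs = additions_on_update(Depth.IMMEDIATES, ["mu"], ["B", "C"])
print(files, subdirs)
```
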
The --depth option sets depth values as it updates the working
copy, setting any new subdirectories' this_dir depth values as

The -N option becomes a synonym for --depth=files for these commands.
This changes the existing -N behavior for these commands, but in a
trivial way (see below).

checkout without --depth or -N behaves the same as it does today.
switch and update without --depth or -N behave the same way as
today IFF the working copy is fully depth-infinity. switch and
update without --depth or -N will NOT change depth values
(exception: a missing directory specified on the command line will

Thus, 'checkout' is identical to 'checkout --depth=infinity', but
'switch' and 'update' are not the same as 'switch --depth=infinity' and
'update --depth=infinity'. The former update entries according to
existing depth values, while the latter pull in everything.

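The option rules above can be summarized in a few lines. This is only a sketch under stated assumptions: `effective_depth` is a hypothetical helper, `None` is used here to mean "leave existing depth values alone", and letting an explicit --depth win over -N when both are given is an assumption that the text above does not settle.

```python
def effective_depth(command, depth=None, non_recursive=False):
    """Return the depth a command applies, or None to keep existing depths.

    command       -- "checkout", "switch" or "update"
    depth         -- value of --depth, if given (e.g. "empty", "files")
    non_recursive -- True if -N was given
    """
    if command not in ("checkout", "switch", "update"):
        raise ValueError("unsupported command: %s" % command)
    if depth is not None:
        # Assumption: an explicit --depth takes precedence over -N.
        return depth
    if non_recursive:
        # -N is a synonym for --depth=files.
        return "files"
    if command == "checkout":
        # Plain checkout behaves like 'checkout --depth=infinity'.
        return "infinity"
    # Plain switch/update follow the depths already recorded in the wc.
    return None


print(effective_depth("checkout"))                    # infinity
print(effective_depth("update"))                      # None
print(effective_depth("update", non_recursive=True))  # files
```
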
To get started, run checkout with --depth=empty or --depth=files.
If additional files or directories are desired, pull them in with
update commands using appropriate --depth options.

The 'svn status' command should list the depth of each directory, in
addition to whatever statuses are currently listed.

The 'svn info' command should list the depth, iff invoked on a
directory whose depth is not the default (depth-infinity).

Same as today; everything has depth-infinity.

svn co -N http://.../A

Today, this creates wc containing only mu. Now, this will be
identical to 'svn co --depth=files /A'.

svn co --depth=empty http://.../A Awc

Creates wc Awc, but *empty*.

Awc/.svn/entries        this_dir    depth-empty

svn co --depth=files http://.../A Awc1

Creates wc Awc1 with all files (i.e., Awc1/mu) but no

Awc1/.svn/entries       this_dir    depth-files

svn co --depth=immediates http://.../A Awc2

Creates wc Awc2 with all files and all subdirectories, but
subdirectories are *empty*.

Awc2/.svn/entries       this_dir    depth-immediates

Awc2/B/.svn/entries     this_dir    depth-empty
Awc2/C/.svn/entries     this_dir    depth-empty

Since B is not yet checked out, add it at depth infinity.

Awc/.svn/entries        this_dir    depth-empty

Awc/B/.svn/entries      this_dir    depth-infinity

Awc/B/E/.svn/entries    this_dir    depth-infinity

Since A is already checked out, don't change its depth, just
update it. B and everything under it is at depth-infinity,
so it will be updated just as today.

svn up --depth=immediates Awc/D

Since D is not yet checked out, add it at depth-immediates.

Awc/.svn/entries        this_dir    depth-empty

Awc/D/.svn/entries      this_dir    depth-immediates

Awc/D/G/.svn/entries    this_dir    depth-empty

svn up --depth=empty Awc/B/E

Remove everything under E, but leave E as an empty directory
since B is depth-infinity.

Awc/.svn/entries        this_dir    depth-empty

Awc/B/.svn/entries      this_dir    depth-infinity

Awc/B/E/.svn/entries    this_dir    depth-empty

svn up --depth=empty Awc/D

Remove everything under D, and D itself since A is depth-empty.

Awc/.svn/entries        this_dir    depth-empty

Bring D back at depth-infinity.

Awc/.svn/entries        this_dir    depth-empty

Awc/D/.svn/entries      this_dir    depth-infinity

svn up --depth=immediates Awc

Bring in everything that's missing (C/ and mu) and empty all
subdirectories (and set their this_dir to depth-empty).

Awc/.svn/entries        this_dir    depth-immediates

Awc/B/.svn/entries      this_dir    depth-empty
Awc/C/.svn/entries      this_dir    depth-empty

4. Implementation Strategy
==========================

It would be nice if all this could be accomplished with just simple
tweaks to how we drive the update reporter (svn_ra_reporter2_t).
However, it looks like it's not going to be that easy.

Handling 'checkout --depth=empty' would be easy. It should get us
an empty directory at depth-empty, with no files and no subdirs,
and if we just report it as at HEAD every time, the server will
never send updates down (hmmm, this could be a problem for getting
dir property updates, though). Then any files or subdirs we have
explicitly included we can just report at their respective
revisions, and get proper updates; at least that'll work for the

But consider 'checkout --depth=immediates'. The desired state is a
depth-immediates directory D, with all files up-to-date, and with
skeleton subdirs at depth empty. Plain updates should preserve this

If we report D as at its BASE revision, files at their BASE
revisions, and subdirs at HEAD, then:

- When new files appear in the repos, they'll get sent down (good)
- When new subdirs appear, they'll get sent down in full (bad)

But if we don't report subdirs as at HEAD, then the server will try to
update them (bad). And if we report D at HEAD, then the working copy
won't receive new files that have appeared in the repository since D's
BASE revision (note that we *can* get updates for files we already
have, though, by continuing to report them at their respective BASEs).

The same logic applies to subdirectories at depth-files or

So, I think this means that for efficient depth handling, we'll
need to have the client directly reporting the desired depth to the
server; i.e., extending the RA protocol.

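To make the idea of reporting depths concrete, here is a minimal sketch of a client walking its working copy and reporting a depth only where it differs from the parent directory's depth, by analogy with how mixed revisions are reported. Everything in it is illustrative: `build_report` and the nested-dict working copy model are hypothetical, and the real reporter and RA protocol carry more state (revisions, for one) than is shown.

```python
def build_report(path, node, parent_depth=None):
    """Return a list of (path, depth) pairs to report to the server.

    node is a toy working copy directory:
        {"depth": "<depth>", "children": {name: node, ...}}
    The root is always reported; a subdirectory is reported only
    when its depth differs from that of its parent.
    """
    entries = []
    depth = node["depth"]
    if parent_depth is None or depth != parent_depth:
        entries.append((path, depth))
    for name, child in sorted(node.get("children", {}).items()):
        entries.extend(build_report(path + "/" + name, child, depth))
    return entries


# Working copy from the examples: Awc is depth-empty, but B was
# pulled in explicitly at depth-infinity (and so was B/E).
wc = {
    "depth": "empty",
    "children": {
        "B": {
            "depth": "infinity",
            "children": {"E": {"depth": "infinity", "children": {}}},
        },
    },
}

for entry in build_report("Awc", wc):
    print(entry)
```

In this toy model, Awc/B/E is not reported separately because it shares its parent's depth, which keeps the report small for working copies that are mostly uniform.
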
Meanwhile, legacy servers will send back a bunch of information the
client doesn't want, and the client will just ignore it, and
everything will be slower than it needs to be, and people will
complain on the users@ list, and we'll tell them to upgrade their
servers, and they'll say they can't because they don't have control
over the server, and we'll say "So? This ain't no Grand Hotel!"

5. Compatibility Matters
========================

This feature introduces two new concepts into the RA protocol which
will not be understood by older servers:

* Reported Depths -- the depths associated with individual paths
  included by the client in the description (via the
  svn_ra_reporter_t) of its working copy state.

* Requested Depth -- the single depth value used to limit the
  scope of the server's response to the client.

As such, it's useful to understand how these concepts will be
handled across the compatibility matrix of depth-aware and
non-depth-aware clients and servers.

NOTE: in the sections below, it is not necessarily the case that a
value or state which is said to be "transmitted" literally has a
presence in the RA protocol. Some such bits of state have default
values in the protocol and can therefore be effectively transmitted
while not literally identifiable in a network trace of the
client-server traffic.

Depth-aware Clients (DACs)

DACs will transmit reported depths (with "infinity" as the
default) and will transmit a requested depth (with "unknown" as
the default). They will also -- for the sake of older,
non-depth-aware servers (NDASs) -- transmit a requested recurse
value derived from the requested depth:

When speaking to an NDAS, the requested recurse value is the
only thing the server understands, but is obviously more
"grainy" than the requested depth concept. The DAC, therefore,
must filter out any additional, unwanted data that the server
transmits in its response. (This filtering will happen in the
RA implementation itself so the RA APIs behave as expected
regardless of the server's pedigree.)

When speaking to a depth-aware server (DAS), the requested
recurse value is ignored. A requested depth of "unknown" means
"only send information about the stuff in my report,
depth-aware-ily". Other requested depth values are honored by
the server properly, and the DAC must handle the transformation
of any working copy depths from their pre-update to their
post-update depths and content as described in `3. Examples'.

Non-depth-aware Clients (NDACs)

NDACs will never transmit reported depths and never transmit a
requested depth. But they will transmit a requested recurse
value (either "yes" or "no", with "yes" being the default). (A
DAS uses the presence of a requested depth in the actual protocol
to distinguish DACs from NDACs, and knows to ignore the
requested recurse value transmitted by a DAC.)

When speaking to an NDAS, what happens happens. It's the past,
man -- you don't get to define the interaction this late in the

When speaking to a DAS, the not-reported depths are treated like
reported depths of "infinity", and the requested recurse values
"yes" and "no" map to depths of "infinity" and "files",

The sparse-directories code was merged to trunk in r23994.

A new enum type, 'svn_depth_t', is defined in svn_types.h.
Both client and server side now understand the concept of depth,
and the basic update use cases handle depth. See depth_tests.py
for what is known to be working. (Many cases are not yet tested,
and almost certainly some of them will fail right now.)

On the client side, most of the svn_client.h interfaces that
formerly took 'svn_boolean_t recurse' now take 'svn_depth_t depth'.
(The -N option is deprecated, but it still works: it simply maps to
--depth=files, which results in the same behavior as -N used to.)

Some of this recurse-becomes-depth change has propagated down into
libsvn_wc, which now stores a depth field in svn_wc_entry_t (and
therefore in .svn/entries). The update reporter knows to report
differing depths to the server, in the same way it already reports
differing revisions. In other words, take the concept of "mixed
revision" working copies and extend it to "mixed depth" working

On the server side, most of the significant changes are in
libsvn_repos/reporter.c. The code that receives update reports now
receives notice of paths that have different depths from their
parent, and of course the overall update operation has a global
depth, which applies whenever not shadowed by some local depth for

The RA code on both sides knows how to send and receive depths; the
relevant svn_ra_* APIs now take depth arguments, which sometimes
supersede older 'recurse' booleans. In these cases, the RA layer
does the usual compatibility dance: receiving "recurse=FALSE" from
an older client causes the server to behave as if "depth=files"
had been transmitted.

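The compatibility dance can be pictured as a small decision on the server side. The function below is a hedged sketch rather than the actual RA code: `resolve_requested_depth` is a hypothetical name, and it models only the two rules stated in this document, namely that the presence of a requested depth marks a depth-aware client (whose recurse value is then ignored) and that an older client's recurse flag maps to "infinity" or "files".

```python
def resolve_requested_depth(requested_depth=None, recurse=True):
    """Return the depth a depth-aware server should act on.

    requested_depth -- depth string sent by a depth-aware client,
                       or None if the client sent no depth at all
    recurse         -- the legacy recurse flag (defaults to "yes")
    """
    if requested_depth is not None:
        # Depth-aware client: honor its depth, ignore recurse.
        return requested_depth
    # Older client: recurse=yes means infinity, recurse=no means files.
    return "infinity" if recurse else "files"


print(resolve_requested_depth(None, recurse=False))         # files
print(resolve_requested_depth("immediates", recurse=True))  # immediates
```
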
Outstanding issues are listed by this issue tracker query, which
matches summaries prefixed with [sparse-directories]:

<http://subversion.tigris.org/issues/buglist.cgi?component=subversion&issue_status=NEW&issue_status=STARTED&issue_status=REOPENED&short_desc=%5Bsparse-directories%5D&short_desc_type=casesubstring>