Auto-versioning Research Notes
==============================

[Note from sussman: if you don't understand RFC 2518 (WebDAV) and RFC
3253 (DeltaV) intimately, you'll probably not understand these notes.
Read the RFCs, and also read the 'webdav-general-summary' notes in
this directory as a quick review.]
Phase 1: a lone PUT results in an immediate commit. This can be done
         purely via libsvn_fs, using an auto-generated log message.
         This covers the "drag-n-drop" use-case -- when a user simply
         drops a file into a mounted repository.
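
A rough sketch of the Phase 1 idea, assuming a toy in-memory
repository. ToyRepo, autoversion_put() and the log message wording
are all made up for illustration; none of this is the libsvn_fs API.
It only shows the shape: one PUT means one txn, one write, and one
commit with a generated log message.

```python
class ToyRepo:
    """In-memory stand-in for a repository (hypothetical, not libsvn_fs)."""

    def __init__(self):
        self.revisions = [{}]   # revision 0 is the empty tree
        self.logs = [""]        # log message per revision

    @property
    def head(self):
        return len(self.revisions) - 1

    def begin_txn(self):
        # A txn starts life as a private copy of the HEAD tree.
        return dict(self.revisions[-1])

    def commit_txn(self, txn, log):
        self.revisions.append(txn)
        self.logs.append(log)
        return self.head


def autoversion_put(repo, path, body, user="anonymous"):
    """Handle a lone PUT on a public URI: txn, write, immediate commit."""
    txn = repo.begin_txn()
    verb = "modified" if path in txn else "added"
    txn[path] = body
    # The log message format here is an assumption, not a real convention.
    log = f"Autoversioning commit: {user} {verb} {path}"
    return repo.commit_txn(txn, log)
```

Each call to autoversion_put() produces exactly one new revision,
which is what a drag-n-drop client would expect to see.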
Phase 2: come up with a system for dealing with the more common
         class-2 DAV sequence: LOCK, GET, PUT, PUT, PUT, UNLOCK.
         This covers most DAV clients, such as MSOffice and OpenOffice.

At first glance, it seems that Phase 1 should be doable by simply
noticing a PUT on a public URI, and triggering a commit. But
apparently this completely circumvents the fact that mod_dav *already*
has a notion of auto-versioning, and we want to mesh with that. This
feature was added by the Rational guys, but isn't well-reviewed by
gstein. Apparently mod_dav defines a concept of whether resources are
auto-versionable, and then deals with the checkout/modify/checkin of
those resources. So *first* we need to understand the existing
system before we can do anything else, and figure out how mod_dav_svn
can act as a "provider" to that framework.

(Greg also warns: this autoversioning feature added by Rational was
done based on an OLD version of the DeltaV RFC, so watch out for
mismatches with the final RFC 3253.)

[gstein sez: Note: the reason for the auto-versioning framework is to
take the load off of the provider for modeling WebDAV's auto-vsn
concepts to clients. mod_dav itself can deal with the property
management, sequence of operations, error responses, whatnot. That
said, it is also open to change and refinement -- there is no way that
it is set in stone. That only happens once an Open Source
implementation has used it.]
Phase 2 is more complicated:

  * Greg proposed a system whereby the LOCK creates a txn, the PUTs
    only write to the txn (the txn name is the lock "token"), and the
    UNLOCK commits the txn. The problem with this is that DAV clients
    expect real locking here, and this is just a "fake out":

      - If client #1 LOCKS a file, then when client #2 does a GET,
        they should see the latest version that client #1 has PUT, not

        [gstein sez he doesn't believe that the GET sans locktoken has
        to reflect the latest PUT-with-locktoken. I disagree. See
        below for a response from the DeltaV IETF Working Group]

      - Also, if client #2 tries to work on the file, its LOCK request
        should be denied if it's already locked. Users will be mighty
        pissed if they get a LOCK on the file, but when they finally
        close MSWord, they get an out-of-date error!

        [gstein sez this is only if we take an exclusive lock. shared
        locks are more interesting. I say, yah, but so what. We only
        care about write-locks anyway, which according to 2518, are
        always exclusive, I think. shared-locks are just read-locks,
        and can be done with unversioned props.]

  * It seems that the Right Way to do this is to actually design and
    implement some kind of locking system. We've had a huuuuge
    discussion on the dev list about this, and folks like jimb and
    kfogel want the system to be more of a "communication" system,
    rather than a system for unconditionally handcuffing naughty
    users. This goal doesn't necessarily contradict the needs of DAV
    clients, however. Smart svn clients should be able to easily
    override a LOCK failure, perhaps by using some special 'Force:
    true' request header. Dumb DAV clients won't know about this
    technique, so they effectively end up with the 'handcuff' locking

    [brane sez: Exclusive and shared locks can both be used for
    communication, and which one you use depends on context --
----------------------------------------------------------------

I sent a mail off to the DeltaV working group, asking about the

Geoff Clemm came back and said, "yah, if a lock-holder does a PUT to a
locked resource, then the changes should be immediately visible to
*all* users who do a GET, whether they hold the lock token or not."

This is my (sussman's) intuition too, but it throws a big wrench into
gstein's proposal about how to do Phase 2.

[brane sez: Not really. All you have to do is maintain a list of the
public URLs of objects that were actually modified through a "locked"
PUT -- *not* the bubble-up dirs -- and you have to maintain that
anyway, if you want to implement exclusive locks. A GET will just
check that list first, and if it finds the URL, look into the
associated txn instead of HEAD.]

[ gstein: note that list is cross-txn; we probably want a new dbm in
  the REPOS/dav/ subdir. map the repos path (derived from the URL) to
  the txn-name containing the most recent copy.

  my hope was to avoid additional state like this, and encode that
  state in something like the locktoken. ]
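
A small sketch of how the LOCK-creates-a-txn proposal could be
combined with the path-to-txn map that brane and gstein describe
above, so that a GET without the lock token still sees the latest
locked PUT. Everything here (LockingRepo, its methods, using a plain
dict in place of a dbm under REPOS/dav/) is a hypothetical
in-memory model, not mod_dav_svn code.

```python
import uuid


class LockingRepo:
    """Toy model: the txn name doubles as the lock token."""

    def __init__(self):
        self.head = {}          # path -> committed content
        self.txns = {}          # txn name -> {path: content}
        self.path_to_txn = {}   # cross-txn map: path -> txn name

    def lock(self, path):
        # Exclusive lock: a second LOCK on the same path is denied.
        if path in self.path_to_txn:
            raise PermissionError(f"{path} is already locked")
        token = uuid.uuid4().hex
        self.txns[token] = {}
        self.path_to_txn[path] = token
        return token

    def put(self, path, body, token):
        if self.path_to_txn.get(path) != token:
            raise PermissionError("missing or wrong lock token")
        self.txns[token][path] = body   # write to the txn, not to HEAD

    def get(self, path):
        # Check the map first; fall back to HEAD if the path has not
        # been modified inside a txn.
        token = self.path_to_txn.get(path)
        if token is not None and path in self.txns[token]:
            return self.txns[token][path]
        return self.head[path]

    def unlock(self, path, token):
        if self.path_to_txn.get(path) != token:
            raise PermissionError("missing or wrong lock token")
        self.head.update(self.txns.pop(token))   # "commit" the txn
        del self.path_to_txn[path]
```

The cost is exactly the extra state gstein hoped to avoid: the map
lives outside any single txn and has to be kept in sync on every
LOCK and UNLOCK.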
----------------------------------------------------------------

Here are some thoughts Bill Tutt and I shared on IRC some time
ago. They're more about locking than auto-versioning, but the two
concepts are related, so this brain dump might as well go in here.

<<<It's pretty late/early right now, so I'll just dump Bill's mail in
here for reference, and edit it later.>>>

From: "Bill Tutt" <billtut@microsoft.com>
To: "Branko Cibej" <brane@xbc.nu>
Subject: Locks Discussion
Date: Wed, 4 Sep 2002 15:49:54 -0700
<brane> "svn edit" has other uses, too
<brane> e.g., you could check out a wc that has only checksums, not text
bases, and makes wc files read-only. "svn edit" would make them
writable, and temporarily store the text base. it doesn't have to create

<brane> "svn edit" can be completely client-side.

It could, but ideally it would just work as if it were connected. i.e.
executing "svn note" if connected, and not if not. i.e. laptop on bus

<brane> basically, your non-exclusive lock would add an unversioned
annotation to an object.
<brane> ok. so we have "svn lock", which is an exclusive lock
<brane> and "svn edit", which may or may not create locks

At a minimum annotates the file in the WC, for the "svn commit" default
log message case below. At the far out end, it would create an exclusive
lock if the file (via the pluggable diff protocol) was determined to be

<brane> and "svn note", which just adds a note to the object
<brane> and "svn lock" can also add a note to the object
<brane> and "svn unlock" takes the note away
<brane> and "svn rmnote" takes the note away, too
<brane> and "svn commit" clears locks and removes notes
<brane> and "svn commit" uses the note (if any, keyed off the username)
as the default log message
<brane> "svn note" and "svn rmnote", always contacts the server

"svn revert" now becomes "svn revert" + "svn rmnote" all rolled into

"svn rmnote" undoes (as appropriate) any annotation on a WC entry. If
created via "svn note" functionality, then the server is contacted. If
via "svn edit" disconnected client functionality, then the server is NOT

I've edited out my original comments, and inserted my own post log

Do you want a dangerous fugitive staying in your flat?

Well, don't upset him and he'll be a nice fugitive staying in your flat.
-----------------------------------------------

  * ? options response includes autoversioning feature... required?

  * all resources gain new live property: 'DAV:auto-version'. This
    property will always be set to 'DAV:checkout-checkin'. (There are
    four possible values, and this is the one that has nothing
    whatsoever to do with locking.)

  * use-case 1: PUT or PROPPATCH against existing VCR, or a PUT of a

  * use-case 2: DELETE of VCR

  * use-case 3: MKCOL (totally new, by definition)
-----------------------------------------------------------

Analysis of dav_svn_put()
=========================

At the moment, ra_dav is only attempting to PUT WR's.

mod_dav, however, already has an autoversioning infrastructure, and it
currently attempts to bookend the stream-writing with an auto-checkout
and auto-checkin. But mod_dav_svn doesn't support those operations
yet, so they're just no-ops.

By supporting auto_checkout and auto_checkin, we're adding the magic
ability for a PUT on a VCR to happen: the VCR is magically transformed
'in place' into a WR, and then back again.
  * tries to checkout parent resource if deemed necessary, i.e. the
    resource doesn't exist, or if explicit parent checkout was

      - vsn_hooks->auto_versionable()

        We should *always* return DAV_AUTO_VERSION_ALWAYS for now.
        The other values require that locks exist or not, and we're
        not supporting any kind of locks yet.

      - vsn_hooks->checkout(parent, 1 /*auto-checkout*/...)

        So we need to allow an auto-checkout of a parent VCR.
        See checkout() discussion below.

  * if the resource doesn't exist, then create the resource:

      - vsn_hooks->vsn_control(resource, NULL).

        We need to implement this from scratch. For now, we only
        allow a NULL target, which means, 'create an empty file'. The
        resource itself must be tweaked in-place into a true VCR.

  * if the resource exists but isn't a WR, check it out:

      - vsn_hooks->checkout(resource, 1 /*auto-checkout*/...)

        This routine currently takes a VR and an activity, and returns

        Here's what we need to make happen if we get 'auto-checkout'

          - verify we have a VCR, and get the VCR's VR.
          - create a new activity (txn)
          - checkout the VR into the activity, creating a WR.
          - don't return the WR via pointer, but instead tweak the
            VCR to look like the WR (think about how to do this.)
            [ gstein: the docco for checkout() states you're allowed
              to tweak the passed-in resource; that is why it is

dav_svn_put() then attempts to push data into the WR's stream, no prob.

  * if something went wrong when PUTting data into the resource's
    stream, then this function attempts to either

      - vsn_hooks->uncheckout() [if a resource or parent was checked out]

        I guess we would abort the svn txn and magically change the WR back
        into the VCR? (think about how to do this.)

        [ gstein: the dav_resource is non-const; just change it. we
          aren't talking a stateful change, just altering a runtime

      - vsn_hooks->remove_resource() [if a new resource was created]

        No prob. This just calls svn_fs_delete_tree() on the newly

  * otherwise, in normal case, if resource was checked out:

      - vsn_hooks->checkin(resource)

        Need to write this routine! It would commit the txn hidden
        within the WR, using an auto-generated log message.
        Furthermore, it needs to possibly return the new VR that was
        created, and convert the WR resource back into a VCR that
        points to the new VR.

        (Do our VCR's point to VR's right now?

          [ gstein: VCRs never "point"; semantically, they just get
            updated with properties and content to match a VR. ]

        just implicitly through the checked-in property, right?)

  * then, if parent was checked out too,

      - vsn_hooks->checkin(parent)

        Oops, this is a problem. It's very likely that we just
        committed the txn in the previous call to checkin(). The best
        strategy here, I suppose, is to not throw an error... i.e. if
        the txn no longer exists, just do nothing. (cmpilato isn't
        sure what happens if you try to open_txn() on a txn that is

  [ gstein: mod_dav should auto-checkin a set of resources rather
    than one at a time. the provider can then do it atomically,
    or one at a time, as they see fit ]
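
The ordering walked through above can be modelled in a few lines.
This is only a sketch under loud assumptions: Provider stands in for
the vsn_hooks table, the parent and the resource are assumed to share
one txn, creation of a new resource is folded into checkout() rather
than modelled as separate vsn_control()/remove_resource() calls, and
a "fail" flag simulates a broken stream write. The point is the
in-place VCR/WR flip and the second checkin() quietly doing nothing
once the txn is already committed.

```python
class Txn:
    def __init__(self):
        self.changes = {}       # path -> new content
        self.committed = False


class Resource:
    def __init__(self, path):
        self.path = path
        self.kind = "VCR"       # flipped in place to "WR" while checked out
        self.txn = None


class Provider:
    """Hypothetical stand-in for the vsn_hooks table."""

    def __init__(self):
        self.head = {}          # committed tree: path -> content
        self.calls = []         # record of hook calls, for inspection

    def checkout(self, res, txn):
        self.calls.append(("checkout", res.path))
        res.kind, res.txn = "WR", txn        # tweak the resource in place

    def uncheckout(self, res):
        self.calls.append(("uncheckout", res.path))
        res.kind, res.txn = "VCR", None      # abandon the txn

    def checkin(self, res):
        self.calls.append(("checkin", res.path))
        txn = res.txn
        # If the txn was already committed by an earlier checkin, do nothing.
        if txn is not None and not txn.committed:
            self.head.update(txn.changes)
            txn.committed = True
        res.kind, res.txn = "VCR", None


def put(provider, parent, res, body, fail=False):
    """Auto-checkout, write, then auto-checkin (or roll back on error)."""
    txn = Txn()
    parent_checked_out = res.path not in provider.head
    if parent_checked_out:
        provider.checkout(parent, txn)
    provider.checkout(res, txn)
    try:
        if fail:
            raise IOError("simulated stream write failure")
        txn.changes[res.path] = body
    except IOError:
        provider.uncheckout(res)
        if parent_checked_out:
            provider.uncheckout(parent)
        return False
    provider.checkin(res)
    if parent_checked_out:
        provider.checkin(parent)             # txn already committed: no-op
    return True
```

gstein's suggestion of checking in a set of resources at once would
remove the need for the "already committed" special case entirely.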
[ gstein: note that we're more than likely going to need to update the
  mod_dav provider APIs. I think the answer is to add a binary API
  version to the new ap_provider() interface, to publish a mod_dav
  provider (binary) API version, and to state that the old provider
  registration function now throws an error (by definition, modules
  using it would be obsolete). as we rev the API, we just bump the
  published mod_dav API version.

  one problem here is that the current httpd release strategy might
  get in our way; I need to review some of the recent decisions to see
  how that affects us from an ongoing "httpd needs some fixes for svn"

-----------------------------------------------------------
We're working on a real locking system now. Eventually, we'll be
able to use this feature to complete autoversioning ("phase 2"

  - remember that we'll need to be able to look up a lock in the
    lock-table by UUID. Generic DAV clients use UUID URIs to talk

  - MSWord locks a document with a timeout of 180 seconds, then
    continuously re-LOCKs every so often, passing the existing
    lock-token back in an If: header. mod_dav_fs returns the same
    lock-token UUID (presumably with a newer expiration time). Our
    current implementation doesn't allow for mutable lock tokens. We
    need to make sure that this doesn't mess up MSWord... that it's
    using the *last* token to renew locks, not the first one.
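
To make the two refresh behaviours concrete, here is a toy lock
table keyed by token. LockTable and the reuse_token flag are
inventions for this sketch; only the "opaquelocktoken:" URI scheme
comes from RFC 2518. With reuse_token=True a re-LOCK keeps the token
and pushes out the expiration, as mod_dav_fs is described as doing
above; with reuse_token=False the old token is retired and a fresh
one is issued, which is the case where a client must renew with the
*last* token it was given.

```python
import time
import uuid


class LockTable:
    """Hypothetical lock table; locks are looked up by token."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.by_token = {}      # token -> (path, expires_at)

    def lock(self, path, timeout):
        token = "opaquelocktoken:" + str(uuid.uuid4())
        self.by_token[token] = (path, self.clock() + timeout)
        return token

    def refresh(self, token, timeout, reuse_token=True):
        # Raises KeyError if the client presents an unknown/stale token.
        path, _old_expiry = self.by_token[token]
        if not reuse_token:
            # Immutable-token model: retire the old token, issue a new one.
            del self.by_token[token]
            token = "opaquelocktoken:" + str(uuid.uuid4())
        self.by_token[token] = (path, self.clock() + timeout)
        return token
```

In the immutable-token model, a client that keeps presenting its
first token hits the KeyError path on the second refresh, which is
the MSWord failure mode worth testing for.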