notes/webdav-general-summary

   1
   2 Ben's Quick Summary of WebDAV and DeltaV
   3 =========================================
   4
   5 * WebDAV: RFC 2518.  Extends the standard HTTP methods to make web
   6   servers behave as traditional fileservers, complete with a locking
   7   model and meta-data properties.
   8
   9 * DeltaV: RFC 3253.  Adds more HTTP methods to WebDAV, introducing
  10   versioning concepts.  Provides a number of flexible versioning
  11   models that servers can support, and some backwards-compatibility
  12   modes for older WebDAV or HTTP/1.1 clients.
  13
  14
  15 ----------------------------------------
  16
  17 WebDAV
  18 ======
  19
  20 Key concepts introduced:  properties, collections, locking.
  21
  22 New HTTP client request headers:  {Depth, Destination, If, ...}
  23 New HTTP server response headers: {DAV, ...}
  24
  25
  26 * Property:    a meta-data name/value.  every property exists in
  27                some unique "namespace", defined using xml namespaces.
  28
  29   - a "live" property is one that is controlled by the server, like a
  30     file's content-length, for example, or a file's
  31     checked-in/checked-out state.  often the property is read-only; if
  32     not, the server enforces the propval's syntax/semantics.
  33
  34   - a "dead" property is one that is invented and controlled by a
  35     user, just like file contents.
  36
  37   - new HTTP methods:  PROPFIND to read props, PROPPATCH to set them.
  38
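A rough sketch of what a PROPFIND request body might look like, built with
Python's standard xml.etree module.  The helper name propfind_body and the
http://example.com/ns namespace are made up for illustration;
DAV:getcontentlength is a live prop, and the example.com one stands in for a
dead prop that some user invented.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"                       # namespace of the props defined by the RFC
CUSTOM = "http://example.com/ns"   # hypothetical namespace for a dead prop


def propfind_body(props):
    """Build a PROPFIND body asking for the given (namespace, name) props."""
    root = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV}}}prop")
    for namespace, name in props:
        # Every property name is qualified by its XML namespace.
        ET.SubElement(prop, f"{{{namespace}}}{name}")
    return ET.tostring(root, encoding="unicode")


body = propfind_body([(DAV, "getcontentlength"), (CUSTOM, "reviewer")])
print(body)
```

The same element layout, wrapped in a propertyupdate/set element instead of
propfind, is roughly what a PROPPATCH carries.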
  39
  40 * collection:  a directory.  contains a bunch of URIs and has props.
  41
  42   - each child is called a 'member' URI.  each internal member URI
  43     must be relative to parent collection.
  44
  45   - collection URIs are supposed to end with trailing slashes.
  46     servers should auto-append them if not present.
  47
  48   - new HTTP method:  MKCOL to create collection.
  49
  50
  51 * locking:  a way of serializing access to a resource.
  52
  53   - locking is totally optional -- the only 'flexible' part of the
  54     WebDAV spec.  a WebDAV server may support locking to any degree:
  55     either not at all, or some combination of exclusive or shared
  56     locks.  An OPTIONS response can return a header of DAV: 1 or DAV:
  57     2.  Level-2 support means locking is available.
  58
  59   - new HTTP method: LOCK.  creates a lock and attaches it to the
  60     resource.  the server returns a 'lock token' to the client, which
  61     is defined to be any universally unique URI.  the 'lock' attached
  62     to the resource has these properties:
  63
  64       * owner:   some authenticated username
  65       * token:   the specific lock identifier
  66       * scope:   either "exclusive" or "shared"
  67       * type:    "write".  [other types may exist someday]
  68       * depth:   for a collection, either 0 or infinity.
  69       * timeout: some value in seconds
  70
  71        - exclusive locks behave how you think -- only one per resource
  72          allowed.  shared locks, on the other hand, are just for
  73          communication -- any number of them can be attached.
  74
  75        - lock tokens are *not* secret: anyone can query the
  76          "DAV:lockdiscovery" property to see all the locks attached to
  77          a resource, which includes detailed descriptions of every
  78          field above.
  79
  80        - to remove a lock with UNLOCK, or to modify something with an
  81          exclusive lock, the client must provide *two* things:
  82
  83             1. authentication/authorization.  prove you own and/or are
  84                allowed to mess with the lock.  this happens via
  85                existing HTTP methods.
  86
  87             2. the lock token, i.e. the "name" of the lock.  (this
  88                requirement also prevents some non-DAV aware program
  89                from using your auth credentials and accidentally doing
  90                an ignorant PUT.  think of it as credentials for your
  91                client software!)
  92
  93        - 'DAV:supportedlock' live property: indicates what kinds of
  94          locking are allowed on a resource.
  95
  96        - the rfc defines an 'opaquelocktoken' scheme that all dav
  97          servers must know how to understand: clients may generate and
  98          post them in an If: header.
  99
 100        - a collection can have a lock of either Depth 0 or Infinity.
 101          a lock on a collection prevents adding/removing member URIs.
 102          if a lock-holder adds something to a deeply locked
 103          collection, then the newly added member becomes part of the
 104          same write lock.
 105
 106        - a 'null resource' (which normally returns 404) can be locked,
 107          in order to reserve a name.  see RFC 2518, section 7.4.
 108
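To make the lock fields above a bit more concrete, here is a minimal sketch of
the pieces a client sends.  The function names are hypothetical, and the token
below is only a stand-in for whatever the server hands back from LOCK.  Nothing
here talks to a network; it just builds the body and headers.

```python
import uuid
import xml.etree.ElementTree as ET


def lockinfo_body(scope, owner):
    """Build a LOCK request body for a write lock with the given scope."""
    if scope not in ("exclusive", "shared"):
        raise ValueError("scope must be 'exclusive' or 'shared'")
    root = ET.Element("{DAV:}lockinfo")
    ET.SubElement(ET.SubElement(root, "{DAV:}lockscope"), "{DAV:}" + scope)
    ET.SubElement(ET.SubElement(root, "{DAV:}locktype"), "{DAV:}write")
    ET.SubElement(root, "{DAV:}owner").text = owner
    return ET.tostring(root, encoding="unicode")


def lock_headers(depth="infinity", timeout_seconds=600):
    """Headers that go with LOCK: how deep, and for how long."""
    if depth not in ("0", "infinity"):
        raise ValueError("a lock is either Depth 0 or infinity")
    return {"Depth": depth, "Timeout": f"Second-{timeout_seconds}"}


def if_header(lock_token):
    """Value of the If: header that proves the client knows the token."""
    return f"(<{lock_token}>)"


def unlock_headers(lock_token):
    """UNLOCK names the lock to remove in a Lock-Token: header."""
    return {"Lock-Token": f"<{lock_token}>"}


# Stand-in for a server-issued token in the 'opaquelocktoken' scheme.
token = f"opaquelocktoken:{uuid.uuid4()}"
print(lockinfo_body("exclusive", "ben"))
print(lock_headers())
print(if_header(token))
```

A later PUT on the locked resource would carry the If: value, on top of the
usual HTTP authentication.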
 109
 110 * other methods added by WebDAV:
 111
 112    - COPY:  - copies resource to Destination: header.
 113             - optional "Overwrite: [T | F]" header defaults to T.
 114             - for collections, Depth: [0 | infinity] is allowed.
 115             - client can specify how to behave when copying props.
 116
 117    - MOVE:  - defined to be COPY + DELETE, but an atomic operation.
 118
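The COPY headers described above can be sketched as a small helper.  The
function name and the example URL are made up; only the header names and their
values come from the notes.

```python
def copy_headers(destination, overwrite=True, depth=None):
    """Build the request headers for a COPY."""
    headers = {
        "Destination": destination,
        # Overwrite defaults to T when the header is absent; we send it
        # explicitly so the intent is visible.
        "Overwrite": "T" if overwrite else "F",
    }
    if depth is not None:
        if depth not in ("0", "infinity"):
            raise ValueError("COPY on a collection takes Depth 0 or infinity")
        headers["Depth"] = depth
    return headers


print(copy_headers("http://example.com/dav/backup/", overwrite=False,
                   depth="infinity"))
```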
 119
 120 -------------------------------------------------------------------------
 121
 122 DeltaV
 123 ======
 124
 125 Models
 126 ======
 127
 128 A DeltaV server can support two different ways of working: server-side
 129 working copies, and client-side working copies.  These systems aren't
 130 mutually exclusive at all.  An OPTIONS request reveals which systems
 131 the server supports.
 132
 133
 134 The General Concepts
 135 ====================
 136
 137 If you understand this, everything will become really clear.  These
 138 are the fundamentals.
 139
 140 DeltaV allows you to version any kind of resource -- a file, a
 141 collection, whatever.
 142
 143  * If you take a resource on a server and put it under version control
 144    (using the VERSION-CONTROL method), a "Version Controlled
 145    Resource", or VCR, is created.  A VCR is a special thing: it's a
 146    unique, permanent URL used to talk about an entity under version
 147    control, no matter how many times it changes.
 148
 149  * Every time you change a VCR (discussed below), a new "Version
 150    Resource" is created, or VR.  The VR is also a unique, permanent
 151    URL, representing some immutable object on the server; it
 152    represents the contents and (dead) properties of the VCR at one
 153    particular moment in time.
 154
 155  * At any given time, a VCR has a "pointer" to some particular VR of
 156    itself.  The pointer is just a property, called "DAV:checked-in".
 157    By definition, the contents of the VCR are always equal to the
 158    contents of the VR it points to.  If you change the pointer to a
 159    different VR, the VCR's contents magically change to match.
 160
 161  * All of a VCR's VR objects need to be organized somehow.  And in
 162    fact, they *are* organized into a little tree of predecessors and
 163    successors.  It turns out that every VCR has a "history" resource
 164    sitting in the background.  (The history resource may or may not be
 165    directly accessible, depending on whether the server supports the
 166    'Version History' feature.)  Regardless, a VCR's history resource
 167    is a container that contains all of the VRs, organized into a
 168    tree.  You might think of a history resource like an RCS
 169    file... except that the history is allowed to contain 'forks',
 170    i.e. a VR in the history might have multiple predecessors or
 171    successors.  Also, each VR in a history can have a human-readable
 172    "label" attached to it, so it's easier to talk about which VR you
 173    want.
 174
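If it helps, here is a toy in-memory model of the concepts above: immutable
VRs, a history that may fork, labels, and the DAV:checked-in pointer.  All the
class names, attribute names and URLs are invented for the sketch; a real
server stores these as resources and properties, not Python objects.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VersionResource:
    """An immutable snapshot (VR) of a VCR at one moment in time."""
    url: str
    content: str
    predecessors: tuple = ()    # URLs of the VRs this one was derived from


@dataclass
class VersionControlledResource:
    url: str
    history: dict = field(default_factory=dict)   # VR url -> VersionResource
    labels: dict = field(default_factory=dict)    # label -> VR url
    checked_in: str = ""                          # the DAV:checked-in pointer

    def add_version(self, vr, label=None):
        self.history[vr.url] = vr
        if label is not None:
            self.labels[label] = vr.url

    @property
    def content(self):
        # By definition the VCR's contents equal those of the VR it points to.
        return self.history[self.checked_in].content

    def successors(self, vr_url):
        return [v.url for v in self.history.values()
                if vr_url in v.predecessors]


vcr = VersionControlledResource("/repo/foo.c")
vcr.add_version(VersionResource("/his/1/ver/1", "v1"), label="initial")
vcr.add_version(VersionResource("/his/1/ver/2", "v2", ("/his/1/ver/1",)))
vcr.add_version(VersionResource("/his/1/ver/3", "v3", ("/his/1/ver/1",)))

vcr.checked_in = "/his/1/ver/2"
print(vcr.content)
vcr.checked_in = vcr.labels["initial"]   # move the pointer; contents follow
print(vcr.content)
print(vcr.successors("/his/1/ver/1"))    # two successors, i.e. a fork
```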
 175
 176 Changing a VCR
 177 ==============
 178
 179 So, how do you make a change to a VCR, then?  It all depends on what
 180 deltaV features the server supports.
 181
 182  * If the user is using the server-side working-copy model:
 183
 184      - The client creates something called a 'workspace', using
 185        MKWORKSPACE.
 186
 187      - CHECKOUT a VCR into the workspace.  The VCR's 'DAV:checked-in'
 188        property suddenly becomes a 'DAV:checked-out' property... but
 189        it still points to the same VR.
 190
 191      - Use PUT and PROPPATCH to change the contents or dead props of
 192        the VCR.  If you want to revert everything, just UNCHECKOUT.
 193
 194      - CHECKIN the VCR.  A new VR is created in the VCR's history, and
 195        the 'DAV:checked-out' property becomes a 'DAV:checked-in'
 196        property, pointing to the new VR.
 197
 198  * If the user is using the client-side working-copy model:
 199
 200      - The client creates something called an 'activity', using
 201        MKACTIVITY.
 202
 203      - CHECKOUT a VR into the activity.  This creates a temporary
 204        'working resource' (WR) in the activity.  The VCR's
 205        'DAV:checked-in' property suddenly becomes a 'DAV:checked-out'
 206        property... but it still points to the same VR.  The WR has a
 207        'DAV:checked-out' property that points to VR as well.
 208
 209      - Use PUT and PROPPATCH to change the contents or dead props of
 210        the WR.  If you want to revert everything, just UNCHECKOUT.
 211
 212      - CHECKIN the WR.  A new VR is created in the VCR's history, and
 213        the VCR's 'DAV:checked-in' property points to it.  And
 214        normally, the temporary WR is deleted.
 215
 216 See?  Not such a big deal.  Ahem.
 217
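The checked-in/checked-out dance for the first (in-place) model can be sketched
as a tiny state machine.  This is only an illustration with invented names:
versions are plain strings, and the two pointer attributes stand in for the
DAV:checked-in and DAV:checked-out properties.  In the client-side model the
PUT would land on a separate WR instead of on the VCR itself.

```python
class InPlaceVCR:
    """Toy VCR that is modified in place between CHECKOUT and CHECKIN."""

    def __init__(self, initial_content):
        self.versions = [initial_content]   # VR contents, oldest first
        self.checked_in = 0                 # index of the VR, or None
        self.checked_out = None             # same VR while checked out
        self.content = initial_content

    def _require_checked_out(self):
        if self.checked_out is None:
            raise RuntimeError("resource is checked in; CHECKOUT first")

    def checkout(self):
        if self.checked_out is not None:
            raise RuntimeError("already checked out")
        # The pointer changes name but still refers to the same VR.
        self.checked_out, self.checked_in = self.checked_in, None

    def put(self, body):
        self._require_checked_out()
        self.content = body

    def uncheckout(self):
        self._require_checked_out()
        self.checked_in, self.checked_out = self.checked_out, None
        self.content = self.versions[self.checked_in]   # revert everything

    def checkin(self):
        self._require_checked_out()
        self.versions.append(self.content)              # a new VR is born
        self.checked_in, self.checked_out = len(self.versions) - 1, None


vcr = InPlaceVCR("hello")
vcr.checkout()
vcr.put("scratch")
vcr.uncheckout()          # back to "hello", no new VR
vcr.checkout()
vcr.put("hello, world")
vcr.checkin()             # history now holds two VRs
print(vcr.versions, vcr.checked_in)
```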
 218
 219 Auto-Versioning
 220 ===============
 221
 222 What if some regular WebDAV client tries to use a deltaV server?  Or
 223 an even dumber HTTP 1.1 client?
 224
 225 If the server supports the 'auto-versioning' feature, then all
 226 resources gain a new live property called 'DAV:auto-version'.  The
 227 value of this property indicates how the server should behave when a
 228 non-deltaV client does an ignorant PUT or PROPPATCH on a resource.  I
 229 won't go into detail, but there are many possible behaviors:
 230
 231   * do an implicit (auto-) CHECKOUT and CHECKIN.
 232   * auto-CHECKOUT, and wait for a lock to vanish before auto-CHECKIN.
 233   * same as above, but if not locked, wait for an explicit CHECKIN.
 234   * require a lock.  LOCK causes auto-CHECKOUT, UNLOCK causes auto-CHECKIN.
 235
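The four behaviors can be written down as a small decision function.  Treat
this as a sketch: the value names are the ones I recall from RFC 3253, so
double-check them against the spec, and the "immediately" answer for the
unlocked case of the second mode is my reading rather than something stated
above.  The function name and the returned strings are made up.

```python
def auto_checkin_trigger(mode, locked):
    """When does a resource auto-checked-out by a plain PUT get checked in?

    mode   -- value of the DAV:auto-version property (assumed names)
    locked -- whether the resource is write-locked at the time of the PUT
    """
    if mode == "checkout-checkin":
        return "immediately"
    if mode == "checkout-unlocked-checkin":
        return "on UNLOCK" if locked else "immediately"
    if mode == "checkout":
        return "on UNLOCK" if locked else "on explicit CHECKIN"
    if mode == "locked-checkout":
        if not locked:
            raise PermissionError("this mode requires a write lock")
        return "on UNLOCK"
    raise ValueError(f"unknown auto-version mode: {mode!r}")


for mode in ("checkout-checkin", "checkout-unlocked-checkin", "checkout"):
    print(mode, "->", auto_checkin_trigger(mode, locked=False))
```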
 236
 237
 238 Basic Features
 239 ==============
 240
 241 DeltaV has a bunch of "basic features", and a bunch of "advanced
 242 features".  Here are the basic features, in a nutshell.
 243
 244
 245 * Version Control feature
 246
 247     * new VERSION-CONTROL method to create a VCR.
 248
 249     * resources gain a whole bunch of new live props (not all listed
 250       here), including DAV:checked-[in|out],
 251       DAV:auto-version, DAV:comment, the author.  VRs have properties
 252       that describe lists of successor and predecessor VRs.
 253
 254     * new REPORT method.  two 'standard' reports are defined, but
 255       custom reports can be created.
 256
 257
 258 * Checkout-in-place feature
 259
 260     * new CHECKOUT, CHECKIN, UNCHECKOUT methods, which are able to
 261       modify VCRs in-place.
 262
 263
 264 * Version History feature
 265
 266     * version histories become tangible URLs.  introduce new dav
 267       resourcetype called 'DAV:version-history'.
 268
 269     * all VCRs and VRs gain a 'DAV:version-history' prop that points
 270       to their history resource.
 271
 272     * a version-history has a 'DAV:version-set' property that lists
 273       all VRs it contains, and a 'DAV:root-version' that points to the
 274       very first VR in the history.
 275
 276     * a special REPORT allows one to convert a version-history URL
 277       into the VCR it represents.  (i.e. reverse-lookup.)
 278
 279
 280 * Workspace feature
 281
 282     * MKWORKSPACE creates a server-side working area.  an OPTIONS
 283       request can tell you where the client is allowed to do this.
 284
 285     * the workspace resource has a property that lists all the
 286       resources it contains.  regular resources have a property
 287       indicating what workspace they're in.
 288
 289     * The workspace can hold unversioned items put there by PUT & MKCOL.
 290       It can hold VCRs via CHECKOUT.
 291
 292     * Special:  the VERSION-CONTROL method can create a *new* VCR from
 293       a history.  If two people both CHECKIN VCRs created from the
 294       same history resource, then poof... the history develops forks!
 295
 296
 297 * Update feature
 298
 299     * UPDATE method is able to tweak a VCR to "point" to a new VR.
 300       Very simple!
 301
 302
 303 * Label feature
 304
 305     * LABEL method allows you to attach a human-readable name to a
 306       particular VR.
 307
 308     * Each VR can have many names.  They're listed in a
 309       'DAV:label-name-set' property.
 310
 311     * New http request header, "Label:", can be used to target
 312       a specific VR of a VCR.  This works when doing a GET of a VCR.
 313       It also works as you'd expect on COPY, CHECKOUT, UPDATE, etc.
 314
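One way to picture what the "Label:" header does on the server side: look the
label up in each VR's DAV:label-name-set and serve that VR instead of the
checked-in one.  The function, its arguments and the URLs are all invented for
this sketch; it is not how any particular server implements it.

```python
def select_version(label_sets, checked_in, label=None):
    """Pick the VR a request on a VCR should be served from.

    label_sets -- maps each VR url to its set of label names
    checked_in -- the VR the VCR currently points to
    label      -- value of the Label: request header, if any
    """
    if label is None:
        return checked_in
    for vr_url, names in label_sets.items():
        if label in names:
            return vr_url
    raise LookupError(f"no version labelled {label!r}")


labels = {
    "/his/1/ver/1": {"initial", "release-1.0"},
    "/his/1/ver/2": {"release-1.1"},
    "/his/1/ver/3": set(),
}
print(select_version(labels, "/his/1/ver/3"))
print(select_version(labels, "/his/1/ver/3", label="release-1.0"))
```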
 315
 316 * Working Resource feature
 317
 318     * This feature essentially allows client-side working copies to
 319       synchronize their data with the server.
 320
 321     * all VRs gain two properties that control whether or not
 322       histories can (or should) contain forks.
 323
 324     * a CHECKOUT of a VR creates a temporary 'working resource' (WR),
 325       which can then be modified.  When the WR is checked in, a new VR
 326       is created as usual, the WR vanishes, and the VCR is updated to
 327       point to the VR as usual.
 328
 329     * note that this technique is an alternative to the
 330       Checkout-in-place feature, whereby VCRs are directly checked out
 331       and modified.
 332
 333
 334
 335 Advanced Features
 336 =================
 337
 338 The advanced features of deltaV introduce a bunch of new concepts.
 339 Here are the fundamentals.
 340
 341 [Whenever I say, "points to", I'm talking about some object leading to
 342 another object via a specific property.]
 343
 344 * A "configuration" is a set of VCRs.  In particular, it contains a
 345   "root collection" which organizes the VCRs in some way.
 346
 347   Note that this is _different_ than a versioned collection.  The main
 348   difference is that a collection is a *single* resource which
 349   contains dead-props and some directory-entries; its VRs just capture
 350   various states of the props and dirents.  But it's just ONE
 351   resource.  A configuration, however, is a SET of VCRs.  The VCRs may
 352   not necessarily be related to each other, either.  A configuration
 353   is a flexible thing -- its VCRs can be tweaked to point to
 354   different VRs, however you want, with no versioning happening in the
 355   background.  A collection, on the other hand, has a static set of
 356   dirents; to change them, you have to do a CHECKOUT, CHECKIN, which
 357   results in a new, static collection VR.
 358
 359 * A "baseline" is a special kind of resource which remembers the
 360   state of a configuration... it knows exactly which VR each VCR in
 361   the configuration should point to.  Just like a VR is a 'snapshot'
 362   of a VCR, a baseline is a 'snapshot' of the configuration.  And just
 363   like a VR, a baseline can have a human label too.
 364
 365 * Another kind of resource is a "version controlled configuration", or
 366   VCC.  This resource floats out in space;  its sole purpose is to
 367   magically connect a configuration to a baseline.   Specifically,
 368   each VCR in the configuration points to the VCC, and the VCC points
 369   to a baseline.
 370
 371   And here's the usual magic: if you make the VCC point to a different
 372   baseline, then poof, the whole configuration suddenly switches to
 373   the baseline.  (That is, all of the configuration's VCRs suddenly
 374   point to the specific VRs of the baseline.)
 375
 376 * Finally, it's worth mentioning that a baseline resource points to a
 377   "baseline collection" resource.  This collection is a tree made up
 378   of the VRs in the baseline, easily browseable.  You can think of it
 379   as a "what-if" sort of preview -- i.e. "what would the configuration
 380   look like if I made its VCC point to this baseline?"  It also means
 381   people can view a baseline in action, *without* having to tweak a
 382   VCC, which might require write access of some kind.
 383
 384
 385 Got all that?  Good.  Make some pictures.  :-)
 386
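In lieu of pictures, a toy model of how a configuration, its baselines and the
VCC hang together.  Class names, attribute names and URLs are all hypothetical;
the point is only that a baseline is a frozen record of which VR each VCR
points to, and that re-pointing the VCC switches the whole configuration.

```python
from types import MappingProxyType


class Configuration:
    """A set of VCRs, each pointing at one of its VRs."""

    def __init__(self, pointers):
        self.pointers = dict(pointers)      # VCR url -> VR url

    def snapshot(self):
        """Return a baseline: a read-only record of every VCR's current VR."""
        return MappingProxyType(dict(self.pointers))


class VCC:
    """Connects a configuration to the baseline it currently reflects."""

    def __init__(self, configuration):
        self.configuration = configuration
        self.baseline = configuration.snapshot()

    def point_to(self, baseline):
        # Poof: every VCR in the configuration switches to the baseline's VRs.
        self.baseline = baseline
        self.configuration.pointers = dict(baseline)


config = Configuration({"/trunk/a.c": "/ver/a/1", "/trunk/b.c": "/ver/b/1"})
vcc = VCC(config)
first = vcc.baseline

config.pointers["/trunk/a.c"] = "/ver/a/2"   # tweak a VCR; nothing versioned
second = config.snapshot()                   # capture the new state

vcc.point_to(first)
print(config.pointers["/trunk/a.c"])
```

Reading a baseline without calling point_to is the moral equivalent of
browsing its "baseline collection": you see the state without touching the VCC.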
 387
 388 How to create new baselines
 389 ===========================
 390
 391 The "in-place" method:
 392
 393    Get this.  A VCC is really just a special kind of VCR!  But it's a
 394    VCR which represents the *whole state* of a configuration.  Just
 395    like a normal VCR, the VCC's "DAV:checked-in" property points to a
 396    baseline, which is just a special kind of VR.
 397
 398    That means you can do a CHECKOUT of the VCC in-place... then tweak
 399    the configuration to point to a new set of VRs... then CHECKIN the
 400    VCC.  Poof, a new baseline is created which captures your new
 401    configuration state.  And the VCC now points to that new baseline.
 402
 403 The "working resource" method:
 404
 405    Okay, so a baseline is a special kind of VR.  Fine, so we do a
 406    CHECKOUT of it, and get a "working baseline", which is a special kind
 407    of WR.
 408
 409    Now, assuming you're using this method all around, you checkout the
 410    configuration's various VRs as WRs, modify the WRs, and check them
 411    back in to create new VRs.  Finally, you CHECKIN the working
 412    baseline, which creates a new baseline that captures the state of
 413    the configuration.  (The working baseline isn't something you tweak
 414    directly;  it's more like a token used at CHECKIN time.)
 415
 416
 417 How Merging Works... at least for SVN.
 418 ======================================
 419
 420 The deltaV MERGE command is very fancy.  It tracks merge ancestors in
 421 properties, and sets flags for clients to manually resolve conflicts
 422 on the server.
 423
 424 Subversion uses MERGE in a simpler way:
 425
 426   1. We checkout a bunch of VRs into an activity, and patch them as a
 427      bunch of WRs.
 428
 429   2. We checkout a "working baseline" into the activity, from whatever
 430      baseline represents the HEAD svn revision.
 431
 432   3. We issue a MERGE request with the activity as the source.
 433
 434      By definition, this causes the whole activity to be
 435      auto-checked-in.  First each WR in the activity is checked-in,
 436      causing the configuration to morph.  Then the working-baseline in
 437      the activity is checked-in, which creates a new baseline that
 438      captures the configuration state.
 439
 440 Of course, mod_dav_svn doesn't actually do all the checkin stuff;  but
 441 what's important is that the *result* of the MERGE is exactly as IF
 442 all this stuff had happened.  And that's all that matters.
 443
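The three steps above boil down to an ordered list of requests, which can be
sketched like this.  The function name and every URL are invented (a real
server assigns the WR URLs itself, and where the MERGE request is sent is an
assumption here); the MKACTIVITY at the front comes from the client-side
working-copy model described earlier.

```python
def svn_commit_requests(activity_url, changes, head_baseline_url, merge_url):
    """List the (method, url) pairs for one commit, in order.

    changes -- list of (vr_url, wr_url) pairs; wr_url stands for the working
               resource the server would hand back from the CHECKOUT
    """
    requests = [("MKACTIVITY", activity_url)]
    for vr_url, wr_url in changes:
        requests.append(("CHECKOUT", vr_url))   # step 1: VR becomes a WR
        requests.append(("PUT", wr_url))        #         patch the WR
    requests.append(("CHECKOUT", head_baseline_url))   # step 2
    requests.append(("MERGE", merge_url))              # step 3
    return requests


plan = svn_commit_requests(
    "/act/1234",
    [("/ver/a/7", "/wrk/1234/a.c"), ("/ver/b/3", "/wrk/1234/b.c")],
    "/bln/42",
    "/repos/trunk",
)
for method, url in plan:
    print(method, url)
```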
 444
 445
 446