This file describes the design, layouts, and file formats of a
libsvn_fs_fs repository.

Design
------

In FSFS, each committed revision is represented as an immutable file
containing the new node-revisions, contents, and changed-path
information for the revision, plus a second, changeable file
containing the revision properties.

In contrast to the BDB back end, the contents of recent revisions of
files are stored as deltas against earlier revisions, instead of the
other way around. This is less efficient for common-case checkouts,
but brings greater simplicity and robustness, as well as the
flexibility to make commits work without write access to existing
revisions. Skip-deltas and delta combination mitigate the checkout
cost.

In-progress transactions are represented with a prototype rev file
containing only the new text representations of files (appended to as
changed file contents come in), along with a separate file for each
node-revision, directory representation, or property representation
which has been changed or added in the transaction. During the final
stage of the commit, these separate files are marshalled onto the end
of the prototype rev file to form the immutable revision file.

Layout of the FS directory
--------------------------

The layout of the FS directory (the "db" subdirectory of the
repository) is:

  revs/               Subdirectory containing revs
    <shard>/          Shard directory, if sharding is in use (see below)
      <revnum>        File containing rev <revnum>
  revprops/           Subdirectory containing rev-props
    <shard>/          Shard directory, if sharding is in use (see below)
      <revnum>        File containing rev-props for <revnum>
  transactions/       Subdirectory containing transactions
    <txnid>.txn/      Directory containing transaction <txnid>
  txn-protorevs/      Subdirectory containing transaction proto-revision files
    <txnid>.rev       Proto-revision file for transaction <txnid>
    <txnid>.rev-lock  Write lock for proto-rev file
  txn-current         File containing the next transaction key
  locks/              Subdirectory containing locks
    <partial-digest>/ Subdirectory named for first 3 characters of an MD5 digest
      <digest>        File containing locks/children for path with <digest>
  node-origins/       Lazy cache of origin noderevs for nodes
    <partial-nodeid>  File containing noderev ID of origins of nodes
  current             File specifying current revision and next node/copy id
  fs-type             File identifying this filesystem as an FSFS filesystem
  write-lock          Empty file, locked to serialise writers
  txn-current-lock    Empty file, locked to serialise 'txn-current'
  uuid                File containing the UUID of the repository
  format              File containing the format number of this filesystem

Files in the revprops directory are in the hash dump format used by

The format of the "current" file is:

 * Format 3 and above: a single line of the form
   "<youngest-revision>\n" giving the youngest revision for the
   repository.

 * Format 2 and below: a single line of the form "<youngest-revision>
   <next-node-id> <next-copy-id>\n" giving the youngest revision, the
   next unique node-ID, and the next unique copy-ID for the
   repository.
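
As an illustration only, the two forms of the "current" file could be
read as sketched below. The function name parse_current and its return
shape are assumptions made for this example, not part of libsvn_fs_fs.

```python
def parse_current(line, fs_format):
    """Parse the single line of the "current" file.

    Returns (youngest_rev, next_node_id, next_copy_id).  The two IDs are
    None for format 3 and above, where the file holds only the revision.
    """
    fields = line.rstrip("\n").split(" ")
    if fs_format >= 3:
        if len(fields) != 1:
            raise ValueError("format 3+ 'current' must hold one field")
        return int(fields[0]), None, None
    if len(fields) != 3:
        raise ValueError("format 1-2 'current' must hold three fields")
    # The node-ID and copy-ID counters are base36 strings; keep them as text.
    return int(fields[0]), fields[1], fields[2]
```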

The "write-lock" file is an empty file which is locked before the
final stage of a commit and unlocked after the new "current" file has
been moved into place to indicate that a new revision is present. It
is also locked during a revprop propchange while the revprop file is
read in, mutated, and written out again. Note that readers are never
blocked by any operation - writers must ensure that the filesystem is
always in a consistent state.

The "txn-current" file is a file with a single line of text that
contains only a base-36 number. The current value will be used in the
next transaction name, along with the revision number the transaction
is based on. This sequence number ensures that transaction names are
not reused, even if the transaction is aborted and a new transaction
based on the same revision is begun. The only operation that FSFS
performs on this file is "get and increment"; the "txn-current-lock"
file is locked during this operation.
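
The "get and increment" step can be sketched as follows. This is a
minimal illustration that assumes lowercase base-36 digits and leaves
out the locking of "txn-current-lock"; the names next_key and
get_and_increment are hypothetical.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"  # assumed digit alphabet


def next_key(key):
    """Return the base-36 successor of key, e.g. "z" -> "10"."""
    value = int(key, 36) + 1
    out = ""
    while value:
        value, rem = divmod(value, 36)
        out = DIGITS[rem] + out
    return out


def get_and_increment(contents):
    """Given the text of "txn-current", return (key_to_use, new_contents).

    A real implementation would hold "txn-current-lock" around the
    read and the write; that part is omitted from this sketch.
    """
    current = contents.strip()
    return current, next_key(current) + "\n"
```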

Filesystem formats
------------------

The "format" file defines what features are permitted within the
filesystem, and indicates changes that are not backward-compatible.
It serves the same purpose as the repository file of the same name.

The filesystem format file was introduced in Subversion 1.2, and so
will not be present if the repository was created with an older
version of Subversion. An absent format file should be interpreted as
indicating a format 1 filesystem.

The format file is a single line of the form "<format number>\n",
followed by any number of lines specifying 'format options' -
additional information about the filesystem's format. Each format
option line is of the form "<option>\n" or "<option> <parameters>\n".

Clients should raise an error if they encounter an option not
permitted by the format number in use.
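
A reader of the format file might look roughly like the sketch below.
The PERMITTED table reflects the per-format rules given in this file
(only format 3 allows the "layout" option); the function name and the
returned structure are assumptions for illustration. A caller would
treat a missing file as format 1 before reaching this function.

```python
# Options permitted per format number, as described in this file.
PERMITTED = {1: set(), 2: set(), 3: {"layout"}}


def parse_format(text):
    """Parse the contents of the "format" file.

    Returns (format_number, options), where options maps each option
    name to its list of parameters.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty format file")
    fmt = int(lines[0])
    options = {}
    for line in lines[1:]:
        name, _, params = line.partition(" ")
        if name not in PERMITTED.get(fmt, set()):
            raise ValueError("option %r not permitted in format %d"
                             % (name, fmt))
        options[name] = params.split()
    return fmt, options
```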

The formats are:

  Format 1, understood by Subversion 1.1+
  Format 2, understood by Subversion 1.4+
  Format 3, understood by Subversion 1.5+

The differences between the formats are:

Delta representation in revision files
  Format 1: svndiff0 only
  Formats 2-3: svndiff0 or svndiff1

Format options
  Formats 1-2: none permitted
  Format 3: "layout" option

Transaction name reuse
  Formats 1-2: transaction names may be reused
  Format 3: transaction names generated using txn-current file

Location of proto-rev file and its lock
  Formats 1-2: transactions/<txnid>.txn/rev and
    transactions/<txnid>.txn/rev-lock.
  Format 3: txn-protorevs/<txnid>.rev and
    txn-protorevs/<txnid>.rev-lock.

Node-ID and copy-ID generation
  Formats 1-2: Node-IDs and copy-IDs are guaranteed to form a
    monotonically increasing base36 sequence using the "current"
    file.
  Format 3: Node-IDs and copy-IDs use the new revision number to
    ensure uniqueness and the "current" file just contains the
    youngest revision.

Mergeinfo metadata
  Formats 1-2: minfo-here and minfo-cnt node-revision fields are not
    stored. svn_fs_get_mergeinfo returns an error.
  Format 3: minfo-here and minfo-cnt node-revision fields are
    maintained. svn_fs_get_mergeinfo works.

Filesystem format options
-------------------------

Currently, the only recognised format option is "layout", which
specifies the paths that will be used to store the revision files and
revision property files.

The "layout" option is followed by the name of the filesystem layout
and any required parameters. The default layout, if no "layout"
keyword is specified, is the 'linear' layout.

The known layouts, and the parameters they require, are as follows:

"linear"
  Revision files and rev-prop files are named after the revision they
  represent, and are placed directly in the revs/ and revprops/
  directories. r1234 will be represented by the revision file
  revs/1234 and the rev-prop file revprops/1234.

"sharded <max-files-per-directory>"
  Revision files and rev-prop files are named after the revision they
  represent, and are placed in a subdirectory of the revs/ and
  revprops/ directories named according to the 'shard' they belong to.

  Shards are numbered from zero and contain between one and the
  maximum number of files per directory specified in the layout's
  parameters.

  For the "sharded 1000" layout, r1234 will be represented by the
  revision file revs/1234's shard path revs/1/1234 and the rev-prop
  file revprops/1/1234. The revs/0/ directory will contain revisions
  0-999, revs/1/ will contain 1000-1999, and so on.
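
The mapping from a revision number to its file paths under the two
layouts can be sketched as below. The function name rev_paths and the
tuple used to describe the layout are assumptions for this example.

```python
def rev_paths(rev, layout=("linear",)):
    """Return (revision file, rev-prop file) paths relative to the FS dir.

    layout is ("linear",) or ("sharded", "<max-files-per-directory>"),
    mirroring the words of the "layout" format option.
    """
    if layout[0] == "sharded":
        # Integer division gives the zero-based shard number.
        shard = rev // int(layout[1])
        return ("revs/%d/%d" % (shard, rev),
                "revprops/%d/%d" % (shard, rev))
    return "revs/%d" % rev, "revprops/%d" % rev
```

For example, rev_paths(1234, ("sharded", "1000")) yields the
revs/1/1234 and revprops/1/1234 paths described above.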

Node-revision IDs
-----------------

In order to support efficient lookup of node-revisions by their IDs
and to simplify the allocation of fresh node-IDs during a transaction,
we treat the fields of a node-ID in new and interesting ways.

Within a revision file, node-revs have a txn-id field of the form
"r<rev>/<offset>", to support easy lookup. New node-revision IDs
assigned within a transaction have the txn-id field of "t<txnid>".

When a new node-id or copy-id is assigned in a transaction, the ID
used is a "_" followed by a base36 number unique to the transaction.
During the final phase of a commit, node-revision IDs are rewritten
to have repository-wide unique node-ID and copy-ID fields, and to have
"r<rev>/<offset>" txn-id fields.
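
To make the two txn-id forms concrete, here is a small sketch that
splits a node-revision ID written as "<node-id>.<copy-id>.<txn-id>"
(the dotted form used in the examples in this file). The function name,
the returned dictionary, and the sample IDs are hypothetical.

```python
def parse_noderev_id(text):
    """Split a node-revision ID into its fields.

    The txn-id field is "r<rev>/<offset>" for committed node-revs and
    "t<txnid>" for node-revs that live in a transaction.
    """
    node_id, copy_id, txn_field = text.split(".", 2)
    result = {"node_id": node_id, "copy_id": copy_id,
              "rev": None, "offset": None, "txn": None}
    if txn_field.startswith("r"):
        rev, offset = txn_field[1:].split("/")
        result["rev"], result["offset"] = int(rev), int(offset)
    elif txn_field.startswith("t"):
        result["txn"] = txn_field[1:]
    else:
        raise ValueError("unrecognised txn-id field: %r" % txn_field)
    return result


def is_temporary(node_or_copy_id):
    """Temporary IDs assigned inside a transaction start with "_"."""
    return node_or_copy_id.startswith("_")
```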

In Format 3 and above, uniqueness is achieved by changing a temporary
ID of "_<base36>" to "<base36>-<rev>". Note that this means that the
originating revision of a line of history or a copy can be determined
by looking at the node ID.

In Format 2 and below, the "current" file contains global base36
node-ID and copy-ID counters; during the commit, the counter value is
added to the transaction-specific base36 ID, and the value in
"current" is adjusted.

(It is legal for Format 3 repositories to contain Format 2-style IDs;
this just prevents I/O-less node-origin-rev lookup for those nodes.)

The temporary assignment of node-ID and copy-ID fields has
implications for svn_fs_compare_ids and svn_fs_check_related. The ID
_1.0.t1 is not related to the ID _1.0.t2 even though they have the
same node-ID, because temporary node-IDs are restricted in scope to
the transactions they belong to.

There is a lazily created cache mapping from node-IDs to the full
node-revision ID where they are created. This is in the node-origins
directory; the file name is the node-ID without its last character (or
"0" for single-character node IDs) and the contents are a serialized
hash mapping from node-ID to node-revision ID. This cache is only
used for node-IDs of the pre-Format 3 style.
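
The file-naming rule for the node-origins cache amounts to the
following one-liner, shown here as a sketch with a hypothetical
function name.

```python
def node_origins_filename(node_id):
    """Name of the node-origins cache file that covers node_id.

    All node-IDs sharing everything but their last character land in
    the same file; single-character IDs share the file named "0".
    """
    return node_id[:-1] if len(node_id) > 1 else "0"
```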

Copy-IDs and copy roots
-----------------------

Copy-IDs are assigned in the same manner as they are in the BDB
implementation:

  * A node-rev resulting from a creation operation (with no copy
    history) receives the copy-ID of its parent directory.

  * A node-rev resulting from a copy operation receives a fresh
    copy-ID, as one would expect.

  * A node-rev resulting from a modification operation receives a
    copy-ID depending on whether its predecessor derives from a
    copy operation or whether it derives from a creation operation
    with no intervening copies:

      - If the predecessor does not derive from a copy, the new
        node-rev receives the copy-ID of its parent directory. If the
        node-rev is being modified through its created-path, this will
        be the same copy-ID as the predecessor node-rev has; however,
        if the node-rev is being modified through a copied ancestor
        directory (i.e. we are performing a "lazy copy"), this will be
        a different copy-ID.

      - If the predecessor derives from a copy and the node-rev is
        being modified through its created-path, the new node-rev
        receives the copy-ID of the predecessor.

      - If the predecessor derives from a copy and the node-rev is not
        being modified through its created-path, the new node-rev
        receives a fresh copy-ID. This is called a "soft copy"
        operation, as distinct from a "true copy" operation which was
        actually requested through the svn_fs interface. Soft copies
        exist to ensure that the same <node-ID,copy-ID> pair is not
        used twice within a transaction.
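
The rules above can be condensed into a small decision function. This
is only a sketch of the decision table; the function name, its
parameters, and the new_copy_id callback (standing in for whatever
allocates a fresh copy-ID) are all hypothetical.

```python
def choose_copy_id(op, parent_copy_id, new_copy_id,
                   pred_copy_id=None, pred_from_copy=False,
                   via_created_path=True):
    """Pick the copy-ID for a new node-rev.

    op is "create", "copy" or "modify".  new_copy_id is a callable
    returning a fresh copy-ID.
    """
    if op == "create":
        return parent_copy_id
    if op == "copy":
        return new_copy_id()          # true copy
    if op == "modify":
        if not pred_from_copy:
            # Covers the lazy-copy case too: the parent's copy-ID wins.
            return parent_copy_id
        if via_created_path:
            return pred_copy_id
        return new_copy_id()          # soft copy
    raise ValueError("unknown operation: %r" % op)
```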

Unlike the BDB implementation, we do not have a "copies" table.
Instead, each node-revision record contains a "copyroot" field
identifying the node-rev resulting from the true copy operation most
proximal to the node-rev. If the node-rev does not itself derive from
a copy operation, then the copyroot field identifies the copy of an
ancestor directory; if no ancestor directories derive from a copy
operation, then the copyroot field identifies the root directory of
the filesystem.

Revision file format
--------------------

A revision file contains a concatenation of various kinds of data:

  * Text and property representations
  * Node-revisions
  * The changed-path data
  * Two offsets at the very end

A representation begins with a line containing either "PLAIN\n" or
"DELTA\n" or "DELTA <rev> <offset> <length>\n", where <rev>, <offset>,
and <length> give the location of the delta base of the representation
and the amount of data it contains (not counting the header or
trailer). If no base location is given for a delta, the base is the
empty stream. After the initial line comes raw svndiff data, followed
by a cosmetic trailer "ENDREP\n".
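
The three header forms can be told apart as in the following sketch.
The function name and the shape of the return value are assumptions
made for illustration.

```python
def parse_rep_header(line):
    """Parse the first line of a representation.

    Returns (kind, base) where kind is "PLAIN" or "DELTA" and base is
    None or a (rev, offset, length) tuple locating the delta base.
    A DELTA with no base is a delta against the empty stream.
    """
    fields = line.rstrip("\n").split(" ")
    if fields == ["PLAIN"]:
        return "PLAIN", None
    if fields == ["DELTA"]:
        return "DELTA", None
    if fields[0] == "DELTA" and len(fields) == 4:
        rev, offset, length = (int(f) for f in fields[1:])
        return "DELTA", (rev, offset, length)
    raise ValueError("malformed representation header: %r" % line)
```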

If a representation is for the text contents of a directory node,
the expanded contents are in hash dump format mapping entry names to
"<type> <id>" pairs, where <type> is "file" or "dir" and <id> gives
the ID of the child node-rev.

If a representation is for a property list, the expanded contents are
in the form of a dumped hash map mapping property names to property
values.

The marshalling syntax for node-revs is a series of fields terminated
by a blank line. Fields have the syntax "<name>: <value>\n", where
<name> is a symbolic field name (each symbolic name is used only once
in a given node-rev) and <value> is the value data. Unrecognized
fields are ignored, for extensibility. The following fields are
defined:

  id          The ID of the node-rev
  type        "file" or "dir"
  pred        The ID of the predecessor node-rev
  count       Count of node-revs since the base of the node
  text        "<rev> <offset> <length> <size> <digest>" for text rep
  props       "<rev> <offset> <length> <size> <digest>" for props rep
              <rev> and <offset> give location of rep
              <length> gives length of rep, sans header and trailer
              <size> gives size of expanded rep
              <digest> gives hex MD5 digest of expanded rep
  cpath       FS pathname node was created at
  copyfrom    "<rev> <path>" of copyfrom data
  copyroot    "<rev> <created-path>" of the root of this copy
  minfo-cnt   The number of nodes under (and including) this node
              which have svn:mergeinfo.
  minfo-here  Exists if this node itself has svn:mergeinfo.

The predecessor of a node-rev crosses both soft and true copies;
together with the count field, it allows efficient determination of
the base for skip-deltas. The first node-rev of a node contains no
"pred" field. A node-revision with no properties may omit the "props"
field. A node-revision with no contents (a zero-length file or an
empty directory) may omit the "text" field. In a node-revision
resulting from a true copy operation, the "copyfrom" field gives the
copyfrom data. The "copyroot" field identifies the root node-revision
of the copy; it may be omitted if the node-rev is its own copy root
(as is the case for node-revs with copy history, and for the root node
of revision 0). Copy roots are identified by revision and
created-path, not by node-rev ID, because a copy root may be a
node-rev which exists later on within the same revision file, meaning
its offset is not yet known.
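
A node-rev in this syntax can be read with a few lines of code. The
sketch below is illustrative only: the function names and the sample
values in the usage are made up, and unrecognized fields are simply
kept in the result rather than interpreted.

```python
def parse_noderev(lines):
    """Read "<name>: <value>" fields up to the terminating blank line."""
    fields = {}
    for line in lines:
        if line == "\n":
            break
        name, sep, value = line.rstrip("\n").partition(": ")
        if not sep:
            raise ValueError("malformed node-rev line: %r" % line)
        fields[name] = value
    return fields


def parse_rep_field(value):
    """Decode a committed "text" or "props" field value."""
    rev, offset, length, size, digest = value.split(" ")
    return {"rev": int(rev), "offset": int(offset),
            "length": int(length), "size": int(size), "md5": digest}
```

A caller would treat a missing "pred", "props", or "text" key as
described above (first node-rev of a node, no properties, no
contents) instead of reporting an error.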

The changed-path data is represented as a series of changed-path
items, each consisting of two lines. The first line has the format
"<id> <action> <text-mod> <prop-mod> <path>\n", where <id> is the
node-rev ID of the new node-rev, <action> is "add", "delete",
"replace", or "modify", <text-mod> and <prop-mod> are "true" or
"false" indicating whether the text and/or properties changed, and
<path> is the changed pathname. For deletes, <id> is the node-rev ID
of the deleted node-rev, and <text-mod> and <prop-mod> are always
"false". The second line has the format "<rev> <path>\n" containing
the node-rev's copyfrom information if it has any; if it does not, the
second line is blank.
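
As a sketch of how the two-line items could be decoded (the function
name, dictionary keys, and sample data are hypothetical), note that the
path is the last field on each line and so may itself contain spaces:

```python
def parse_changed_paths(lines):
    """Decode changed-path items given as a flat list of lines."""
    if len(lines) % 2:
        raise ValueError("changed-path items come in pairs of lines")
    items = []
    for first, second in zip(lines[0::2], lines[1::2]):
        # Split at most four times so spaces in the path survive.
        noderev_id, action, text_mod, prop_mod, path = \
            first.rstrip("\n").split(" ", 4)
        copyfrom = None
        second = second.rstrip("\n")
        if second:
            rev, from_path = second.split(" ", 1)
            copyfrom = (int(rev), from_path)
        items.append({"id": noderev_id, "action": action,
                      "text_mod": text_mod == "true",
                      "prop_mod": prop_mod == "true",
                      "path": path, "copyfrom": copyfrom})
    return items
```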

At the very end of a rev file is a pair of lines containing
"\n<root-offset> <cp-offset>\n", where <root-offset> is the offset of
the root directory node revision and <cp-offset> is the offset of the
changed-path data.

All numbers in the rev file format are unsigned and are represented as
ASCII decimal.
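
Locating the two trailing offsets might look like the sketch below,
which works on the raw bytes of a revision file (or just its tail).
The function name is an assumption for this example.

```python
def parse_trailer(data):
    """Return (root_offset, cp_offset) from the end of a rev file.

    data is a bytes object ending with "\n<root-offset> <cp-offset>\n".
    """
    if not data.endswith(b"\n"):
        raise ValueError("rev file must end with a newline")
    # Drop the final newline, then take everything after the previous one.
    last_line = data[:-1].rsplit(b"\n", 1)[-1]
    root_offset, cp_offset = last_line.split(b" ")
    return int(root_offset), int(cp_offset)
```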

Transaction layout
------------------

A transaction directory has the following layout:

  props                      Transaction props
  next-ids                   Next temporary node-ID and copy-ID
  changes                    Changed-path information so far
  node.<nid>.<cid>           New node-rev data for node
  node.<nid>.<cid>.props     Props for new node-rev, if changed
  node.<nid>.<cid>.children  Directory contents for node-rev

In FS formats 1 and 2, it also contains:

  rev                        Prototype rev file with new text reps
  rev-lock                   Lockfile for writing to the above

In newer formats, these files are in the txn-protorevs/ directory.

The prototype rev file is used to store the text representations as
they are received from the client. To ensure that only one client is
writing to the file at a given time, the "rev-lock" file is locked for
the duration of each write.

Both kinds of props files are in hash dump format. The "props"
file will always be present. The "node.<nid>.<cid>.props" file will
only be present if the node-rev properties have been changed.

The "next-ids" file contains a single line "<next-temp-node-id>
<next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID
assignments (without the leading underscores).

The "children" file for a node-revision begins with a copy of the hash
dump representation of the directory entries from the old node-rev (or
a dump of the empty hash for new directories), and then an incremental
hash dump entry for each change made to the directory.

The "changes" file contains changed-path entries in the same form as
the changed-path entries in a rev file, except that <id> and <action>
may both be "reset" (in which case <text-mod> and <prop-mod> are both
always "false") to indicate that all changes to a path should be
considered undone. Reset entries are only used during the final merge
phase of a transaction.

The node-rev files have the same format as node-revs in a revision
file, except that the "text" and "props" fields are augmented as
follows:

  * The "props" field may have the value "-1" if properties have
    been changed and are contained in the "node.<nid>.<cid>.props"
    file in the transaction directory.

  * For directory node-revs, the "text" field may have the value
    "-1" if entries have been changed and are contained in the
    "node.<nid>.<cid>.children" file in the transaction directory.

  * For the directory node-rev representing the root of the
    transaction, the "is-fresh-txn-root" field indicates that it has
    not been made mutable yet (see Issue #2608).

  * For file node-revs, the "text" field may have the value "-1
    <offset> <length> <size> <digest>" if the text representation is
    within the prototype rev file.

  * The "copyroot" field may have the value "-1 <created-path>" if the
    copy root of the node-rev is part of the transaction in process.

Lock files layout
-----------------

Locks in FSFS are stored in serialized hash format in files whose
names are MD5 digests of the FS path which the lock is associated
with. For the purposes of keeping directory inode usage down, these
digest files live in subdirectories of the main lock directory whose
names are the first 3 characters of the digest filename.

Also stored in the digest file for a given FS path are pointers to
other digest files which contain information associated with other FS
paths that are our path's immediate children.

To answer the question, "Does path FOO have a lock associated with
it?", one need only generate the MD5 digest of FOO's
absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look
for a file located like so:

   /path/to/repos/db/locks/3b1/3b1b011fed614a263986b5c4869604e8

And then see if that file contains lock information.

To inquire about locks on children of the path FOO, you would
reference the same path as above, but look for a list of children in
that file (instead of lock information). Children are listed as MD5
digests, too, so you would simply iterate over those digests and
consult the files they reference, and so on, recursively.
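
Computing the location of a digest file can be sketched as follows.
The function name is hypothetical, and encoding the FS path as UTF-8
before hashing is an assumption of this example.

```python
import hashlib
import posixpath


def lock_digest_path(fs_dir, fs_path):
    """Return the digest file path for fs_path under the FS directory.

    fs_dir is the FS directory (the "db" subdirectory of the
    repository); fs_path is the absolute-in-the-FS path.
    """
    digest = hashlib.md5(fs_path.encode("utf-8")).hexdigest()
    # The subdirectory is named for the first 3 characters of the digest.
    return posixpath.join(fs_dir, "locks", digest[:3], digest)
```

Checking for a lock on a path then reduces to testing whether this
file exists and contains lock information; walking the children
repeats the same lookup for each digest listed in the file.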