This file describes the design, layouts, and file formats of a
libsvn_fs_fs repository.

Design
------

In FSFS, each committed revision is represented as an immutable file
containing the new node-revisions, contents, and changed-path
information for the revision, plus a second, changeable file
containing the revision properties.

In contrast to the BDB back end, the contents of recent revisions of
files are stored as deltas against earlier revisions, instead of the
other way around. This is less efficient for common-case checkouts,
but brings greater simplicity and robustness, as well as the
flexibility to make commits work without write access to existing
revisions. Skip-deltas and delta combination mitigate the checkout
cost.

In-progress transactions are represented with a prototype rev file
containing only the new text representations of files (appended to as
changed file contents come in), along with a separate file for each
node-revision, directory representation, or property representation
which has been changed or added in the transaction. During the final
stage of the commit, these separate files are marshalled onto the end
of the prototype rev file to form the immutable revision file.

Layout of the FS directory
--------------------------

The layout of the FS directory (the "db" subdirectory of the
repository) is:

  revs/               Subdirectory containing revs
    <shard>/          Shard directory, if sharding is in use (see below)
      <revnum>        File containing rev <revnum>
  revprops/           Subdirectory containing rev-props
    <shard>/          Shard directory, if sharding is in use (see below)
      <revnum>        File containing rev-props for <revnum>
  transactions/       Subdirectory containing transactions
    <txnid>.txn/      Directory containing transaction <txnid>
  txn-protorevs/      Subdirectory containing transaction proto-revision files
    <txnid>.rev       Proto-revision file for transaction <txnid>
    <txnid>.rev-lock  Write lock for proto-rev file
  txn-current         File containing the next transaction key
  locks/              Subdirectory containing locks
    <partial-digest>/ Subdirectory named for first 3 letters of an MD5 digest
      <digest>        File containing locks/children for path with <digest>
  node-origins/       Lazy cache of origin noderevs for nodes
    <partial-nodeid>  File containing noderev ID of origins of nodes
  current             File specifying current revision and next node/copy id
  fs-type             File identifying this filesystem as an FSFS filesystem
  write-lock          Empty file, locked to serialise writers
  txn-current-lock    Empty file, locked to serialise 'txn-current'
  uuid                File containing the UUID of the repository
  format              File containing the format number of this filesystem

Files in the revprops directory are in hash dump format.

The format of the "current" file is:

 * Format 3 and above: a single line of the form
   "<youngest-revision>\n" giving the youngest revision for the
   filesystem.

 * Format 2 and below: a single line of the form "<youngest-revision>
   <next-node-id> <next-copy-id>\n" giving the youngest revision, the
   next unique node-ID, and the next unique copy-ID for the
   filesystem.

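The two shapes of the "current" file can be told apart by the
filesystem format number. The following is only a sketch: the function
name and the returned dictionary keys are hypothetical, and it assumes
the revision number is written in decimal while the node-ID and copy-ID
counters are kept as opaque strings.

```python
def parse_current(line, fs_format):
    """Parse the single line of the "current" file (illustrative only)."""
    fields = line.rstrip("\n").split(" ")
    if fs_format >= 3:
        # Format 3 and above: "<youngest-revision>\n"
        if len(fields) != 1:
            raise ValueError("expected exactly one field")
        return {"youngest": int(fields[0])}
    # Format 2 and below: "<youngest-revision> <next-node-id> <next-copy-id>\n"
    if len(fields) != 3:
        raise ValueError("expected exactly three fields")
    youngest, next_node_id, next_copy_id = fields
    return {
        "youngest": int(youngest),
        "next_node_id": next_node_id,   # kept as a string, not interpreted
        "next_copy_id": next_copy_id,
    }
```
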
The "write-lock" file is an empty file which is locked before the
final stage of a commit and unlocked after the new "current" file has
been moved into place to indicate that a new revision is present. It
is also locked during a revprop propchange while the revprop file is
read in, mutated, and written out again. Note that readers are never
blocked by any operation - writers must ensure that the filesystem is
always in a consistent state.

The "txn-current" file is a file with a single line of text that
contains only a base-36 number. The current value will be used in the
next transaction name, along with the revision number the transaction
is based on. This sequence number ensures that transaction names are
not reused, even if the transaction is aborted and a new transaction
based on the same revision is begun. The only operation that FSFS
performs on this file is "get and increment"; the "txn-current-lock"
file is locked during this operation.

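As an illustration of "get and increment", the sketch below works on
the text of the "txn-current" file rather than on the file itself, so
the locking of "txn-current-lock" is deliberately left out. The helper
names are hypothetical, and the use of lower-case letters for the
base-36 digits is an assumption.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"  # assumed digit alphabet


def to_base36(n):
    """Encode a non-negative integer as a base-36 string."""
    if n == 0:
        return "0"
    out = ""
    while n:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out


def get_and_increment(content):
    """Return (current value, new file content) for "txn-current"."""
    current = content.strip()
    following = to_base36(int(current, 36) + 1)
    return current, following + "\n"
```

A real implementation would hold the lock on "txn-current-lock" from
before reading the old content until after the new content is written.
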
Filesystem formats
------------------

The "format" file defines what features are permitted within the
filesystem, and indicates changes that are not backward-compatible.
It serves the same purpose as the repository file of the same name.

The filesystem format file was introduced in Subversion 1.2, and so
will not be present if the repository was created with an older
version of Subversion. An absent format file should be interpreted as
indicating a format 1 filesystem.

The format file is a single line of the form "<format number>\n",
followed by any number of lines specifying 'format options' -
additional information about the filesystem's format. Each format
option line is of the form "<option>\n" or "<option> <parameters>\n".

Clients should raise an error if they encounter an option not
permitted by the format number in use.

The format numbers have the following meanings:

Format 1: (understood by Subversion 1.1+)
  Delta representations in revision files must contain only svndiff0
  data. No format options are permitted. No mechanism is provided to
  prevent transaction name reuse. Proto-rev and its lock are stored
  in transactions/<txnid>/rev and transactions/<txnid>/rev-lock.
  Node-IDs and copy-IDs use the "current" file.

Format 2: (understood by Subversion 1.4+)
  Delta representations in revision files may contain either svndiff0
  or svndiff1 data. No format options are permitted. No mechanism is
  provided to prevent transaction name reuse. Proto-rev and its lock
  are stored in transactions/<txnid>/rev and
  transactions/<txnid>/rev-lock. Node-IDs and copy-IDs use the
  "current" file.

Format 3: (understood by Subversion 1.5+)
  Delta representations in revision files may contain either svndiff0
  or svndiff1 data. The 'layout' format option is permitted. To
  prevent transaction name reuse, transaction names should be
  generated using the transaction sequence number stored in the
  txn-current file. Proto-rev and its lock are stored in
  txn-protorevs/<txnid>.rev and txn-protorevs/<txnid>.rev-lock.
  Node-IDs and copy-IDs do not use the "current" file. minfo-here and
  minfo-cnt node-revision fields are maintained.

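The rules for reading the format file can be sketched as follows. This
is a hedged illustration, not the library's parser: the function name
is hypothetical, an absent file is modelled by passing None, and the
table of permitted options simply restates the format descriptions
above.

```python
def parse_format(text):
    """Return (format number, options) from the format file's text."""
    if text is None:
        # An absent format file means a format 1 filesystem.
        return 1, {}
    lines = [ln for ln in text.splitlines() if ln]
    number = int(lines[0])
    options = {}
    for line in lines[1:]:
        # "<option>" or "<option> <parameters>"
        name, _, params = line.partition(" ")
        options[name] = params
    # Only format 3 (and above) permits the 'layout' option.
    permitted = {"layout"} if number >= 3 else set()
    unknown = set(options) - permitted
    if unknown:
        raise ValueError("option(s) not permitted: " + ", ".join(sorted(unknown)))
    return number, options
```
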
Filesystem format options
-------------------------

Currently, the only recognised format option is "layout", which
specifies the paths that will be used to store the revision files and
revision property files.

The "layout" option is followed by the name of the filesystem layout
and any required parameters. The default layout, if no "layout"
keyword is specified, is the 'linear' layout.

The known layouts, and the parameters they require, are as follows:

"linear"
  Revision files and rev-prop files are named after the revision they
  represent, and are placed directly in the revs/ and revprops/
  directories. r1234 will be represented by the revision file
  revs/1234 and the rev-prop file revprops/1234.

"sharded <max-files-per-directory>"
  Revision files and rev-prop files are named after the revision they
  represent, and are placed in a subdirectory of the revs/ and
  revprops/ directories named according to the 'shard' they belong to.

  Shards are numbered from zero and contain between one and the
  maximum number of files per directory specified in the layout's
  parameters.

  For the "sharded 1000" layout, r1234 will be represented by the
  revision file revs/1/1234 and rev-prop file revprops/1/1234. The
  revs/0/ directory will contain revisions 0-999, revs/1/ will contain
  1000-1999, and so on.

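The mapping from a revision number to its files under the two layouts
can be written down directly. The function below is a small sketch
with a hypothetical name; it returns paths relative to the FS
directory, and passing None for the shard size stands for the linear
layout.

```python
def rev_paths(rev, max_files_per_dir=None):
    """Return (revision file, rev-prop file) paths for a revision."""
    if max_files_per_dir is None:
        # "linear" layout: files sit directly in revs/ and revprops/.
        return f"revs/{rev}", f"revprops/{rev}"
    # "sharded <n>" layout: shards are numbered from zero.
    shard = rev // max_files_per_dir
    return f"revs/{shard}/{rev}", f"revprops/{shard}/{rev}"
```

For example, rev_paths(1234, 1000) gives revs/1/1234 and
revprops/1/1234, matching the "sharded 1000" case described above.
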
Node-revision IDs
-----------------

In order to support efficient lookup of node-revisions by their IDs
and to simplify the allocation of fresh node-IDs during a transaction,
we treat the fields of a node-ID in new and interesting ways.

Within a revision file, node-revs have a txn-id field of the form
"r<rev>/<offset>", to support easy lookup. New node-revision IDs
assigned within a transaction have the txn-id field of "t<txnid>".

When a new node-id or copy-id is assigned in a transaction, the ID
used is a "_" followed by a base36 number unique to the transaction.
During the final phase of a commit, node-revision IDs are rewritten
to have repository-wide unique node-ID and copy-ID fields, and to have
"r<rev>/<offset>" txn-id fields.

In Format 3 and above, this uniqueness is done by changing a temporary
id of "_<base36>" to "<rev>-<base36>". Note that this means that the
originating revision of a line of history or a copy can be determined
by looking at the node ID.

In Format 2 and below, the "current" file contains global base36
node-ID and copy-ID counters; during the commit, the counter value is
added to the transaction-specific base36 ID, and the value in
"current" is adjusted.

(It is legal for Format 3 repositories to contain Format 2-style IDs;
this just prevents I/O-less node-origin-rev lookup for those nodes.)

The temporary assignment of node-ID and copy-ID fields has
implications for svn_fs_compare_ids and svn_fs_check_related. The ID
_1.0.t1 is not related to the ID _1.0.t2 even though they have the
same node-ID, because temporary node-IDs are restricted in scope to
the transactions they belong to.

There is a lazily created cache mapping from node-IDs to the full
node-revision ID where they are created. This is in the node-origins
directory; the file name is the node-ID without its last character (or
"0" for single-character node IDs) and the contents is a serialized
hash mapping from node-ID to node-revision ID. This cache is only
used for node-IDs of the pre-Format 3 style.

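The naming rule for node-origins cache files is short enough to state
as code. The function name is hypothetical and the sketch covers only
the file name, not the serialized hash stored inside the file.

```python
def node_origins_file(node_id):
    """Return the node-origins cache file name for a node-ID."""
    if len(node_id) > 1:
        # Drop the last character, so IDs sharing a prefix share a file.
        return node_id[:-1]
    # Single-character node-IDs all live in the file named "0".
    return "0"
```
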
Copy-IDs and copy roots
-----------------------

Copy-IDs are assigned in the same manner as they are in the BDB
implementation:

  * A node-rev resulting from a creation operation (with no copy
    history) receives the copy-ID of its parent directory.

  * A node-rev resulting from a copy operation receives a fresh
    copy-ID, as one would expect.

  * A node-rev resulting from a modification operation receives a
    copy-ID depending on whether its predecessor derives from a
    copy operation or whether it derives from a creation operation
    with no intervening copies:

    - If the predecessor does not derive from a copy, the new
      node-rev receives the copy-ID of its parent directory. If the
      node-rev is being modified through its created-path, this will
      be the same copy-ID as the predecessor node-rev has; however,
      if the node-rev is being modified through a copied ancestor
      directory (i.e. we are performing a "lazy copy"), this will be
      a different copy-ID.

    - If the predecessor derives from a copy and the node-rev is
      being modified through its created-path, the new node-rev
      receives the copy-ID of the predecessor.

    - If the predecessor derives from a copy and the node-rev is not
      being modified through its created path, the new node-rev
      receives a fresh copy-ID. This is called a "soft copy"
      operation, as distinct from a "true copy" operation which was
      actually requested through the svn_fs interface. Soft copies
      exist to ensure that the same <node-ID,copy-ID> pair is not
      used twice within a transaction.

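The three modification cases above amount to a small decision
function. The sketch below is only a restatement of those rules; the
function and parameter names are hypothetical, and "fresh" stands for
whatever allocates a new copy-ID in the transaction.

```python
def copy_id_for_modification(pred_from_copy, via_created_path,
                             pred_copy_id, parent_copy_id, fresh):
    """Choose the copy-ID for a node-rev produced by a modification."""
    if not pred_from_copy:
        # Predecessor has no copy history: inherit from the parent directory.
        return parent_copy_id
    if via_created_path:
        # Modified in place: keep the predecessor's copy-ID.
        return pred_copy_id
    # Reached through a copied ancestor: a "soft copy" gets a fresh copy-ID.
    return fresh()
```
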
Unlike the BDB implementation, we do not have a "copies" table.
Instead, each node-revision record contains a "copyroot" field
identifying the node-rev resulting from the true copy operation most
proximal to the node-rev. If the node-rev does not itself derive from
a copy operation, then the copyroot field identifies the copy of an
ancestor directory; if no ancestor directories derive from a copy
operation, then the copyroot field identifies the root directory of
revision 0.

Revision file format
--------------------

A revision file contains a concatenation of various kinds of data:

  * Text and property representations
  * Node-revisions
  * The changed-path data
  * Two offsets at the very end

A representation begins with a line containing either "PLAIN\n" or
"DELTA\n" or "DELTA <rev> <offset> <length>\n", where <rev>, <offset>,
and <length> give the location of the delta base of the representation
and the amount of data it contains (not counting the header or
trailer). If no base location is given for a delta, the base is the
empty stream. After the initial line comes raw svndiff data, followed
by a cosmetic trailer "ENDREP\n".

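The three possible header lines of a representation can be recognised
as follows. This is a sketch with a hypothetical function name and
return shape, and it assumes the numeric fields are written in
decimal.

```python
def parse_rep_header(line):
    """Classify the first line of a representation."""
    fields = line.rstrip("\n").split(" ")
    if fields == ["PLAIN"]:
        return {"kind": "plain"}
    if fields == ["DELTA"]:
        # No base location given: the delta is against the empty stream.
        return {"kind": "delta", "base": None}
    if len(fields) == 4 and fields[0] == "DELTA":
        rev, offset, length = (int(f) for f in fields[1:])  # assumed decimal
        return {"kind": "delta",
                "base": {"rev": rev, "offset": offset, "length": length}}
    raise ValueError("not a representation header: %r" % line)
```
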
If a representation is for the text contents of a directory node,
the expanded contents are in hash dump format mapping entry names to
"<type> <id>" pairs, where <type> is "file" or "dir" and <id> gives
the ID of the child node-rev.

If a representation is for a property list, the expanded contents are
in the form of a dumped hash map mapping property names to property
values.

The marshalling syntax for node-revs is a series of fields terminated
by a blank line. Fields have the syntax "<name>: <value>\n", where
<name> is a symbolic field name (each symbolic name is used only once
in a given node-rev) and <value> is the value data. Unrecognized
fields are ignored, for extensibility. The following fields are
defined:

  id          The ID of the node-rev
  pred        The ID of the predecessor node-rev
  count       Count of node-revs since the base of the node
  text        "<rev> <offset> <length> <size> <digest>" for text rep
  props       "<rev> <offset> <length> <size> <digest>" for props rep
              <rev> and <offset> give location of rep
              <length> gives length of rep, sans header and trailer
              <size> gives size of expanded rep
              <digest> gives hex MD5 digest of expanded rep
  cpath       FS pathname node was created at
  copyfrom    "<rev> <path>" of copyfrom data
  copyroot    "<rev> <created-path>" of the root of this copy
  minfo-cnt   The number of nodes under (and including) this node
              which have svn:mergeinfo.
  minfo-here  Exists if this node itself has svn:mergeinfo.

The predecessor of a node-rev crosses both soft and true copies;
together with the count field, it allows efficient determination of
the base for skip-deltas. The first node-rev of a node contains no
"pred" field. A node-revision with no properties may omit the "props"
field. A node-revision with no contents (a zero-length file or an
empty directory) may omit the "text" field. In a node-revision
resulting from a true copy operation, the "copyfrom" field gives the
copyfrom data. The "copyroot" field identifies the root node-revision
of the copy; it may be omitted if the node-rev is its own copy root
(as is the case for node-revs with copy history, and for the root node
of revision 0). Copy roots are identified by revision and
created-path, not by node-rev ID, because a copy root may be a
node-rev which exists later on within the same revision file, meaning
its offset is not yet known.

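A reader for the node-rev syntax described above might look like the
following sketch. The function names are hypothetical, the sample
values in the usage note are made up, and the numeric parts of the
"text" and "props" fields are assumed to be decimal.

```python
def parse_node_rev(text):
    """Read "<name>: <value>" fields up to the terminating blank line."""
    fields = {}
    for line in text.split("\n"):
        if line == "":
            break  # a blank line ends the node-rev
        name, sep, value = line.partition(": ")
        if not sep:
            raise ValueError("malformed field line: %r" % line)
        fields[name] = value  # unrecognized names are kept, callers ignore them
    return fields


def parse_rep_field(value):
    """Split a "text" or "props" value into its five parts."""
    rev, offset, length, size, digest = value.split(" ")
    return int(rev), int(offset), int(length), int(size), digest
```

Because optional fields such as "pred", "props", "text", and
"copyroot" may be absent, callers would look them up with
fields.get(name) rather than indexing.
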
The changed-path data is represented as a series of changed-path
items, each consisting of two lines. The first line has the format
"<id> <action> <text-mod> <prop-mod> <path>\n", where <id> is the
node-rev ID of the new node-rev, <action> is "add", "delete",
"replace", or "modify", <text-mod> and <prop-mod> are "true" or
"false" indicating whether the text and/or properties changed, and
<path> is the changed pathname. For deletes, <id> is the node-rev ID
of the deleted node-rev, and <text-mod> and <prop-mod> are always
"false". The second line has the format "<rev> <path>\n" containing
the node-rev's copyfrom information if it has any; if it does not, the
second line is blank.

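The two-line changed-path items can be decoded pairwise. The sketch
below uses a hypothetical function name and dictionary keys, takes the
lines without their trailing newlines, and limits the number of splits
so that a path containing spaces stays intact; that a path may contain
spaces is an assumption of the example.

```python
def parse_changed_paths(lines):
    """Decode changed-path items from a list of lines (two per item)."""
    items = []
    for first, second in zip(lines[0::2], lines[1::2]):
        # "<id> <action> <text-mod> <prop-mod> <path>"
        node_id, action, text_mod, prop_mod, path = first.split(" ", 4)
        copyfrom = None
        if second:
            # "<rev> <path>" when the node-rev has copyfrom information
            rev, copy_path = second.split(" ", 1)
            copyfrom = (int(rev), copy_path)
        items.append({
            "id": node_id,
            "action": action,
            "text_mod": text_mod == "true",
            "prop_mod": prop_mod == "true",
            "path": path,
            "copyfrom": copyfrom,
        })
    return items
```
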
At the very end of a rev file is a pair of lines containing
"\n<root-offset> <cp-offset>\n", where <root-offset> is the offset of
the root directory node revision and <cp-offset> is the offset of the
changed-path data.

All numbers in the rev file format are unsigned.

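Locating the root node-rev and the changed-path data starts from the
final line of the file. The following sketch takes the whole file as
bytes for simplicity (a real reader would more likely seek near the
end); the function name is hypothetical and the offsets are assumed to
be written in decimal.

```python
def parse_trailer(data):
    """Return (root-offset, cp-offset) from the end of a rev file."""
    if not data.endswith(b"\n"):
        raise ValueError("rev file does not end with a newline")
    # Drop the final newline, then take everything after the previous one.
    last_line = data[:-1].rsplit(b"\n", 1)[-1]
    root_offset, cp_offset = last_line.split(b" ")
    return int(root_offset), int(cp_offset)  # assumed decimal
```
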
Transaction layout
------------------

A transaction directory has the following layout:

  props                      Transaction props
  next-ids                   Next temporary node-ID and copy-ID
  changes                    Changed-path information so far
  node.<nid>.<cid>           New node-rev data for node
  node.<nid>.<cid>.props     Props for new node-rev, if changed
  node.<nid>.<cid>.children  Directory contents for node-rev

In FS formats 1 and 2, it also contains:

  rev                        Prototype rev file with new text reps
  rev-lock                   Lockfile for writing to the above

In newer formats, these files are in the txn-protorevs/ directory.

The prototype rev file is used to store the text representations as
they are received from the client. To ensure that only one client is
writing to the file at a given time, the "rev-lock" file is locked for
the duration of each write.

The two kinds of props files are both in hash dump format. The "props"
file will always be present. The "node.<nid>.<cid>.props" file will
only be present if the node-rev properties have been changed.

The "next-ids" file contains a single line "<next-temp-node-id>
<next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID
assignments (without the leading underscores).

The "children" file for a node-revision begins with a copy of the hash
dump representation of the directory entries from the old node-rev (or
a dump of the empty hash for new directories), and then an incremental
hash dump entry for each change made to the directory.

The "changes" file contains changed-path entries in the same form as
the changed-path entries in a rev file, except that <id> and <action>
may both be "reset" (in which case <text-mod> and <prop-mod> are both
always "false") to indicate that all changes to a path should be
considered undone. Reset entries are only used during the final merge
phase of a transaction.

The node-rev files have the same format as node-revs in a revision
file, except that the "text" and "props" fields are augmented as
follows:

  * The "props" field may have the value "-1" if properties have
    been changed and are contained in the "node.<nid>.<cid>.props"
    file in the transaction directory.

  * For directory node-revs, the "text" field may have the value
    "-1" if entries have been changed and are contained in the
    "node.<nid>.<cid>.children" file in the transaction directory.

  * For the directory node-rev representing the root of the
    transaction, the "is-fresh-txn-root" field indicates that it has
    not been made mutable yet (see Issue #2608).

  * For file node-revs, the "text" field may have the value "-1
    <offset> <length> <size> <digest>" if the text representation is
    within the prototype rev file.

  * The "copyroot" field may have the value "-1 <created-path>" if the
    copy root of the node-rev is part of the transaction in process.

Locks layout
------------

Locks in FSFS are stored in serialized hash format in files whose
names are MD5 digests of the FS path which the lock is associated
with. For the purposes of keeping directory inode usage down, these
digest files live in subdirectories of the main lock directory whose
names are the first 3 characters of the digest filename.

Also stored in the digest file for a given FS path are pointers to
other digest files which contain information associated with other FS
paths that are our path's immediate children.

To answer the question, "Does path FOO have a lock associated with
it?", one need only generate the MD5 digest of FOO's
absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look
for a file located like so:

   /path/to/repos/locks/3b1/3b1b011fed614a263986b5c4869604e8

And then see if that file contains lock information.

To inquire about locks on children of the path FOO, you would
reference the same path as above, but look for a list of children in
that file (instead of lock information). Children are listed as MD5
digests, too, so you would simply iterate over those digests and
consult the files they reference, and so on, recursively.

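Computing where the digest file for a path lives is a matter of
hashing and slicing. In the sketch below the function name and the
locks_dir parameter are hypothetical, and encoding the FS path as
UTF-8 before hashing is an assumption.

```python
import hashlib
import posixpath


def lock_digest_path(locks_dir, fs_path):
    """Return the digest file location for an absolute-in-the-FS path."""
    digest = hashlib.md5(fs_path.encode("utf-8")).hexdigest()
    # The subdirectory is named for the first 3 characters of the digest.
    return posixpath.join(locks_dir, digest[:3], digest)
```

Checking for a lock on a path then means opening the file at
lock_digest_path(locks_dir, path), if it exists, and looking for lock
information in it; the child digests listed there can be fed through
the same directory scheme to walk the subtree.
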