tdb - a trivial database system
tridge@linuxcare.com December 1999
==================================

This is a simple database API. It was inspired by the realisation that
in Samba we have several ad-hoc bits of code that essentially
implement small databases for sharing structures between parts of
Samba. As I was about to add another I realised that a generic
database module was called for to replace all the ad-hoc bits.

I based the interface on gdbm. I couldn't use gdbm as we need to be
able to have multiple writers to the databases at one time.

Compile-time options:

   add HAVE_MMAP=1 to use mmap instead of read/write
   add NOLOCK=1 to disable locking code

Compile tdbtest.c and link with gdbm for testing. tdbtest will perform
identical operations via tdb and gdbm, then make sure the result is the
same.

Also included is tdbtool, which allows simple database manipulation
on the command line.

tdbtest and tdbtool are not built as part of Samba, but are included
for completeness.

The interface is very similar to gdbm except for the following:

- different open interface. The tdb_open call is more similar to a
  traditional open()
- no equivalent of gdbm_reorganize()
- no equivalent of gdbm_sync(). No operations are cached in the library
  anyway
- added a tdb_traverse() function for traversing the whole database
- added transaction support

A general rule for using tdb is that the caller frees any returned
TDB_DATA structures. Just call free(p.dptr) to free a TDB_DATA
return value called p. This is the same as gdbm.
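
The ownership rule can be sketched without tdb itself. The following is
a minimal illustration, not tdb code: struct example_data is a stand-in
with a dptr field plus a length (assumed here to be called dsize), and
example_fetch() is a hypothetical helper standing in for any call that
returns allocated data. It also assumes the returned bytes are not
NUL-terminated, so they are copied before being used as a C string.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in shaped like TDB_DATA: a pointer plus a length.
 * Hypothetical type; real code would use TDB_DATA from tdb.h. */
struct example_data {
    unsigned char *dptr;
    size_t dsize;
};

/* Hypothetical helper playing the role of a call that returns newly
 * allocated data. On allocation failure dptr is NULL. */
struct example_data example_fetch(const char *stored)
{
    struct example_data out = { NULL, 0 };
    size_t len = strlen(stored);

    out.dptr = malloc(len + 1);   /* +1 only so that len == 0 still allocates */
    if (out.dptr != NULL) {
        memcpy(out.dptr, stored, len);
        out.dsize = len;
    }
    return out;
}

/* Turn a returned value into a NUL-terminated string and free the
 * returned buffer, as the ownership rule requires.
 * Returns NULL if p.dptr is NULL or the copy cannot be allocated. */
char *example_take_string(struct example_data p)
{
    char *s;

    if (p.dptr == NULL) {
        return NULL;
    }
    s = malloc(p.dsize + 1);
    if (s != NULL) {
        memcpy(s, p.dptr, p.dsize);
        s[p.dsize] = '\0';
    }
    free(p.dptr);                 /* the caller owns p.dptr, so release it */
    return s;
}
```

The string returned by example_take_string() is a separate allocation,
so its caller frees that one too.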

Here is a full list of tdb functions with brief descriptions.

----------------------------------------------------------------------
TDB_CONTEXT *tdb_open(char *name, int hash_size, int tdb_flags,
                      int open_flags, mode_t mode)

   open the database, creating it if necessary

   The open_flags and mode are passed straight to the open call on the
   database file. An open_flags value of O_WRONLY is invalid.

   The hash size is advisory, use zero for a default value.

   return is NULL on error

   possible tdb_flags are:
    TDB_CLEAR_IF_FIRST - clear database if we are the only one with it open
    TDB_INTERNAL - don't use a file, instead store the data in
                   memory. The filename is ignored in this case.
    TDB_NOLOCK - don't do any locking
    TDB_NOMMAP - don't use mmap
    TDB_NOSYNC - don't synchronise transactions to disk
    TDB_SEQNUM - maintain a sequence number
    TDB_VOLATILE - activate the per-hashchain freelist, default 5
    TDB_ALLOW_NESTING - allow transactions to nest
    TDB_DISALLOW_NESTING - do not allow transactions to nest
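
   The O_WRONLY restriction can be checked before calling tdb_open().
   A minimal sketch, assuming the usual POSIX O_ACCMODE mask;
   example_open_flags_ok() is a hypothetical helper, not part of tdb.
   The likely reason for the restriction is that tdb has to read the
   file as well as write it, though that is an inference and not
   something stated here.

```c
#include <fcntl.h>

/* Hypothetical pre-check for the open_flags argument.
 * Returns 1 if the access mode is acceptable, 0 if it is O_WRONLY.
 * Only the access-mode bits are inspected; O_CREAT and friends are
 * ignored by the mask. */
int example_open_flags_ok(int open_flags)
{
    return (open_flags & O_ACCMODE) != O_WRONLY;
}
```

   With this check, O_RDWR | O_CREAT and O_RDONLY pass, while
   O_WRONLY | O_CREAT is rejected.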

----------------------------------------------------------------------
TDB_CONTEXT *tdb_open_ex(char *name, int hash_size, int tdb_flags,
                         int open_flags, mode_t mode,
                         const struct tdb_logging_context *log_ctx,
                         tdb_hash_func hash_fn)

   This is like tdb_open(), but allows you to pass an initial logging
   context and hash function. Be careful when passing a hash function -
   all users of the database must use the same hash function or you
   will get data corruption.

----------------------------------------------------------------------
char *tdb_error(TDB_CONTEXT *tdb);

   return an error string for the last tdb error

----------------------------------------------------------------------
int tdb_close(TDB_CONTEXT *tdb);

   close a database

----------------------------------------------------------------------
TDB_DATA tdb_fetch(TDB_CONTEXT *tdb, TDB_DATA key);

   fetch an entry in the database given a key
   if the return value has a null dptr then an error occurred

   caller must free the resulting data

----------------------------------------------------------------------
int tdb_parse_record(struct tdb_context *tdb, TDB_DATA key,
                     int (*parser)(TDB_DATA key, TDB_DATA data,
                                   void *private_data),
                     void *private_data);

   Hand a record to a parser function without allocating it.

   This function is meant as a fast tdb_fetch alternative for large records
   that are frequently read. The "key" and "data" arguments point directly
   into the tdb shared memory; they are not aligned at any boundary.

   WARNING: The parser is called while tdb holds a lock on the record. DO NOT
   call other tdb routines from within the parser. Also, for good performance
   you should make the parser fast to allow parallel operations.

   tdb_parse_record returns -1 if the record was not found. If the record was
   found, the return value of "parser" is passed up to the caller.
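
   Because the data handed to the parser is not aligned, casting dptr
   to a wider type and dereferencing it is unsafe on some platforms.
   One common way around this, sketched below, is to copy fixed-size
   fields out with memcpy(). example_read_u32() is a hypothetical
   helper that works on a plain byte buffer; it is not part of tdb, and
   it assumes the value was stored in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value stored at an arbitrary byte offset in buf.
 * memcpy() is used instead of a pointer cast because buf + offset
 * may not be suitably aligned for uint32_t.
 * Returns 0 on success, -1 if fewer than 4 bytes remain at offset. */
int example_read_u32(const unsigned char *buf, size_t len,
                     size_t offset, uint32_t *out)
{
    if (offset > len || len - offset < sizeof(*out)) {
        return -1;
    }
    memcpy(out, buf + offset, sizeof(*out));
    return 0;
}
```

   Inside a parser callback this would be called with data.dptr and
   the record length, which keeps the callback short, as the warning
   above asks.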

----------------------------------------------------------------------
int tdb_exists(TDB_CONTEXT *tdb, TDB_DATA key);

   check if an entry in the database exists

   note that 1 is returned if the key is found and 0 is returned if not found
   this doesn't match the conventions in the rest of this module, but is
   compatible with gdbm

----------------------------------------------------------------------
int tdb_traverse(TDB_CONTEXT *tdb, int (*fn)(TDB_CONTEXT *tdb,
                 TDB_DATA key, TDB_DATA dbuf, void *state), void *state);

   traverse the entire database - calling fn(tdb, key, data, state) on each
   element.

   return -1 on error or the record count traversed

   if fn is NULL then it is not called

   a non-zero return value from fn() indicates that the traversal
   should stop. Traversal callbacks may not start transactions.

   WARNING: The data buffer given to the callback fn does NOT meet the
   alignment restrictions malloc gives you.
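
   The callback contract (count records, skip a NULL fn, stop on a
   non-zero return) can be modelled over a plain array. This is a
   stand-in written for illustration, not tdb's implementation: the
   example_* names are hypothetical, the TDB_CONTEXT argument is left
   out, and counting the record on which fn stops is an assumption of
   the sketch.

```c
#include <stddef.h>

/* Stand-in shaped like TDB_DATA (hypothetical type). */
struct example_data {
    unsigned char *dptr;
    size_t dsize;
};

/* Callback type shaped like the fn argument of tdb_traverse(),
 * without the TDB_CONTEXT pointer. */
typedef int (*example_traverse_fn)(struct example_data key,
                                   struct example_data dbuf, void *state);

/* Visit n records held in two parallel arrays.
 * Returns the number of records visited. */
int example_traverse(const struct example_data *keys,
                     const struct example_data *vals, size_t n,
                     example_traverse_fn fn, void *state)
{
    int count = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        count++;
        if (fn != NULL && fn(keys[i], vals[i], state) != 0) {
            break;              /* non-zero return stops the traversal */
        }
    }
    return count;
}

/* Example callback state: running total and a limit to stop at. */
struct example_sum {
    size_t total;
    size_t limit;
};

/* Add up data sizes; ask to stop once the limit is reached. */
int example_sum_sizes(struct example_data key, struct example_data dbuf,
                      void *state)
{
    struct example_sum *s = state;

    (void)key;                  /* the key is not needed here */
    s->total += dbuf.dsize;
    return s->total >= s->limit;
}
```

   The state pointer is how the callback reports results back to the
   caller without globals.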

----------------------------------------------------------------------
int tdb_traverse_read(TDB_CONTEXT *tdb, int (*fn)(TDB_CONTEXT *tdb,
                      TDB_DATA key, TDB_DATA dbuf, void *state), void *state);

   traverse the entire database - calling fn(tdb, key, data, state) on
   each element, but marking the database read only during the
   traversal, so any write operations will fail. This allows tdb to
   use read locks, which increases the parallelism possible during the
   traversal.

   return -1 on error or the record count traversed

   if fn is NULL then it is not called

   a non-zero return value from fn() indicates that the traversal
   should stop. Traversal callbacks may not start transactions.

----------------------------------------------------------------------
TDB_DATA tdb_firstkey(TDB_CONTEXT *tdb);

   find the first entry in the database and return its key

   the caller must free the returned data

----------------------------------------------------------------------
TDB_DATA tdb_nextkey(TDB_CONTEXT *tdb, TDB_DATA key);

   find the next entry in the database, returning its key

   the caller must free the returned data
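
   When walking keys this way, each key has to stay allocated until it
   has been used to ask for its successor, and only then be freed. The
   sketch below shows that ordering against an in-memory stand-in. The
   example_* functions and the fixed key list are hypothetical, and a
   null dptr marking the end of the walk is an assumption borrowed
   from the gdbm convention.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in shaped like TDB_DATA (hypothetical type). */
struct example_data {
    unsigned char *dptr;
    size_t dsize;
};

static const char *example_keys[] = { "alpha", "beta", "gamma" };
#define EXAMPLE_NKEYS (sizeof(example_keys) / sizeof(example_keys[0]))

/* Return an allocated copy of key i, or null data past the end. */
struct example_data example_key_at(size_t i)
{
    struct example_data k = { NULL, 0 };

    if (i < EXAMPLE_NKEYS) {
        size_t len = strlen(example_keys[i]);
        k.dptr = malloc(len);
        if (k.dptr != NULL) {
            memcpy(k.dptr, example_keys[i], len);
            k.dsize = len;
        }
    }
    return k;
}

/* Stand-ins for the firstkey/nextkey pair. */
struct example_data example_firstkey(void)
{
    return example_key_at(0);
}

struct example_data example_nextkey(struct example_data prev)
{
    size_t i;

    for (i = 0; i < EXAMPLE_NKEYS; i++) {
        if (prev.dsize == strlen(example_keys[i]) &&
            memcmp(prev.dptr, example_keys[i], prev.dsize) == 0) {
            return example_key_at(i + 1);
        }
    }
    return example_key_at(EXAMPLE_NKEYS);   /* unknown key: end of walk */
}

/* Walk all keys. Each key is freed only after the next one has been
 * requested. Returns the number of keys visited. */
size_t example_count_keys(void)
{
    size_t n = 0;
    struct example_data key = example_firstkey();

    while (key.dptr != NULL) {
        struct example_data next = example_nextkey(key);
        n++;
        free(key.dptr);
        key = next;
    }
    return n;
}
```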

----------------------------------------------------------------------
int tdb_delete(TDB_CONTEXT *tdb, TDB_DATA key);

   delete an entry in the database given a key

----------------------------------------------------------------------
int tdb_store(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, int flag);

   store an element in the database, replacing any existing element
   with the same key

   If flag==TDB_INSERT then don't overwrite an existing entry
   If flag==TDB_MODIFY then don't create a new entry

   return 0 on success, -1 on failure
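
   The effect of the flag can be written out as a small decision
   function. This is only a model of the rules stated above, not tdb
   code. The enum and function names are hypothetical, and the
   EXAMPLE_REPLACE value stands for the default replacing behaviour.

```c
/* Hypothetical stand-ins for the store flags. */
enum example_store_flag {
    EXAMPLE_REPLACE,   /* default: overwrite or create */
    EXAMPLE_INSERT,    /* models TDB_INSERT: never overwrite */
    EXAMPLE_MODIFY     /* models TDB_MODIFY: never create */
};

/* Decide whether a store would go ahead.
 * key_exists is non-zero if the key is already in the database.
 * Returns 0 if the store proceeds, -1 if the flag forbids it. */
int example_store_check(int key_exists, enum example_store_flag flag)
{
    if (flag == EXAMPLE_INSERT && key_exists) {
        return -1;
    }
    if (flag == EXAMPLE_MODIFY && !key_exists) {
        return -1;
    }
    return 0;
}
```

   Of the six combinations, only inserting an existing key and
   modifying a missing key are refused.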

----------------------------------------------------------------------
int tdb_writelock(TDB_CONTEXT *tdb);

   lock the database. If we already have it locked then don't do anything

----------------------------------------------------------------------
int tdb_writeunlock(TDB_CONTEXT *tdb);

   unlock the database

----------------------------------------------------------------------
int tdb_chainlock(TDB_CONTEXT *tdb, TDB_DATA key);

   lock one hash chain. This is meant to be used to reduce locking
   contention - it cannot guarantee how many records will be locked

----------------------------------------------------------------------
int tdb_chainunlock(TDB_CONTEXT *tdb, TDB_DATA key);

   unlock one hash chain

----------------------------------------------------------------------
int tdb_transaction_start(TDB_CONTEXT *tdb);

   start a transaction. All operations after the transaction start can
   either be committed with tdb_transaction_commit() or cancelled with
   tdb_transaction_cancel().

   If you call tdb_transaction_start() again on the same tdb context
   while a transaction is in progress, then the same transaction
   buffer is re-used. The number of tdb_transaction_{commit,cancel}
   operations must match the number of successful
   tdb_transaction_start() calls.

   Note that transactions are by default disk synchronous, and use a
   recovery area in the database to automatically recover the database
   on the next open if the system crashes during a transaction. You
   can disable the synchronous transaction recovery setup using the
   TDB_NOSYNC flag, which will greatly speed up operations at the risk
   of corrupting your database if the system crashes.

   Operations made within a transaction are not visible to other users
   of the database until a successful commit.
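
   The rule that commits and cancels must match successful starts is
   easy to get wrong when transactions nest across function calls. One
   way to guard against that is caller-side bookkeeping, sketched
   below. The example_* helpers are hypothetical and not part of tdb,
   and the sketch assumes a start call returns 0 on success, in line
   with the 0 / -1 convention used by tdb_store().

```c
/* Caller-side counter for nested transaction calls.
 * Hypothetical helper, not part of tdb. */
struct example_txn_balance {
    int open;   /* successful starts not yet matched by commit/cancel */
};

/* Record the result of a start call; only successful starts count.
 * Assumes 0 means success. */
void example_note_start(struct example_txn_balance *b, int start_result)
{
    if (start_result == 0) {
        b->open++;
    }
}

/* Record one commit or cancel. Returns -1 if there is nothing left
 * to match, which indicates a bug in the caller. */
int example_note_finish(struct example_txn_balance *b)
{
    if (b->open == 0) {
        return -1;
    }
    b->open--;
    return 0;
}

/* Non-zero once every successful start has been matched. */
int example_is_balanced(const struct example_txn_balance *b)
{
    return b->open == 0;
}
```

   A caller would pass the return value of each start call to
   example_note_start() and check example_is_balanced() before closing
   the database.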

----------------------------------------------------------------------
int tdb_transaction_cancel(TDB_CONTEXT *tdb);

   cancel a current transaction, discarding all write and lock
   operations that have been made since the transaction started.

----------------------------------------------------------------------
int tdb_transaction_commit(TDB_CONTEXT *tdb);

   commit a current transaction, updating the database and releasing
   the transaction locks.

----------------------------------------------------------------------
int tdb_transaction_prepare_commit(TDB_CONTEXT *tdb);

   prepare to commit a current transaction, for two-phase commits.
   Once prepared for commit, the only allowed calls are
   tdb_transaction_commit() or tdb_transaction_cancel(). Preparing
   allocates disk space for the pending updates, so a subsequent
   commit should succeed (barring any hardware failures).
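
   A two-phase commit across several databases follows a fixed shape:
   prepare everything, and only commit once every prepare has
   succeeded. The sketch below shows that control flow with generic
   callbacks. All example_* names are hypothetical. With tdb the three
   callbacks would wrap tdb_transaction_prepare_commit(),
   tdb_transaction_commit() and tdb_transaction_cancel(), and each is
   assumed to return 0 on success. Whether a participant whose prepare
   failed still needs a cancel is not stated above, so the sketch
   cancels everyone and ignores the results.

```c
#include <stddef.h>

/* Hypothetical participant in a two-phase commit. */
struct example_participant {
    int (*prepare)(void *ctx);
    int (*commit)(void *ctx);
    int (*cancel)(void *ctx);
    void *ctx;
};

/* Phase one: prepare every participant. If any prepare fails, cancel
 * all participants and return -1. Phase two: commit every participant.
 * Returns 0 only if every commit succeeded. */
int example_two_phase(struct example_participant *p, size_t n)
{
    size_t i;
    int ret = 0;

    for (i = 0; i < n; i++) {
        if (p[i].prepare(p[i].ctx) != 0) {
            size_t j;
            for (j = 0; j < n; j++) {
                p[j].cancel(p[j].ctx);
            }
            return -1;
        }
    }
    for (i = 0; i < n; i++) {
        if (p[i].commit(p[i].ctx) != 0) {
            ret = -1;   /* should be rare once prepare has succeeded */
        }
    }
    return ret;
}

/* Mock participant, present only to exercise the sketch. */
struct example_mock {
    int fail_prepare;   /* non-zero makes prepare fail */
    int prepared;
    int committed;
    int cancelled;
};

int example_mock_prepare(void *ctx)
{
    struct example_mock *m = ctx;

    if (m->fail_prepare) {
        return -1;
    }
    m->prepared = 1;
    return 0;
}

int example_mock_commit(void *ctx)
{
    struct example_mock *m = ctx;

    m->committed = 1;
    return 0;
}

int example_mock_cancel(void *ctx)
{
    struct example_mock *m = ctx;

    m->cancelled = 1;
    return 0;
}
```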

----------------------------------------------------------------------
int tdb_check(TDB_CONTEXT *tdb,
              int (*check)(TDB_DATA key, TDB_DATA data, void *private_data),
              void *private_data);

   check the consistency of the database, calling back the check function
   (if non-NULL) with each record. If some consistency check fails, or
   the supplied check function returns -1, tdb_check returns -1, otherwise
   0. Note that the logging function (if set) will be called with
   additional information on the corruption found.