# Support for UNOIDL Registry Formats

`Library_unoidl` contains the `unoidl::Manager` and `unoidl::Provider` implementations
for the following registry formats:

* The new `UNOIDL` binary `types.rdb` format.
* The legacy binary `types.rdb` format (based on modules "store" and
  "registry").
* A source-file format, reading (multiple) `UNOIDL` entity definitions directly
  from a single `.idl` source file.
* A source-tree format, reading `UNOIDL` entity definitions directly from a tree
  of `.idl` source files rooted at a given directory.  (An entity named
  `foo.bar.Baz` is expected in a file named `foo/bar/Baz.idl` within that tree.)

(While `.idl` files still contain `#include` directives for legacy idlc, the
source-based formats ignore any preprocessing directives starting with `#` in the
`.idl` files.)  `unoidl::Manager::addProvider` transparently detects the registry
format for a given URI and instantiates the corresponding provider implementation.
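
The naming rule for the source-tree format can be illustrated with a small sketch.
It is not part of `Library_unoidl`; the function name `entity_to_idl_path` is
hypothetical and only mirrors the `foo.bar.Baz` to `foo/bar/Baz.idl` mapping
described above.

```python
from pathlib import PurePosixPath


def entity_to_idl_path(entity_name):
    """Map a dotted UNOIDL entity name to its expected path below the tree root."""
    parts = entity_name.split(".")
    if not all(parts):
        # Empty name, or empty segments such as "foo..Baz".
        raise ValueError(f"malformed entity name: {entity_name!r}")
    return PurePosixPath(*parts).with_suffix(".idl")


print(entity_to_idl_path("foo.bar.Baz"))  # foo/bar/Baz.idl
```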

`Executable_unoidl-write` is a helper tool to convert from any of the registry
formats to the `UNOIDL` format.  It is used at build-time to compile `UNOIDL` format
`.rdb` files (that are used at build-time only, or included in installation sets
in `URE` or `program/types/` or as part of bundled extensions that are created
during the build and not merely included as pre-built `.oxt` files) from source
`.idl` files.

`Executable_unoidl-read` is a helper tool to convert from any of the registry
formats to the source-file format.  It can be used manually after a LibreOffice
version update to create new reference registries for `Executable_unoidl-check`.

`Executable_unoidl-check` is a helper tool to check that one registry is
backwards-compatible with another registry.  It is used at build-time to detect
inadvertent breakage of the udkapi and offapi APIs.

## Specification of the New UNOIDL types.rdb Format

The format uses byte-oriented, platform-independent, binary files.  Larger
quantities are stored LSB first, without alignment requirements.  Offsets are
32-bit, effectively limiting the overall file size to 4 GB, but that is not
considered a limitation in practice (and avoids unnecessary bloat compared to
64-bit offsets).

Annotations can be added to (non-module) entities and to certain parts of such
entities, e.g., both to an interface type definition and to a direct method of
an interface type definition.  The idea is that annotations can be added to direct
parts that form a "many-to-one" relationship; there is a tradeoff between generality
of concept and size of representation, especially for the C++ representation types
in namespace `unoidl`.  Annotations consist of arbitrary sequences of name/value
strings.  Each name/value string is encoded as a single UTF-8 string containing a
name (an arbitrary sequence of Unicode code points not containing `U+003D EQUALS SIGN`),
optionally followed by `U+003D EQUALS SIGN` and a value (an arbitrary sequence of
Unicode code points).  The only annotation name currently in use is "deprecated"
(without a value).
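
As an illustration of the name/value encoding, the following sketch splits one
annotation string at the first `U+003D EQUALS SIGN`.  The helper name
`split_annotation` is hypothetical, and returning `None` for a missing value is a
choice made here, not something the format prescribes.

```python
def split_annotation(raw):
    """Split one UTF-8 encoded annotation into (name, value or None)."""
    text = raw.decode("utf-8")
    # The name cannot contain "=", so the first "=" (if any) starts the value.
    name, sep, value = text.partition("=")
    return name, (value if sep else None)


print(split_annotation(b"deprecated"))  # ('deprecated', None)
print(split_annotation(b"key=a=b"))     # ('key', 'a=b')
```

Because only the name is barred from containing an equals sign, any further
equals signs belong to the value, which is why the split happens at the first
occurrence.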

The following definitions are used throughout:

* `UInt16`: 2-byte value, LSB first
* `UInt32`: 4-byte value, LSB first
* `UInt64`: 8-byte value, LSB first
* Offset: `UInt32` value, counting bytes from start of file
* `NUL`-Name: zero or more non-`NUL` US-ASCII bytes followed by a `NUL` byte
* Len-String: `UInt32` number of characters, with `0x80000000` bit 0, followed by
  that many US-ASCII (for `UNOIDL` related names) or UTF-8 (for annotations)
  bytes
* Idx-String: either an Offset (with `0x80000000` bit 1) of a Len-String, or a
  Len-String
* Annotations: `UInt32` number `N` of annotations followed by `N * Idx-String`
* Entry: Offset of `NUL`-Name followed by Offset of payload
* Map: zero or more Entries
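
The definitions above can be sketched as a handful of readers over an in-memory
byte buffer.  This is a minimal sketch, not the `unoidl` implementation: all
function names are hypothetical, each reader returns the decoded value together
with the position of the next unread byte, and the error handling is an
assumption about how strict a reader might want to be.

```python
import struct

HIGH_BIT = 0x80000000  # set: the word is an Offset; clear: it is an inline length


def read_uint16(buf, pos):
    return struct.unpack_from("<H", buf, pos)[0], pos + 2


def read_uint32(buf, pos):
    return struct.unpack_from("<I", buf, pos)[0], pos + 4


def read_uint64(buf, pos):
    return struct.unpack_from("<Q", buf, pos)[0], pos + 8


def read_nul_name(buf, pos):
    end = buf.index(b"\0", pos)  # raises ValueError if no NUL terminator follows
    return buf[pos:end].decode("ascii"), end + 1


def read_len_string(buf, pos, encoding="ascii"):
    length, pos = read_uint32(buf, pos)
    if length & HIGH_BIT:
        raise ValueError("Len-String length must have the 0x80000000 bit clear")
    end = pos + length
    if end > len(buf):
        raise ValueError("Len-String runs past the end of the buffer")
    return buf[pos:end].decode(encoding), end


def read_idx_string(buf, pos, encoding="ascii"):
    word, after = read_uint32(buf, pos)
    if word & HIGH_BIT:
        # Indirect form: the low 31 bits are the Offset of a Len-String.
        text, _ = read_len_string(buf, word & 0x7FFFFFFF, encoding)
        return text, after
    # Direct form: the word just read is the length of an inline Len-String.
    return read_len_string(buf, pos, encoding)


# A Len-String "foo" at offset 0, then an Idx-String at offset 7 referring to it.
data = b"\x03\x00\x00\x00foo" + struct.pack("<I", HIGH_BIT | 0)
print(read_idx_string(data, 0))  # ('foo', 7)
print(read_idx_string(data, 7))  # ('foo', 11)
```

Pass `encoding="utf-8"` when reading the strings inside Annotations, since those
are UTF-8 rather than US-ASCII.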

The file starts with an 8-byte header, followed by information about the root
map (`unoidl-write` generates files in a single depth-first pass, so the root map
itself is at the end of the file):

* 7-byte magic header `UNOIDL\xFF`
* version byte 0
* Offset of root Map
* `UInt32` number of entries of root Map
...

Files generated by `unoidl-write` follow that by a

    "\0** Created by LibreOffice " LIBO_VERSION_DOTTED " unoidl-write **\0"

banner (cf. `config_host/config_version.h.in`), as a debugging aid.  (Old versions
used `reg2unoidl` instead of `unoidl-write` in that banner.)
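
A reader might validate the file start and locate the root Map roughly as
follows.  This sketch covers only the four fields listed above; the function
names `parse_header` and `read_map_entries` are hypothetical, and the
hand-built sample omits the banner and anything else a real file may contain.

```python
import struct

MAGIC = b"UNOIDL\xff"


def parse_header(buf):
    """Return (root Map offset, root Map entry count) from the start of the file."""
    if len(buf) < 16:
        raise ValueError("file too short for header and root map information")
    if buf[:7] != MAGIC:
        raise ValueError("bad magic header")
    if buf[7] != 0:
        raise ValueError(f"unsupported version byte {buf[7]}")
    root_offset, root_count = struct.unpack_from("<II", buf, 8)
    return root_offset, root_count


def read_map_entries(buf, offset, count):
    """Return one (NUL-Name offset, payload offset) pair per Entry of a Map."""
    return [struct.unpack_from("<II", buf, offset + 8 * i) for i in range(count)]


# Hand-built sample: a NUL-Name "com" at offset 16, an empty module payload
# (kind byte 0, zero entries) at offset 20, and a one-entry root Map at offset 25.
body = b"com\0" + b"\x00" + struct.pack("<I", 0)
root_map = struct.pack("<II", 16, 20)
sample = MAGIC + b"\x00" + struct.pack("<II", 16 + len(body), 1) + body + root_map

offset, count = parse_header(sample)
print(offset, count)                            # 25 1
print(read_map_entries(sample, offset, count))  # [(16, 20)]
```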

Layout of per-entry payload in the root or a module Map:

* kind byte:

    * 0: module
        * followed by:
            * `UInt32` number `N1` of entries of Map
            * `N1 * Entry`

    * otherwise:
        * `0x80` bit: 1 if published
        * `0x40` bit: 1 if annotated
        * `0x20` bit: flag (may only be 1 for certain kinds, see below)
        * remaining bits:

            * 1: enum type
                * followed by:
                    * `UInt32` number `N1` of members
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * `UInt32`
                        * if annotated: Annotations

            * 2: plain struct type (with base if flag is 1)
                * followed by:
                    * if "with base": `Idx-String`
                    * `UInt32` number `N1` of direct members
                    * `N1 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 3: polymorphic struct type template
                * followed by:
                    * `UInt32` number `N1` of type parameters
                    * `N1 * Idx-String`
                    * `UInt32` number `N2` of members
                    * `N2 * tuple` of:
                        * kind byte: `0x01` bit is 1 if parameterized type
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 4: exception type (with base if flag is 1)
                * followed by:
                    * if "with base": `Idx-String`
                    * `UInt32` number `N1` of direct members
                    * `N1 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 5: interface type
                * followed by:
                    * `UInt32` number `N1` of direct mandatory bases
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N2` of direct optional bases
                    * `N2 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N3` of direct attributes
                    * `N3 * tuple` of:
                        * kind byte:
                            * `0x02` bit: 1 if read-only
                            * `0x01` bit: 1 if bound
                        * `Idx-String` name
                        * `Idx-String` type
                        * `UInt32` number `N4` of get exceptions
                        * `N4 * Idx-String`
                        * `UInt32` number `N5` of set exceptions
                        * `N5 * Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N6` of direct methods
                    * `N6 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` return type
                        * `UInt32` number `N7` of parameters
                        * `N7 * tuple` of:
                            * direction byte: 0 for in, 1 for out, 2 for in-out
                            * `Idx-String` name
                            * `Idx-String` type
                        * `UInt32` number `N8` of exceptions
                        * `N8 * Idx-String`
                        * if annotated: Annotations

            * 6: typedef
                * followed by:
                    * `Idx-String`

            * 7: constant group
                * followed by:
                    * `UInt32` number `N1` of entries of Map
                    * `N1 * Entry`

            * 8: single-interface--based service (with default constructor if flag is 1)
                * followed by:
                    * `Idx-String`
                    * if not "with default constructor":
                        * `UInt32` number `N1` of constructors
                        * `N1 * tuple` of:
                            * `Idx-String`
                            * `UInt32` number `N2` of parameters
                            * `N2 * tuple` of:
                                * kind byte: `0x04` bit is 1 if rest parameter
                                * `Idx-String` name
                                * `Idx-String` type
                            * `UInt32` number `N3` of exceptions
                            * `N3 * Idx-String`
                            * if annotated: Annotations

            * 9: accumulation-based service
                * followed by:
                    * `UInt32` number `N1` of direct mandatory base services
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N2` of direct optional base services
                    * `N2 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N3` of direct mandatory base interfaces
                    * `N3 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N4` of direct optional base interfaces
                    * `N4 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N5` of direct properties
                    * `N5 * tuple` of:
                        * `UInt16` kind:
                            * `0x0100` bit: 1 if optional
                            * `0x0080` bit: 1 if removable
                            * `0x0040` bit: 1 if maybedefault
                            * `0x0020` bit: 1 if maybeambiguous
                            * `0x0010` bit: 1 if readonly
                            * `0x0008` bit: 1 if transient
                            * `0x0004` bit: 1 if constrained
                            * `0x0002` bit: 1 if bound
                            * `0x0001` bit: 1 if maybevoid
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations

            * 10: interface-based singleton
                * followed by:
                    * `Idx-String`

            * 11: service-based singleton
                * followed by:
                    * `Idx-String`

        * if annotated, followed by: Annotations
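
The kind byte of such a payload can be unpacked as in the following sketch.  The
names `EntityKind` and `decode_kind_byte` are hypothetical.  Rejecting a set
flag bit on kinds other than 2, 4, and 8 is this sketch's reading of "may only
be 1 for certain kinds"; a real reader may be more lenient.

```python
from typing import NamedTuple

KIND_NAMES = {
    0: "module",
    1: "enum type",
    2: "plain struct type",
    3: "polymorphic struct type template",
    4: "exception type",
    5: "interface type",
    6: "typedef",
    7: "constant group",
    8: "single-interface--based service",
    9: "accumulation-based service",
    10: "interface-based singleton",
    11: "service-based singleton",
}

# Kinds for which the 0x20 flag bit carries a meaning in the layout above.
FLAG_ALLOWED = {2, 4, 8}


class EntityKind(NamedTuple):
    kind: int
    name: str
    published: bool
    annotated: bool
    flag: bool


def decode_kind_byte(byte):
    """Split a kind byte into the kind number and its three flag bits."""
    if byte == 0:
        return EntityKind(0, KIND_NAMES[0], False, False, False)
    kind = byte & 0x1F
    if kind == 0 or kind not in KIND_NAMES:
        raise ValueError(f"unknown kind byte 0x{byte:02X}")
    flag = bool(byte & 0x20)
    if flag and kind not in FLAG_ALLOWED:
        raise ValueError(f"flag bit not allowed for kind {kind}")
    return EntityKind(kind, KIND_NAMES[kind], bool(byte & 0x80), bool(byte & 0x40), flag)


print(decode_kind_byte(0xC5))
# EntityKind(kind=5, name='interface type', published=True, annotated=True, flag=False)
```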

Layout of per-entry payload in a constant group Map:

* kind byte:
    * `0x80` bit: 1 if annotated
    * remaining bits:

        * 0: `BOOLEAN`
            * followed by value byte, 0 represents false, 1 represents true

        * 1: `BYTE`
            * followed by value byte, representing values with two's complement

        * 2: `SHORT`
            * followed by `UInt16` value, representing values with two's complement

        * 3: `UNSIGNED SHORT`
            * followed by `UInt16` value

        * 4: `LONG`
            * followed by `UInt32` value, representing values with two's complement

        * 5: `UNSIGNED LONG`
            * followed by `UInt32` value

        * 6: `HYPER`
            * followed by `UInt64` value, representing values with two's complement

        * 7: `UNSIGNED HYPER`
            * followed by `UInt64` value

        * 8: `FLOAT`
            * followed by 4-byte value, representing values in ISO/IEC 60559 binary32
              format, LSB first

        * 9: `DOUBLE`
            * followed by 8-byte value, representing values in ISO/IEC 60559 binary64
              format, LSB first

* if annotated, followed by: Annotations
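
The constant payloads map directly onto little-endian `struct` format codes, as
the following sketch shows.  The table `CONSTANT_FORMATS` and the function
`read_constant` are hypothetical names; the sketch decodes only the kind byte and
the value, and leaves any trailing Annotations to the caller.

```python
import struct

# kind number -> (UNOIDL type name, little-endian struct format)
CONSTANT_FORMATS = {
    0: ("BOOLEAN", "<B"),
    1: ("BYTE", "<b"),
    2: ("SHORT", "<h"),
    3: ("UNSIGNED SHORT", "<H"),
    4: ("LONG", "<i"),
    5: ("UNSIGNED LONG", "<I"),
    6: ("HYPER", "<q"),
    7: ("UNSIGNED HYPER", "<Q"),
    8: ("FLOAT", "<f"),
    9: ("DOUBLE", "<d"),
}


def read_constant(buf, pos):
    """Return (type name, value, annotated, position after the value)."""
    kind_byte = buf[pos]
    annotated = bool(kind_byte & 0x80)
    kind = kind_byte & 0x7F
    if kind not in CONSTANT_FORMATS:
        raise ValueError(f"unknown constant kind {kind}")
    name, fmt = CONSTANT_FORMATS[kind]
    (value,) = struct.unpack_from(fmt, buf, pos + 1)
    if kind == 0:
        if value not in (0, 1):
            raise ValueError("BOOLEAN value byte must be 0 or 1")
        value = bool(value)
    return name, value, annotated, pos + 1 + struct.calcsize(fmt)


print(read_constant(b"\x02\xfe\xff", 0))  # ('SHORT', -2, False, 3)
print(read_constant(b"\x81\xff", 0))      # ('BYTE', -1, True, 2)
```

The signed format codes (`b`, `h`, `i`, `q`) perform the two's complement
interpretation, so no manual sign handling is needed.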