\documentclass[12pt, a4paper]{article}

\usepackage{amssymb,amsmath,latexsym}
\usepackage[english]{babel}
\usepackage{hhline}

\newcommand{\code}[1]{{\tt #1}}

\newenvironment{nitemize}{
	\newdimen\oldparindent
	\oldparindent=\parindent
	\begin{itemize}
	\itemindent=-\oldparindent
}{
	\end{itemize}
}

%\newcommand{\codeblockspace}{\vspace{12pt}}

% begin/end a code block
\newcommand{\codeblockbegin}{\begin{flushleft}\begin{minipage}{\textwidth}}
\newcommand{\codeblockend}{\end{minipage}\end{flushleft}}

\begin{document}

\sloppy

\title{The Resources Format}
%\date{}
\author{Ingo Weinhold (bonefish@users.sf.net)}

\maketitle
\tableofcontents

\section{Introduction}
\label{introduction}

Resources provide a means to store structured but flat data in files. Unlike
attributes, resources are part of the file contents and thus do not require
special file system handling, but rather a special file format.
On the one hand there are file formats that exclusively contain resources
(resource files), on the other hand there are file formats extended to
additionally contain resources -- namely the ELF and PEF object formats.
In either case the format of the chunk of data that frames the resources
themselves is the same. We call it the resources format.

Section \ref{file-formats} explains how the resources format is embedded in
different file formats. Section \ref{resources-format} discusses the resources
format itself. In section \ref{implementations} we focus on the robustness of
implementations that read and write resources.
The final section comments on the status of the information provided
by this document.

\section{File Formats}
\label{file-formats}

In all file formats described in this section the resources are located
at the end of the file. The resources themselves are completely independent
of their location.


\subsection{x86 Resource Files}

x86 resource files introduce the least overhead. The resources start directly
after the magic number identifying the file format:

\codeblockbegin
\begin{verbatim}
const char kX86ResourceFileMagic[4] = { 'R', 'S', 0, 0 };
const uint32 kX86ResourcesOffset = 0x00000004;
\end{verbatim}
\codeblockend

The resources start at \code{kX86ResourcesOffset}.
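
\noindent
The magic number check described above might be sketched in C as follows.
This is only an illustration: the function name
\code{find\_x86\_resources} is hypothetical, and \code{uint32\_t} from
\code{<stdint.h>} stands in for \code{uint32}. The check is strict, i.e. it
requires all four magic bytes to match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const char kX86ResourceFileMagic[4] = { 'R', 'S', 0, 0 };
static const uint32_t kX86ResourcesOffset = 0x00000004;

/* Returns true if `data` (the first `size` bytes of a file) starts with
   the x86 resource file magic. On success *resourcesOffset receives the
   file offset at which the resources begin.
   (Hypothetical helper, not part of any existing API.) */
static bool
find_x86_resources(const uint8_t *data, size_t size,
	uint32_t *resourcesOffset)
{
	assert(data != NULL && resourcesOffset != NULL);
	if (size < sizeof(kX86ResourceFileMagic))
		return false;
	if (memcmp(data, kX86ResourceFileMagic,
			sizeof(kX86ResourceFileMagic)) != 0) {
		return false;
	}
	*resourcesOffset = kX86ResourcesOffset;
	return true;
}
```
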


\subsection{PPC Resource Files}

PPC resource files begin with a PEF container header, after which the
resources start.

\codeblockbegin
\begin{verbatim}
typedef char PefOSType[4];

struct PEFContainerHeader {
	PefOSType tag1;
	PefOSType tag2;
	PefOSType architecture;
	uint32 formatVersion;
	uint32 dateTimeStamp;
	uint32 oldDefVersion;
	uint32 oldImpVersion;
	uint32 currentVersion;
	uint16 sectionCount;
	uint16 instSectionCount;
	uint32 reservedA;
};
\end{verbatim}
\codeblockend

\codeblockbegin
\begin{verbatim}
const char kPEFFileMagic1[4] = { 'J', 'o', 'y', '!' };
const char kPPCResourceFileMagic[4] = { 'r', 'e', 's', 'f' };
const uint32 kPPCResourcesOffset = 0x00000028;
\end{verbatim}
\codeblockend

\begin{nitemize}
\item{\code{tag1}:
Must be \code{kPEFFileMagic1}.}
\item{\code{tag2}:
Must be \code{kPPCResourceFileMagic}.}
\item{All other fields must be set to 0.}
\end{nitemize}

\noindent
The resources start at \code{kPPCResourcesOffset}.
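
\noindent
The three constraints listed above can be checked without interpreting any
multi-byte field, since every field other than the two tags must be 0. The
following C sketch works on the raw bytes for that reason; the function name
\code{is\_ppc\_resource\_file} is hypothetical, and the field offsets (0 for
\code{tag1}, 4 for \code{tag2}) follow from the structure above under the
assumption that it is laid out without padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const char kPEFFileMagic1[4] = { 'J', 'o', 'y', '!' };
static const char kPPCResourceFileMagic[4] = { 'r', 'e', 's', 'f' };
static const uint32_t kPPCResourcesOffset = 0x00000028;

/* Returns true if the first `size` bytes of a file form a valid PPC
   resource file header: tag1 and tag2 carry the two magics and every
   other byte of the 0x28 byte PEF container header is 0.
   (Hypothetical helper, not part of any existing API.) */
static bool
is_ppc_resource_file(const uint8_t *data, size_t size)
{
	size_t i;

	assert(data != NULL);
	if (size < kPPCResourcesOffset)
		return false;
	if (memcmp(data, kPEFFileMagic1, 4) != 0)
		return false;
	if (memcmp(data + 4, kPPCResourceFileMagic, 4) != 0)
		return false;
	/* architecture, versions, section counts and reservedA must be 0 */
	for (i = 8; i < kPPCResourcesOffset; i++) {
		if (data[i] != 0)
			return false;
	}
	return true;
}
```
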


\subsection{ELF Object Files}

In an ELF file, resources are appended to rather than contained in the
regular data of the file. That is, adding resources to an existing ELF file
will not cause any modification to its data (i.e. ELF header, program header
table, section header table or sections), but will enlarge the file by some
alignment padding and, of course, the resources themselves.

Therefore two values have to be known: the size of the actual ELF file and the
block size to which the resources must be aligned. As ELF files do not contain
a size field, the end of the file has to be deduced. This end offset is
supposed to be the maximum of the end offsets of the ELF header, program header
table (if any), section header table (if any), sections and segments.

The block size to which the resources have to be aligned is the maximum of
\code{kELFMinResourceAlignment} and the alignments of the segments in the file.

\begin{verbatim}
const uint32 kELFMinResourceAlignment = 32;
\end{verbatim}

The data used for the padding between the end of the actual ELF data and the
beginning of the resources may be arbitrary.
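
\noindent
Given the deduced end offset and the segment alignments, the resources offset
is the end offset rounded up to the block size. A minimal C sketch follows.
The function name \code{elf\_resources\_offset} and its parameters are
hypothetical; it is assumed that the caller has already parsed the ELF file,
computed the end offset as described above, and collected the alignment of
each segment (the \code{p\_align} field of the program headers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const uint32_t kELFMinResourceAlignment = 32;

/* Returns the offset at which the resources are expected to start.
   elfEnd:            deduced end offset of the actual ELF data
   segmentAlignments: alignment of each segment, may be NULL if
                      segmentCount is 0
   (Hypothetical helper, not part of any existing API.) */
static uint64_t
elf_resources_offset(uint64_t elfEnd, const uint64_t *segmentAlignments,
	size_t segmentCount)
{
	uint64_t alignment = kELFMinResourceAlignment;
	size_t i;

	assert(segmentAlignments != NULL || segmentCount == 0);
	/* block size: maximum of the minimum and all segment alignments */
	for (i = 0; i < segmentCount; i++) {
		if (segmentAlignments[i] > alignment)
			alignment = segmentAlignments[i];
	}
	/* round the end offset up to the next multiple of the block size */
	return (elfEnd + alignment - 1) / alignment * alignment;
}
```

\noindent
For instance, an object file without segments whose data end at offset 100
would get its resources at offset 128 under these assumptions.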


\subsection{PEF Object Files}

Similar to ELF files, the resources are simply appended to the regular data of
a PEF file, but they are not aligned to any value. That is, the resources
start directly after the last PEF section without any padding.
As no field exists that specifies the size of the PEF container (the
regular data), it has to be deduced by iterating through the PEF section
headers.


\section{The Resources Format}
\label{resources-format}

This section describes the resources format. A subsection that outlines
the general layout is followed by subsections discussing the major parts.

A general remark regarding the byte ordering: Resources have no standard
endianness, that is, the resources created by little endian and big endian
machines differ. Usually it should be possible to deduce the endianness
from the type of the file. x86 resource files contain little endian data, PPC
resource files big endian data. The endianness of an ELF file is encoded in
its header.

As there is in fact no good reason to have different resource file formats,
even if they differ only in the format of the header (see section
\ref{file-formats}), it may be decided to use the x86 resource file format
also for big endian machines. In that case the endianness can still be
deduced from the first field of the resources header
(\code{rh\_resources\_magic}, see subsection \ref{resources-header}).
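
\noindent
Deducing the endianness from \code{rh\_resources\_magic} amounts to reading
the first word in host byte order and comparing it against the magic and its
byte-swapped form. The following C sketch illustrates this; the enumeration
and function names are hypothetical, and the magic value is the
\code{kResourcesHeaderMagic} defined in subsection \ref{resources-header}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const uint32_t kResourcesHeaderMagic = 0x444f1000;

/* Hypothetical result type for the sketch. */
enum resources_endianness {
	kResourcesHostEndian,		/* same byte order as this machine */
	kResourcesSwappedEndian,	/* all words need to be swapped */
	kResourcesUnknownEndian		/* magic not recognized */
};

static uint32_t
swap_uint32(uint32_t value)
{
	return (value >> 24) | ((value >> 8) & 0x0000ff00)
		| ((value << 8) & 0x00ff0000) | (value << 24);
}

/* Inspects the first word of the resources (rh_resources_magic). */
static enum resources_endianness
detect_resources_endianness(const uint8_t *resources, size_t size)
{
	uint32_t magic;

	assert(resources != NULL);
	if (size < sizeof(magic))
		return kResourcesUnknownEndian;
	memcpy(&magic, resources, sizeof(magic));
	if (magic == kResourcesHeaderMagic)
		return kResourcesHostEndian;
	if (swap_uint32(magic) == kResourcesHeaderMagic)
		return kResourcesSwappedEndian;
	return kResourcesUnknownEndian;
}
```
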


\subsection{Resources Layout}

The layout of the resources in a file is shown in figure
\ref{fig:resources-layout}.

\begin{figure}[h!tb]
\begin{center}
\begin{tabular}{|c|c|c|}
\hline
 & \multicolumn{2}{c|}{resources header}\\
\hhline{|~==|}
admin section & & index section header \\
\hhline{~~|-|}
 & index section & resource index \\
\hhline{~~|-|}
 & & padding \\
\hhline{:===:}
\multicolumn{3}{|c|}{unknown section}\\
\hhline{:===:}
\multicolumn{3}{|c|}{data section}\\
\hhline{|~~~|}
\hhline{:===:}
\multicolumn{3}{|c|}{info section}\\
\hline
\end{tabular}
\end{center}
\caption{The Resources Layout.}
\label{fig:resources-layout}
\end{figure}

\noindent
There are four sections:

\begin{itemize}
\item{An administrative section which comprises the resources header and the
resource index section. The latter locates all other data in the file.
}
\item{An unknown section, whose purpose is (unsurprisingly) unknown, but which
seems to be unused, always containing the same data.
}
\item{A data section holding the actual resource data.
}
\item{An info section, which provides additional information for each resource,
such as type, ID and name.
}
\end{itemize}


\subsection{Resources Header}
\label{resources-header}

The resources header has the following structure:

\codeblockbegin
\begin{verbatim}
struct resources_header {
	uint32 rh_resources_magic;
	uint32 rh_resource_count;
	uint32 rh_index_section_offset;
	uint32 rh_admin_section_size;
	uint32 rh_pad[13];
};
\end{verbatim}
\codeblockend

\codeblockbegin
\begin{verbatim}
const uint32 kResourcesHeaderMagic = 0x444f1000;
const uint32 kResourceIndexSectionOffset = 0x00000044;
const uint32 kResourceIndexSectionAlignment = 0x00000600;
\end{verbatim}
\codeblockend

\begin{nitemize}
\item{\code{rh\_resources\_magic}:
Must be \code{kResourcesHeaderMagic}.
}
\item{\code{rh\_resource\_count}:
Specifies the number of resources stored in this file. May be 0.
}
\item{\code{rh\_index\_section\_offset}:
Specifies the offset of the resource index section relative to the beginning
of the resources. An alternative interpretation may be the size of the
resources header.
Must be \code{kResourceIndexSectionOffset}.
}
\item{\code{rh\_admin\_section\_size}:
Specifies the size of the administrative section.
Must be \code{kResourceIndexSectionOffset} plus a multiple of
\code{kResourceIndexSectionAlignment}.
}
\item{\code{rh\_pad}:
Padding. Consists of \code{0x00000000} words.
}
\end{nitemize}
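
\noindent
The constraints on the header fields translate directly into a validation
routine. The C sketch below assumes the header has already been converted to
host byte order; the function name \code{is\_valid\_resources\_header} is
hypothetical. Rejecting non-zero \code{rh\_pad} words is a strict reading of
the description above and is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct resources_header {
	uint32_t	rh_resources_magic;
	uint32_t	rh_resource_count;
	uint32_t	rh_index_section_offset;
	uint32_t	rh_admin_section_size;
	uint32_t	rh_pad[13];
};

static const uint32_t kResourcesHeaderMagic = 0x444f1000;
static const uint32_t kResourceIndexSectionOffset = 0x00000044;
static const uint32_t kResourceIndexSectionAlignment = 0x00000600;

/* Checks a host endian resources header against the rules above.
   (Hypothetical helper, not part of any existing API.) */
static bool
is_valid_resources_header(const struct resources_header *header)
{
	size_t i;

	assert(header != NULL);
	if (header->rh_resources_magic != kResourcesHeaderMagic)
		return false;
	if (header->rh_index_section_offset != kResourceIndexSectionOffset)
		return false;
	/* admin section size: header size plus a multiple of the alignment */
	if (header->rh_admin_section_size < kResourceIndexSectionOffset)
		return false;
	if ((header->rh_admin_section_size - kResourceIndexSectionOffset)
			% kResourceIndexSectionAlignment != 0) {
		return false;
	}
	/* assumption: padding words are required to be 0 */
	for (i = 0; i < 13; i++) {
		if (header->rh_pad[i] != 0)
			return false;
	}
	return true;
}
```
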


\subsection{Resource Index Section}
\label{resources-index}

The resource index section starts with a header, followed by a table of
\code{resource\_index\_entry} structures that locates the data of each
resource. The section ends with a special padding.

\noindent
The resource index header has the following structure:

\codeblockbegin
\begin{verbatim}
struct resource_index_section_header {
	uint32 rish_index_section_offset;
	uint32 rish_index_section_size;
	uint32 rish_unused_data1;
	uint32 rish_unknown_section_offset;
	uint32 rish_unknown_section_size;
	uint32 rish_unused_data2[25];
	uint32 rish_info_table_offset;
	uint32 rish_info_table_size;
	uint32 rish_unused_data3;
};
\end{verbatim}
\codeblockend

\begin{verbatim}
const uint32 kUnknownResourceSectionSize = 0x00000168;
\end{verbatim}

\begin{nitemize}
\item{\code{rish\_index\_section\_offset}:
Specifies the offset of the resource index section relative to the beginning
of the resources. An alternative interpretation may be the size of the
resources header.
Must be \code{kResourceIndexSectionOffset}.
}
\item{\code{rish\_index\_section\_size}:
Specifies the size of the resource index section.
Must be a multiple of \code{kResourceIndexSectionAlignment}.
}
\item{\code{rish\_unused\_data1}:
Contains special data as described in section \ref{resources-unknown}.
}
\item{\code{rish\_unknown\_section\_offset}:
Specifies the offset of the unknown section relative to the beginning
of the resources.
Must be the same value as given in the resources header for
\code{rh\_admin\_section\_size}.
}
\item{\code{rish\_unknown\_section\_size}:
Specifies the size of the unknown section.
Must be \code{kUnknownResourceSectionSize}.
}
\item{\code{rish\_unused\_data2}:
Contains special data as described in section \ref{resources-unknown}.
}
\item{\code{rish\_info\_table\_offset}:
Specifies the offset of the resource info table relative to the beginning
of the resources.
}
\item{\code{rish\_info\_table\_size}:
Specifies the size of the resource info table.
}
\item{\code{rish\_unused\_data3}:
Contains special data as described in section \ref{resources-unknown}.
}
\end{nitemize}

The header is directly followed, without padding, by a table of
\code{resource\_index\_entry} structures. The number of entries in the table
is the number of resources stored in the file, that is, the value specified
by the \code{rh\_resource\_count} member of the resources header. Since the
entries are stored without padding, the size of the table is exactly the
product of the size of \code{resource\_index\_entry} and the number of
resources. If the latter is 0, the table takes no space.

\codeblockbegin
\begin{verbatim}
struct resource_index_entry {
	uint32 rie_offset;
	uint32 rie_size;
	uint32 rie_pad;
};
\end{verbatim}
\codeblockend

\begin{nitemize}
\item{\code{rie\_offset}:
Specifies the offset of the resource data relative to the beginning
of the resources.
}
\item{\code{rie\_size}:
Specifies the size of the resource data.
}
\item{\code{rie\_pad}:
Padding. Must be \code{0x00000000}.
}
\end{nitemize}

Since the size of the resource index section must be a multiple of
\code{kResourceIndexSectionAlignment}, some padding may be needed at the end
of this section. What this padding looks like is described in section
\ref{resources-unknown}.
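
\noindent
The size rules above can be combined into a small computation. The C sketch
below assumes that \code{rish\_index\_section\_size} covers the section
header, the entry table and the trailing padding, which is one reading of
the description but has not been verified independently. The constants
\code{kResourceIndexSectionHeaderSize} (33 words, i.e. 132 bytes) and
\code{kResourceIndexEntrySize} (12 bytes) are hypothetical names for the
sizes of the two structures given above.

```c
#include <assert.h>
#include <stdint.h>

static const uint32_t kResourceIndexSectionOffset = 0x00000044;
static const uint32_t kResourceIndexSectionAlignment = 0x00000600;
/* hypothetical names for the sizes of the structures defined above */
static const uint32_t kResourceIndexSectionHeaderSize = 132;
static const uint32_t kResourceIndexEntrySize = 12;

/* Size of the index section for `resourceCount` resources: header plus
   entry table, padded to a multiple of the section alignment.
   (Assumption: the header and padding count towards the section size.) */
static uint64_t
resource_index_section_size(uint32_t resourceCount)
{
	uint64_t size = kResourceIndexSectionHeaderSize
		+ (uint64_t)resourceCount * kResourceIndexEntrySize;
	uint64_t alignment = kResourceIndexSectionAlignment;

	assert(alignment != 0);
	return (size + alignment - 1) / alignment * alignment;
}

/* Size of the administrative section, i.e. the expected value of
   rh_admin_section_size and rish_unknown_section_offset. */
static uint64_t
resources_admin_section_size(uint32_t resourceCount)
{
	return kResourceIndexSectionOffset
		+ resource_index_section_size(resourceCount);
}
```

\noindent
Under these assumptions a single block of \code{0x600} bytes has room for up
to 117 index entries; the 118th entry requires a second block.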


\subsection{Unknown Section}
\label{resources-unknown}

The meaning of this section is unknown. It does not seem to be used at all.
It always contains the same data given by \code{kUnusedResourceDataPattern}:

\codeblockbegin
\begin{verbatim}
const uint32 kUnusedResourceDataPattern[3] = {
	0xffffffff, 0x000003e9, 0x00000000
};
\end{verbatim}
\codeblockend

In section \ref{resources-index} some members were named \code{unused\_data}.
These fields contain the same kind of data. To understand what the value for a
certain field of this type is, it may help to imagine that before the
resources are written to a file, the space they will take is filled with the
pattern specified by \code{kUnusedResourceDataPattern}, and that only those
fields are written that are not unused. Thus the original pattern can be seen
through at the unused locations.

To be precise: Let \verb|uint32 resources[]| be the resources and
\code{index} the index of an unused field in \code{resources}, then it holds:

\begin{verbatim}
resources[index] == kUnusedResourceDataPattern[index % 3];
\end{verbatim}
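
\noindent
The mental model described above (fill first, then overwrite the used
fields) can be written down directly. In the following C sketch the function
names are hypothetical; \code{resources} is assumed to be a word array that
starts at the beginning of the resources, so that word indices and pattern
phase line up as in the relation above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static const uint32_t kUnusedResourceDataPattern[3] = {
	0xffffffff, 0x000003e9, 0x00000000
};

/* Fills `wordCount` words with the unused data pattern, as a writer
   could do before storing the fields that are actually used. */
static void
fill_with_unused_pattern(uint32_t *resources, size_t wordCount)
{
	size_t index;

	assert(resources != NULL || wordCount == 0);
	for (index = 0; index < wordCount; index++)
		resources[index] = kUnusedResourceDataPattern[index % 3];
}

/* Returns true if the word at `index` still shows the pattern. */
static bool
shows_unused_pattern(const uint32_t *resources, size_t index)
{
	assert(resources != NULL);
	return resources[index] == kUnusedResourceDataPattern[index % 3];
}
```
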


\subsection{Resource Info Table}
\label{resources-infotable}

The resource info table features exactly one entry for each resource.
Such an entry (resource info) specifies the ID, index and name of a
resource. Subsequent infos for resources of the same type are collected in
a block that starts with a type field.

The following grammar specifies the layout of the resource info table.
Nonterminals start with an upper case letter, terminals with a lower case
letter.

\begin{verbatim}
ResourceInfoTable     ::= [ ResourceBlockList ]
                          ResourceInfoSeparator
                          ResourceInfoTableEnd

ResourceBlockList     ::= ResourceBlock
                          [ ResourceInfoSeparator
                            ResourceBlockList ]

ResourceBlock         ::= type ResourceInfoList

ResourceInfoList      ::= ResourceInfo [ ResourceInfoList ]

ResourceInfo          ::= id index name_size name

ResourceInfoSeparator ::= 0xffffffff 0xffffffff

ResourceInfoTableEnd  ::= check_sum 0x00000000
\end{verbatim}

The relevant structures follow:

\codeblockbegin
\begin{verbatim}
struct resource_info_block {
	type_code rib_type;
	resource_info rib_info[1];
};
\end{verbatim}
\codeblockend

\begin{nitemize}
\item{\code{rib\_type}:
Specifies the type of the resources in the block.
}
\item{\code{rib\_info}:
Is the first resource info of the block. More infos may follow.
}
\end{nitemize}

\codeblockbegin
\begin{verbatim}
struct resource_info {
	int32 ri_id;
	int32 ri_index;
	uint16 ri_name_size;
	char ri_name[1];
};
\end{verbatim}
\codeblockend

\begin{verbatim}
const uint32 kMinResourceInfoSize = 10;
\end{verbatim}

\begin{nitemize}
\item{\code{ri\_id}:
Specifies the ID of the resource.
}
\item{\code{ri\_index}:
Specifies the index of the resource this resource info refers to.
}
\item{\code{ri\_name\_size}:
Specifies the size of the resource name. May be 0 -- then the resource does
not have a name and \code{ri\_name} has a size of 0.
}
\item{\code{ri\_name}:
Specifies the name of the resource. The name must be null terminated.
\code{ri\_name\_size} specifies the size of this field (including the
terminating null). If it is 0, the resource does not have a name and
\code{ri\_name} is empty, i.e. has size 0.
}
\item{\code{kMinResourceInfoSize}:
Is the minimal size of a resource info. That is the size it has, if the
resource does not have a name.
}
\end{nitemize}

\codeblockbegin
\begin{verbatim}
struct resource_info_separator {
	uint32 ris_value1;
	uint32 ris_value2;
};
\end{verbatim}
\codeblockend

\begin{nitemize}
\item{\code{ris\_value1}:
Specifies the first word of the separator.
Must be \code{0xffffffff}.
}
\item{\code{ris\_value2}:
Specifies the second word of the separator.
Must be \code{0xffffffff}.
}
\end{nitemize}
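
\noindent
To illustrate how the grammar and the structures fit together, the following
C sketch walks a resource info table that is already in host byte order. It
rests on several assumptions that the description does not settle: the table
size is known exactly (e.g. from \code{rish\_info\_table\_size}), so the
\code{ResourceInfoTableEnd} occupies the last eight bytes, and no resource
info starts with both \code{id} and \code{index} equal to
\code{0xffffffff}, so a separator can be told apart from the next info. The
check sum is not verified here. All names introduced by the sketch
(\code{parsed\_resource\_info}, \code{parse\_resource\_info\_table} and the
helpers) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static const uint32_t kMinResourceInfoSize = 10;

/* One decoded ResourceInfo together with the type of its block. */
struct parsed_resource_info {
	uint32_t	type;
	int32_t		id;
	int32_t		index;
	const char	*name;	/* points into the table, NULL if unnamed */
};

static uint32_t
read_uint32(const uint8_t *data)
{
	uint32_t value;
	memcpy(&value, data, sizeof(value));
	return value;
}

static int
is_separator(const uint8_t *data)
{
	return read_uint32(data) == 0xffffffff
		&& read_uint32(data + 4) == 0xffffffff;
}

/* Parses a host endian resource info table of exactly `size` bytes.
   Returns the number of infos stored in `infos`, or -1 if the table is
   malformed or holds more than `maxInfos` entries. */
static int
parse_resource_info_table(const uint8_t *table, size_t size,
	struct parsed_resource_info *infos, int maxInfos)
{
	size_t pos = 0;
	size_t end;	/* offset of the ResourceInfoTableEnd */
	int count = 0;

	assert(table != NULL && infos != NULL);
	if (size < 16)
		return -1;
	end = size - 8;
	if (read_uint32(table + end + 4) != 0)
		return -1;	/* terminator missing */
	/* Empty block list: only the separator precedes the table end. */
	if (end == 8)
		return is_separator(table) ? 0 : -1;

	while (pos < end) {
		uint32_t type;

		/* ResourceBlock ::= type ResourceInfoList */
		if (end - pos < 4)
			return -1;
		type = read_uint32(table + pos);
		pos += 4;
		do {
			uint16_t nameSize;

			/* ResourceInfo ::= id index name_size name */
			if (end - pos < kMinResourceInfoSize || count == maxInfos)
				return -1;
			memcpy(&nameSize, table + pos + 8, sizeof(nameSize));
			if (end - pos - kMinResourceInfoSize < nameSize)
				return -1;
			if (nameSize > 0 && table[pos + 9 + nameSize] != 0)
				return -1;	/* name not null terminated */
			infos[count].type = type;
			memcpy(&infos[count].id, table + pos, 4);
			memcpy(&infos[count].index, table + pos + 4, 4);
			infos[count].name = nameSize > 0
				? (const char *)(table + pos + 10) : NULL;
			count++;
			pos += kMinResourceInfoSize + nameSize;
		} while (end - pos < 8 || !is_separator(table + pos));
		pos += 8;	/* skip the ResourceInfoSeparator */
	}
	return count;
}
```
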

\codeblockbegin
\begin{verbatim}
struct resource_info_table_end {
	uint32 rite_check_sum;
	uint32 rite_terminator;
};
\end{verbatim}
\codeblockend

\begin{nitemize}
\item{\code{rite\_check\_sum}:
Contains the check sum for the resource info table. The check sum is
calculated from all bytes of the resource info table not including
\code{rite\_check\_sum} and \code{rite\_terminator}. The data are grouped
into four byte blocks, which are interpreted as big endian unsigned words
and summed up, ignoring carry. If the number of bytes to be considered is
not divisible by four, the remaining bytes are interpreted as the lower
bytes of a big endian unsigned word (the upper byte(s) set to 0).
}
\item{\code{rite\_terminator}:
Terminates the resource info table.
Must be \code{0x00000000}.
}
\end{nitemize}
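
\noindent
The check sum rule given for \code{rite\_check\_sum} can be sketched in C as
follows. The function name \code{resource\_info\_table\_check\_sum} is
hypothetical; \code{data} is assumed to point to the raw bytes of the
resource info table and \code{size} to exclude the eight bytes of the
\code{resource\_info\_table\_end}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sums up the table bytes as big endian 32 bit words, ignoring carry.
   (Hypothetical helper, not part of any existing API.) */
static uint32_t
resource_info_table_check_sum(const uint8_t *data, size_t size)
{
	uint32_t sum = 0;
	size_t i = 0;

	assert(data != NULL || size == 0);
	/* complete four byte blocks */
	for (; i + 4 <= size; i += 4) {
		sum += ((uint32_t)data[i] << 24) | ((uint32_t)data[i + 1] << 16)
			| ((uint32_t)data[i + 2] << 8) | (uint32_t)data[i + 3];
	}
	/* one to three remaining bytes form the lower bytes of a last word */
	if (i < size) {
		uint32_t word = 0;
		for (; i < size; i++)
			word = (word << 8) | data[i];
		sum += word;
	}
	return sum;
}
```

\noindent
Unsigned arithmetic in C wraps around on overflow, which matches the
``ignoring carry'' requirement without further measures.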


\section{Implementations}
\label{implementations}

Code that writes resources should strictly stick to the specification
presented in the preceding sections to achieve maximal compatibility.

Implementations that read resources may tolerate certain deviations that,
for instance, occur in several files of the BeOS R5 distribution
and that are handled gracefully by xres and QuickRes. A possibly
incomplete list follows:

\begin{itemize}
\item{The third and fourth byte of the x86 resource file magic (the 0 bytes)
may be arbitrary bytes.}
\item{\code{rh\_resource\_count} may be unreliable. The resource index table
should be read until its end, which is either marked by the unused data
pattern (see section \ref{resources-unknown}) or at the latest by the
beginning of the unknown section.}
\item{The resource info table may contain entries for indices that are out
of range, i.e. greater than the number of resources induced by the resource
index table. Those entries should be ignored.}
\item{The resource info table may contain multiple entries for an index.
Any such entry after the first one should be ignored.}
\item{The resource info table may not contain an entry for an index.
The respective resource should be ignored.}
\item{The resource info table may not contain a \code{ResourceInfoTableEnd}
(see section \ref{resources-infotable}) and thus no check sum. The table
should be accepted nevertheless. Note that a table containing a wrong check
sum is {\em not} to be accepted.}
\end{itemize}
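
\noindent
As an illustration of the second item, the following C sketch counts the
index entries without relying on \code{rh\_resource\_count}. It assumes that
the resources are available as a host endian word array starting at the
beginning of the resources, and that an entry whose three words all show the
unused data pattern marks the end of the table. The function name and its
parameters are hypothetical; \code{tableOffset} is the byte offset of the
first \code{resource\_index\_entry}.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static const uint32_t kUnusedResourceDataPattern[3] = {
	0xffffffff, 0x000003e9, 0x00000000
};

/* Counts index entries until the unused data pattern shows through or
   the unknown section begins, whichever comes first.
   (Hypothetical helper, not part of any existing API.) */
static uint32_t
count_resource_index_entries(const uint32_t *resources,
	uint32_t tableOffset, uint32_t unknownSectionOffset)
{
	uint32_t word = tableOffset / 4;
	uint32_t endWord = unknownSectionOffset / 4;
	uint32_t count = 0;

	assert(resources != NULL);
	while (word + 3 <= endWord) {
		bool unused = true;
		uint32_t k;

		/* an entry spans three words: offset, size and padding */
		for (k = 0; k < 3; k++) {
			if (resources[word + k]
					!= kUnusedResourceDataPattern[(word + k) % 3]) {
				unused = false;
			}
		}
		if (unused)
			break;
		count++;
		word += 3;
	}
	return count;
}
```
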


\section{Status of this Document}
\label{status}

The information contained in this document was obtained by analyzing
resources-containing files created or modified by tools available for BeOS R5,
namely QuickRes and xres. It is incomplete and may even be partially wrong
where it is based on incorrect assumptions.

\noindent
A list of items with a low degree of reliability follows:
\begin{itemize}
\item{Resources alignment in ELF files: Several tests with linker object files
have shown that QuickRes aligns their resources offset to 32 bytes.
For executables on the other hand the alignment was always 4096, which is
the usual memory page size of current x86 architectures and therefore the
preferred program segment alignment. From these two observations it has been
deduced that the alignment is the maximum of the segment alignments found in
the program header table (if present), but at least 32.}
\item{The resources layout: The general layout of the resources is not very
well understood. The layout presented in figure \ref{fig:resources-layout}
resulted from the attempt to assign all the fields a reasonable meaning, but
in fact not even the exact length and meaning of the fields of the resources
header are clear. The same holds for the resource index section header.}
\item{The unknown section: The contents of the unknown section and of unknown
fields are based on educated guesses.}
\end{itemize}

\end{document}