1 \input texinfo @c -*- Texinfo -*-
2 @setfilename ctf-spec.info
3 @settitle The CTF File Format
5 @xrefautomaticsectiontitle on
12 Copyright @copyright{} 2021-2024 Free Software Foundation, Inc.
14 Permission is granted to copy, distribute and/or modify this document
15 under the terms of the GNU General Public License, Version 3 or any
16 later version published by the Free Software Foundation. A copy of the
17 license is included in the section entitled ``GNU General Public
22 @dircategory Software development
24 * CTF: (ctf-spec). The CTF file format.
28 @title The CTF File Format
33 @vskip 0pt plus 1filll
40 @top The CTF file format
42 This manual describes version 3 of the CTF file format, which is
43 intended to model the C type system in a fashion that C programs can
51 The CTF file format compactly describes C types and the association
52 between function and data symbols and types: if embedded in ELF objects,
53 it can exploit the ELF string table to reduce duplication further.
54 There is no real concept of namespacing: only top-level types are
55 described, not types scoped to within single functions.
57 CTF dictionaries can be @dfn{children} of other dictionaries, in a
58 one-level hierarchy: child dictionaries can refer to types in the
59 parent, but the opposite is not sensible (since if you refer to a child
60 type in the parent, the actual type you cited would vary depending on
61 what child was attached). This parent/child definition is recorded in
62 the child, but only as a recommendation: users of the API have to attach
63 parents to children explicitly, and can choose to attach a child to any
64 parent they like, or to none, though doing so might lead to unpleasant
65 consequences like dangling references to types. @xref{Type indexes and
66 type IDs}. Type lookups in child dicts that are not associated with a
67 parent at all will fail with @code{ECTF_NOPARENT} if a parent type was
70 The associated API to generate, merge together, and query this file
71 format will be described in the accompanying @code{libctf} manual once
72 it is written. There is no API to modify dictionaries once they've been
73 written out: CTF is a write-once file format. (However, it is always
74 possible to dynamically create a new child dictionary on the fly and
75 attach it to a pre-existing, read-only parent.)
77 There are two major pieces to CTF: the @dfn{archive} and the
78 @dfn{dictionary}. Some relatives and ancestors of CTF call dictionaries
79 @dfn{containers}: the archive format is unique to this variant of CTF.
80 (Much of the source code still uses the old term.)
The archive file format is a very simple mmappable archive used to group
multiple dictionaries together: it is expected to slowly go
84 away and be replaced by other mechanisms, but right now it is an
85 important part of the file format, used to group dictionaries containing
86 types with conflicting definitions in different TUs with the overarching
87 dictionary used to store all other types. (Even when archives go away,
the @code{libctf} API used to access them will remain, and will access the
other mechanisms that replace them instead.)
91 The CTF dictionary consists of a @dfn{preamble}, which does not vary
92 between versions of the CTF file format, and a @dfn{header} and some
93 number of @dfn{sections}, which can vary between versions.
95 The rest of this specification describes the format of these sections,
96 first for the latest version of CTF, then for all earlier versions
97 supported by @code{libctf}: the earlier versions are defined in terms of
98 their differences from the next later one. We describe each part of the
99 format first by reproducing the C structure which defines that part,
100 then describing it at greater length in terms of file offsets.
102 The description of the file format ends with a description of relevant
103 limits that apply to it. These limits can vary between file format
106 This document is quite young, so for now the C code in @file{ctf.h}
107 should be presumed correct when this document conflicts with it.
110 @chapter CTF archives
111 @cindex archive, CTF archive
113 The CTF archive format maps names to CTF dictionaries. The names may
114 contain any character other than \0, but for now archives containing
115 slashes in the names may not extract correctly. It is possible to
116 insert multiple members with the same name, but these are quite hard to
117 access reliably (you have to iterate through all the members rather than
118 opening by name) so this is not recommended.
120 CTF archives are not themselves compressed: the constituent components,
CTF dictionaries, can be compressed (@pxref{CTF header}).
123 CTF archives usually contain a collection of related dictionaries, one
124 parent and many children of that parent. CTF archives can have a member
125 with a @dfn{default name}, @code{.ctf} (which can be represented as
126 @code{NULL} in the API). If present, this member is usually the parent
127 of all the children, but it is possible for CTF producers to emit
128 parents with different names if they wish (usually for backward-
129 compatibility purposes).
131 @code{.ctf} sections in ELF objects consist of a single CTF dictionary
132 rather than an archive of dictionaries if and only if the section
133 contains no types with identical names but conflicting definitions: if
134 two conflicting definitions exist, the deduplicator will place the type
135 most commonly referred to by other types in the parent and will place
136 the other type in a child named after the translation unit it is found
137 in, and will emit a CTF archive containing both dictionaries instead of
138 a raw dictionary. All types that refer to such conflicting types are
139 also placed in the per-translation-unit child.
141 The definition of an archive in @file{ctf.h} is as follows:
148 uint64_t ctfa_nfiles;
153 typedef struct ctf_archive_modent
155 uint64_t name_offset;
157 } ctf_archive_modent_t;
160 (Note one irregularity here: the @code{ctf_archive_t} is not a typedef
161 to @code{struct ctf_archive}, but a different typedef, private to
162 @code{libctf}, so that things that are not really archives can be made
163 to appear as if they were.)
165 All the above items are always in little-endian byte order, regardless
166 of the machine endianness.
168 The archive header has the following fields:
170 @tindex struct ctf_archive
171 @multitable {Offset} {@code{uint64_t ctfa_nfiles}} {The data model for this archive: an arbitrary integer}
172 @headitem Offset @tab Name @tab Description
174 @tab @code{uint64_t ctfa_magic}
176 @vindex struct ctf_archive, ctfa_magic
177 @tab The magic number for archives, @code{CTFA_MAGIC}: 0x8b47f2a4d7623eeb.
181 @tab @code{uint64_t ctfa_model}
183 @vindex struct ctf_archive, ctfa_model
184 @tab The data model for this archive: an arbitrary integer that serves no
185 purpose but to be handed back by the libctf API. @xref{Data models}.
188 @tab @code{uint64_t ctfa_nfiles}
190 @vindex struct ctf_archive, ctfa_nfiles
191 @tab The number of CTF dictionaries in this archive.
194 @tab @code{uint64_t ctfa_names}
196 @vindex struct ctf_archive, ctfa_names
197 @tab Offset of the name table, in bytes from the start of the archive.
The name table is an array of @code{ctf_archive_modent_t[ctfa_nfiles]}.
201 @tab @code{uint64_t ctfa_ctfs}
203 @vindex struct ctf_archive, ctfa_ctfs
204 @tab Offset of the CTF table. Each element starts with a @code{uint64_t} size,
205 followed by a CTF dictionary.
209 The array pointed to by @code{ctfa_names} is an array of entries of
210 @code{ctf_archive_modent}:
212 @tindex struct ctf_archive_modent
213 @tindex ctf_archive_modent_t
214 @multitable {Offset} {@code{uint64_t name_offset}} {Offset of this name, in bytes from the start}
215 @headitem Offset @tab Name @tab Description
217 @tab @code{uint64_t name_offset}
219 @vindex struct ctf_archive_modent, name_offset
220 @vindex ctf_archive_modent_t, name_offset
221 @tab Offset of this name, in bytes from the start of the archive.
224 @tab @code{uint64_t ctf_offset}
226 @vindex struct ctf_archive_modent, ctf_offset
227 @vindex ctf_archive_modent_t, ctf_offset
228 @tab Offset of this CTF dictionary, in bytes from the start of the archive.
232 The @code{ctfa_names} array is sorted into ASCIIbetical order by name
233 (i.e. by the result of dereferencing the @code{name_offset}).
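
Because the array is sorted by name, a member can be found by binary
search. The following is a minimal sketch of that idea, not the
@code{libctf} implementation: @code{struct sketch_modent} and
@code{sketch_find_member} are hypothetical names, the entries are assumed
to have already been converted from little-endian to host byte order, and
@code{base} is assumed to point at whatever the name offsets are relative
to (the start of the archive, according to the table above).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Host-order copy of one name table entry (hypothetical type).  */
struct sketch_modent
{
  uint64_t name_offset;
  uint64_t ctf_offset;
};

/* Binary-search the N entries in ENTS, which must be sorted by the name
   each name_offset refers to.  Return the index of the entry called
   NAME, or -1 if there is none.  */
long sketch_find_member(const char *base, const struct sketch_modent *ents,
                        size_t n, const char *name)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      int cmp = strcmp(name, base + ents[mid].name_offset);

      if (cmp == 0)
        return (long) mid;
      if (cmp < 0)
        hi = mid;
      else
        lo = mid + 1;
    }
  return -1;
}
```

If several members share a name, this finds only one of them, which is
one reason duplicate names are hard to access reliably.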
235 The archive file also contains a name table and a table of CTF
236 dictionaries: these are pointed to by the structures above. The name
237 table is a simple strtab which is not required to be sorted; the
238 dictionary array is described above in the entry for @code{ctfa_ctfs}.
240 The relative order of these various parts is not defined, except that
241 the header naturally always comes first.
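
As an illustration of the archive header layout, the sketch below decodes
the five header fields from a byte buffer. It is not taken from
@code{libctf}: the names @code{sketch_le64}, @code{struct
sketch_archive_hdr} and @code{sketch_read_archive_hdr} are hypothetical,
and the five @code{uint64_t} fields are assumed to be laid out
consecutively with no padding, in the order in which they are listed
above.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CTFA_MAGIC 0x8b47f2a4d7623eebULL /* CTFA_MAGIC, as above.  */
#define SKETCH_CTFA_HDR_BYTES 40u  /* Five uint64_t fields, assumed unpadded.  */

/* Host-order copy of the archive header (hypothetical type).  */
struct sketch_archive_hdr
{
  uint64_t magic, model, nfiles, names, ctfs;
};

/* Decode a little-endian uint64_t from eight bytes, whatever the byte
   order of the host.  */
uint64_t sketch_le64(const unsigned char *p)
{
  uint64_t v = 0;

  for (int i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Fill in OUT from the LEN bytes at BUF.  Return 0 on success, or -1 if
   the buffer is too short, the magic number is wrong, or a table offset
   points outside the buffer.  */
int sketch_read_archive_hdr(const unsigned char *buf, size_t len,
                            struct sketch_archive_hdr *out)
{
  if (len < SKETCH_CTFA_HDR_BYTES)
    return -1;

  out->magic = sketch_le64(buf);
  out->model = sketch_le64(buf + 8);
  out->nfiles = sketch_le64(buf + 16);
  out->names = sketch_le64(buf + 24);
  out->ctfs = sketch_le64(buf + 32);

  if (out->magic != SKETCH_CTFA_MAGIC)
    return -1;
  if (out->names > len || out->ctfs > len)
    return -1;
  return 0;
}
```

Decoding byte by byte, rather than casting the buffer to a structure,
keeps the sketch independent of host endianness and alignment.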
243 @node CTF dictionaries
244 @chapter CTF dictionaries
245 @cindex dictionary, CTF dictionary
CTF dictionaries consist of a header, starting with a preamble, and a
251 @section CTF Preamble
253 The preamble is the only part of the CTF dictionary whose format cannot
254 vary between versions. It is never compressed. It is correspondingly
258 typedef struct ctf_preamble
260 unsigned short ctp_magic;
261 unsigned char ctp_version;
262 unsigned char ctp_flags;
266 @code{#define}s are provided under the names @code{cth_magic},
267 @code{cth_version} and @code{cth_flags} to make the fields of the
268 @code{ctf_preamble_t} appear to be part of the @code{ctf_header_t}, so
269 consuming programs rarely need to consider the existence of the preamble
270 as a separate structure.
272 @tindex struct ctf_preamble
273 @tindex ctf_preamble_t
274 @multitable {Offset} {@code{unsigned char ctp_version}} {The magic number for CTF dictionaries}
275 @headitem Offset @tab Name @tab Description
277 @tab @code{unsigned short ctp_magic}
280 @vindex ctf_preamble_t, ctp_magic
281 @vindex struct ctf_preamble, ctp_magic
282 @vindex ctf_header_t, cth_magic
283 @vindex struct ctf_header, cth_magic
284 @tab The magic number for CTF dictionaries, @code{CTF_MAGIC}: 0xdff2.
@tab @code{unsigned char ctp_version}
291 @vindex ctf_preamble_t, ctp_version
292 @vindex struct ctf_preamble, ctp_version
293 @vindex ctf_header_t, cth_version
294 @vindex struct ctf_header, cth_version
295 @tab The version number of this CTF dictionary.
@tab @code{unsigned char ctp_flags}
301 @vindex ctf_preamble_t, ctp_flags
302 @vindex struct ctf_preamble, ctp_flags
303 @vindex ctf_header_t, cth_flags
304 @vindex struct ctf_header, cth_flags
305 @tab Flags for this CTF file. @xref{CTF file-wide flags}.
309 Every element of a dictionary must be naturally aligned unless otherwise
310 specified. (This restriction will be lifted in later versions.)
313 CTF dictionaries are stored in the native endianness of the system that
314 generates them: the consumer (e.g., @code{libctf}) can detect whether to
315 endian-flip a CTF dictionary by inspecting the @code{ctp_magic}. (If it
316 appears as 0xf2df, endian-flipping is needed.)
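
The magic-number check can be sketched as follows. This is only an
illustration, not the code @code{libctf} uses: @code{sketch_swap16} and
@code{sketch_magic_state} are hypothetical names, and the caller is
assumed to have loaded the first two bytes of the dictionary into a
@code{uint16_t} in host byte order.

```c
#include <stdint.h>

#define SKETCH_CTF_MAGIC 0xdff2u  /* Value of CTF_MAGIC given above.  */

/* Swap the two bytes of a 16-bit value.  */
uint16_t sketch_swap16(uint16_t v)
{
  return (uint16_t) ((v << 8) | (v >> 8));
}

/* Classify a ctp_magic value read in host byte order: 0 means the
   dictionary is in native endianness, 1 means it needs endian-flipping,
   -1 means it is not a CTF dictionary at all.  */
int sketch_magic_state(uint16_t magic)
{
  if (magic == SKETCH_CTF_MAGIC)
    return 0;
  if (sketch_swap16(magic) == SKETCH_CTF_MAGIC)
    return 1;
  return -1;
}
```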
318 The version of the CTF dictionary can be determined by inspecting
319 @code{ctp_version}. The following versions are currently valid, and
320 @code{libctf} can read all of them:
322 @tindex CTF_VERSION_3
323 @cindex CTF versions, versions
324 @multitable {@code{CTF_VERSION_1_UPGRADED_3}} {Number} {First version, rare. Very similar to Solaris CTF.}
325 @headitem Version @tab Number @tab Description
326 @item @code{CTF_VERSION_1}
327 @tab 1 @tab First version, rare. Very similar to Solaris CTF.
329 @item @code{CTF_VERSION_1_UPGRADED_3}
330 @tab 2 @tab First version, upgraded to v3 or higher and written out again.
331 Name may change. Very rare.
333 @item @code{CTF_VERSION_2}
334 @tab 3 @tab Second version, with many range limits lifted.
336 @item @code{CTF_VERSION_3}
337 @tab 4 @tab Third and current version, documented here.
340 This section documents @code{CTF_VERSION_3}.
343 @node CTF file-wide flags
344 @subsection CTF file-wide flags
346 The preamble contains bitflags in its @code{ctp_flags} field that
347 describe various file-wide properties. Some of the flags are valid only
348 for particular file-format versions, which means the flags can be used
349 to fix file-format bugs. Consumers that see unknown flags should
350 accordingly assume that the dictionary is not comprehensible, and
353 The following flags are currently defined. Many are bug workarounds,
354 valid only in CTFv3, and will not be valid in any future versions: the
355 same values may be reused for other flags in v4+.
357 @multitable {@code{CTF_F_NEWFUNCINFO}} {Versions} {Value} {The external strtab is in @code{.dynstr} and the}
358 @headitem Flag @tab Versions @tab Value @tab Meaning
359 @tindex CTF_F_COMPRESS
360 @item @code{CTF_F_COMPRESS} @tab All @tab 0x1 @tab Compressed with zlib
361 @tindex CTF_F_NEWFUNCINFO
362 @item @code{CTF_F_NEWFUNCINFO} @tab 3 only @tab 0x2
363 @tab ``New-format'' func info section.
364 @tindex CTF_F_IDXSORTED
365 @item @code{CTF_F_IDXSORTED} @tab 3+ @tab 0x4 @tab The index section is
368 @item @code{CTF_F_DYNSTR} @tab 3 only @tab 0x8 @tab The external strtab is
369 in @code{.dynstr} and the symtab used is @code{.dynsym}.
@xref{The string section}.
373 @code{CTF_F_NEWFUNCINFO} and @code{CTF_F_IDXSORTED} relate to the
374 function info and data object sections. @xref{The symtypetab sections}.
Further flags (and further compression methods) will be added in future.
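
A consumer might test the flags along the following lines. This is a
sketch under stated assumptions rather than @code{libctf} code: the
@code{SKETCH_F_*} macros simply repeat the values from the table above
under hypothetical names, and the set of acceptable flags is assumed to
be exactly the four flags defined for CTFv3.

```c
#include <stdint.h>

/* Flag values from the table above, redefined locally for this sketch.  */
#define SKETCH_F_COMPRESS    0x1u
#define SKETCH_F_NEWFUNCINFO 0x2u
#define SKETCH_F_IDXSORTED   0x4u
#define SKETCH_F_DYNSTR      0x8u

#define SKETCH_F_KNOWN_V3                                   \
  (SKETCH_F_COMPRESS | SKETCH_F_NEWFUNCINFO                 \
   | SKETCH_F_IDXSORTED | SKETCH_F_DYNSTR)

/* Return 1 if every bit set in CTP_FLAGS is a flag this consumer knows
   about, 0 if any unknown flag is present.  */
int sketch_flags_understood(uint8_t ctp_flags)
{
  return (ctp_flags & ~SKETCH_F_KNOWN_V3) == 0;
}

/* Return 1 if the dictionary body is zlib-compressed.  */
int sketch_is_compressed(uint8_t ctp_flags)
{
  return (ctp_flags & SKETCH_F_COMPRESS) != 0;
}
```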
381 @cindex Sections, header
383 The CTF header is the first part of a CTF dictionary, including the
384 preamble. All parts of it other than the preamble (@pxref{CTF Preamble})
385 can vary between CTF file versions and are never compressed. It
386 contains things that apply to the dictionary as a whole, and a table of
387 the sections into which the rest of the dictionary is divided. The
388 sections tile the file: each section runs from the offset given until
389 the start of the next section. Only the last section cannot follow this
390 rule, so the header has a length for it instead.
392 All section offsets, here and in the rest of the CTF file, are relative to the
393 @emph{end} of the header. (This is annoyingly different to how offsets in CTF
394 archives are handled.)
396 This is the first structure to include offsets into the string table, which are
397 not straight references because CTF dictionaries can include references into the
398 ELF string table to save space, as well as into the string table internal to the
CTF dictionary. @xref{The string section}, for more on these. Offset 0 is
400 always the null string.
403 typedef struct ctf_header
405 ctf_preamble_t cth_preamble;
406 uint32_t cth_parlabel;
407 uint32_t cth_parname;
410 uint32_t cth_objtoff;
411 uint32_t cth_funcoff;
412 uint32_t cth_objtidxoff;
413 uint32_t cth_funcidxoff;
415 uint32_t cth_typeoff;
423 @tindex struct ctf_header
425 @multitable {Offset} {@code{ctf_preamble_t cth_preamble}} {The parent label, if deduplication happened against}
426 @headitem Offset @tab Name @tab Description
428 @tab @code{ctf_preamble_t cth_preamble}
430 @vindex struct ctf_header, cth_preamble
431 @vindex ctf_header_t, cth_preamble
@tab The preamble (conceptually embedded in the header). @xref{CTF Preamble}.
435 @tab @code{uint32_t cth_parlabel}
437 @vindex struct ctf_header, cth_parlabel
438 @vindex ctf_header_t, cth_parlabel
439 @tab The parent label, if deduplication happened against a specific label: a
440 strtab offset. @xref{The label section}. Currently unused and always 0, but may
441 be used in future when semantics are attached to the label section.
444 @tab @code{uint32_t cth_parname}
446 @vindex struct ctf_header, cth_parname
447 @vindex ctf_header_t, cth_parname
448 @tab The name of the parent dictionary deduplicated against: a strtab offset.
449 Interpretation is up to the consumer (usually a CTF archive member name). 0
450 (the null string) if this is not a child dictionary.
453 @tab @code{uint32_t cth_cuname}
455 @vindex struct ctf_header, cth_cuname
456 @vindex ctf_header_t, cth_cuname
@tab The name of the compilation unit, for consumers like GDB that want to
know the name of the CU associated with a single-CU dictionary: a strtab
offset. 0 if this dictionary describes types from many CUs.
462 @tab @code{uint32_t cth_lbloff}
464 @vindex struct ctf_header, cth_lbloff
465 @vindex ctf_header_t, cth_lbloff
466 @tab The offset of the label section, which tiles the type space into
467 named regions. @xref{The label section}.
470 @tab @code{uint32_t cth_objtoff}
472 @vindex struct ctf_header, cth_objtoff
473 @vindex ctf_header_t, cth_objtoff
474 @tab The offset of the data object symtypetab section, which maps ELF data symbols to
475 types. @xref{The symtypetab sections}.
478 @tab @code{uint32_t cth_funcoff}
480 @vindex struct ctf_header, cth_funcoff
481 @vindex ctf_header_t, cth_funcoff
482 @tab The offset of the function info symtypetab section, which maps ELF function
483 symbols to a return type and arg types. @xref{The symtypetab sections}.
486 @tab @code{uint32_t cth_objtidxoff}
487 @vindex cth_objtidxoff
488 @vindex struct ctf_header, cth_objtidxoff
489 @vindex ctf_header_t, cth_objtidxoff
490 @tab The offset of the object index section, which maps ELF object symbols to
491 entries in the data object section. @xref{The symtypetab sections}.
494 @tab @code{uint32_t cth_funcidxoff}
495 @vindex cth_funcidxoff
496 @vindex struct ctf_header, cth_funcidxoff
497 @vindex ctf_header_t, cth_funcidxoff
498 @tab The offset of the function info index section, which maps ELF function
499 symbols to entries in the function info section. @xref{The symtypetab sections}.
502 @tab @code{uint32_t cth_varoff}
504 @vindex struct ctf_header, cth_varoff
505 @vindex ctf_header_t, cth_varoff
506 @tab The offset of the variable section, which maps string names to types.
507 @xref{The variable section}.
510 @tab @code{uint32_t cth_typeoff}
512 @vindex struct ctf_header, cth_typeoff
513 @vindex ctf_header_t, cth_typeoff
514 @tab The offset of the type section, the core of CTF, which describes types
515 using variable-length array elements. @xref{The type section}.
518 @tab @code{uint32_t cth_stroff}
520 @vindex struct ctf_header, cth_stroff
521 @vindex ctf_header_t, cth_stroff
522 @tab The offset of the string section. @xref{The string section}.
525 @tab @code{uint32_t cth_strlen}
527 @vindex struct ctf_header, cth_strlen
528 @vindex ctf_header_t, cth_strlen
529 @tab The length of the string section (not an offset!). The CTF file ends
534 Everything from this point on (until the end of the file at @code{cth_stroff} +
535 @code{cth_strlen}) is compressed with zlib if @code{CTF_F_COMPRESS} is set in
536 the preamble's @code{ctp_flags}.
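
The tiling rule described at the start of this section means that section
lengths are derived rather than stored. The sketch below illustrates
this; it is not @code{libctf} code. The names @code{sketch_section_len}
and @code{sketch_file_offset} are hypothetical, and the eight section
offsets (@code{cth_lbloff} through @code{cth_stroff}) are assumed to have
been copied into an array in header order, which is assumed to be
ascending.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of section offsets in the header, cth_lbloff to cth_stroff.  */
#define SKETCH_NSECTIONS 8

/* Return the length in bytes of section I, given the section offsets in
   header order and the value of cth_strlen.  Return 0 if I is out of
   range.  */
uint32_t sketch_section_len(const uint32_t offs[SKETCH_NSECTIONS],
                            uint32_t cth_strlen, size_t i)
{
  if (i >= SKETCH_NSECTIONS)
    return 0;
  if (i == SKETCH_NSECTIONS - 1)
    return cth_strlen;             /* Only the last section has a length.  */
  return offs[i + 1] - offs[i];    /* The others run up to the next one.  */
}

/* Convert a section offset, which is relative to the end of the header,
   into an offset from the start of the (uncompressed) dictionary.  */
size_t sketch_file_offset(size_t header_size, uint32_t section_off)
{
  return header_size + section_off;
}
```

A section whose offset equals that of the next section is simply empty.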
538 @node The type section
539 @section The type section
541 @cindex Sections, type
543 This section is the most important section in CTF, describing all the top-level
544 types in the program. It consists of an array of type structures, each of which
545 describes a type of some @dfn{kind}: each kind of type has some amount of
546 variable-length data associated with it (some kinds have none). The amount of
547 variable-length data associated with a given type can be determined by
548 inspecting the type, so the reading code can walk through the types in sequence
551 Each type structure is one of a set of overlapping structures in a discriminated
552 union of sorts: the variable-length data for each type immediately follows the
553 type's type structure. Here's the largest of the overlapping structures, which
554 is only needed for huge types and so is very rarely seen:
557 typedef struct ctf_type
567 uint32_t ctt_lsizehi;
568 uint32_t ctt_lsizelo;
572 Here's the much more common smaller form:
575 typedef struct ctf_stype
588 If @code{ctt_size} is the #define @code{CTF_LSIZE_SENT}, 0xffffffff, this type
589 is described by a @code{ctf_type_t}: otherwise, a @code{ctf_stype_t}.
590 @tindex CTF_LSIZE_SENT
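
A reader walking the type section therefore has to look at
@code{ctt_size} before it knows how far to advance. A minimal sketch of
that decision follows. The name @code{sketch_type_header_bytes} is
hypothetical, and the byte counts assume that @code{ctf_stype_t} consists
of three @code{uint32_t} fields and that @code{ctf_type_t} adds
@code{ctt_lsizehi} and @code{ctt_lsizelo} to them, with no padding.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LSIZE_SENT 0xffffffffu /* Value of CTF_LSIZE_SENT.  */

/* Assumed sizes: three uint32_t fields in the small form, five in the
   large one.  */
#define SKETCH_STYPE_BYTES 12u
#define SKETCH_TYPE_BYTES  20u

/* Return the number of bytes occupied by the type structure whose
   ctt_size field is CTT_SIZE, not counting its variable-length data.  */
size_t sketch_type_header_bytes(uint32_t ctt_size)
{
  return ctt_size == SKETCH_LSIZE_SENT ? SKETCH_TYPE_BYTES
                                       : SKETCH_STYPE_BYTES;
}
```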
592 Here's what the fields mean:
594 @tindex struct ctf_type
595 @tindex struct ctf_stype
598 @multitable {0x1c (@code{ctf_type_t}} {@code{uint32_t ctt_lsizehi}} {The size of this type, if this type is of a kind for}
599 @headitem Offset @tab Name @tab Description
601 @tab @code{uint32_t ctt_name}
603 @tab Strtab offset of the type name, if any (0 if none).
606 @tab @code{uint32_t ctt_info}
608 @vindex struct ctf_type, ctt_info
609 @vindex ctf_type_t, ctt_info
610 @vindex struct ctf_stype, ctt_info
611 @vindex ctf_stype_t, ctt_info
612 @tab The @dfn{info word}, containing information on the kind of this type, its
variable-length data and whether it is visible to name lookup. @xref{The
617 @tab @code{uint32_t ctt_size}
619 @vindex struct ctf_type, ctt_size
620 @vindex ctf_type_t, ctt_size
621 @vindex struct ctf_stype, ctt_size
622 @vindex ctf_stype_t, ctt_size
623 @tab The size of this type, if this type is of a kind for which a size needs
624 to be recorded (constant-size types don't need one). If this is
625 @code{CTF_LSIZE_SENT}, this type is a huge type described by @code{ctf_type_t}.
628 @tab @code{uint32_t ctt_type}
630 @vindex struct ctf_stype, ctt_type
631 @vindex ctf_stype_t, ctt_type
632 @tab The type this type refers to, if this type is of a kind which refers to
633 other types (like a pointer). All such types are fixed-size, and no types that
634 are variable-size refer to other types, so @code{ctt_size} and @code{ctt_type}
635 overlap. All type kinds that use @code{ctt_type} are described by
636 @code{ctf_stype_t}, not @code{ctf_type_t}. @xref{Type indexes and type IDs}.
638 @item 0x0c (@code{ctf_type_t} only)
639 @tab @code{uint32_t ctt_lsizehi}
641 @vindex struct ctf_type, ctt_lsizehi
642 @vindex ctf_type_t, ctt_lsizehi
643 @tab The high 32 bits of the size of a very large type. The @code{CTF_TYPE_LSIZE} macro
644 can be used to get a 64-bit size out of this field and the next one.
645 @code{CTF_SIZE_TO_LSIZE_HI} splits the @code{ctt_lsizehi} out of it again.
646 @findex CTF_TYPE_LSIZE
647 @findex CTF_SIZE_TO_LSIZE_HI
649 @item 0x10 (@code{ctf_type_t} only)
650 @tab @code{uint32_t ctt_lsizelo}
652 @vindex struct ctf_type, ctt_lsizelo
653 @vindex ctf_type_t, ctt_lsizelo
654 @tab The low 32 bits of the size of a very large type.
655 @code{CTF_SIZE_TO_LSIZE_LO} splits the @code{ctt_lsizelo} out of a 64-bit size.
656 @findex CTF_SIZE_TO_LSIZE_LO
659 Two aspects of this need further explanation: the info word, and what exactly a
660 type ID is and how you determine it. (Information on the various type-kind-
661 dependent things, like whether @code{ctt_size} or @code{ctt_type} is used,
662 is described in the section devoted to each kind.)
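
The relationship between @code{ctt_size}, @code{ctt_lsizehi} and
@code{ctt_lsizelo} can be sketched like this. The functions below are
hypothetical stand-ins written for illustration: they are meant to mirror
what the text says @code{CTF_TYPE_LSIZE}, @code{CTF_SIZE_TO_LSIZE_HI} and
@code{CTF_SIZE_TO_LSIZE_LO} do, and are not copied from @file{ctf.h}.

```c
#include <stdint.h>

#define SKETCH_LSIZE_SENT 0xffffffffu /* Value of CTF_LSIZE_SENT.  */

/* Join the two 32-bit halves into one 64-bit size.  */
uint64_t sketch_type_lsize(uint32_t lsizehi, uint32_t lsizelo)
{
  return ((uint64_t) lsizehi << 32) | lsizelo;
}

/* Split a 64-bit size into its high and low halves again.  */
uint32_t sketch_lsize_hi(uint64_t size)
{
  return (uint32_t) (size >> 32);
}

uint32_t sketch_lsize_lo(uint64_t size)
{
  return (uint32_t) (size & 0xffffffffu);
}

/* Return the size of a type: taken from ctt_size for the small form,
   and from the two lsize fields when ctt_size holds the sentinel.  */
uint64_t sketch_type_size(uint32_t ctt_size, uint32_t lsizehi,
                          uint32_t lsizelo)
{
  if (ctt_size == SKETCH_LSIZE_SENT)
    return sketch_type_lsize(lsizehi, lsizelo);
  return ctt_size;
}
```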
665 @subsection The info word, ctt_info
667 The info word is a bitfield split into three parts. From MSB to LSB:
669 @multitable {Bit offset} {@code{isroot}} {Length of variable-length data for this type (some kinds only).}
670 @headitem Bit offset @tab Name @tab Description
673 @tab Type kind: @pxref{Type kinds}.
677 @tab 1 if this type is visible to name lookup
681 @tab Length of variable-length data for this type (some kinds only).
682 The variable-length data directly follows the @code{ctf_type_t} or
683 @code{ctf_stype_t}. This is a kind-dependent array length value,
684 not a length in bytes. Some kinds have no variable-length data, or
685 fixed-size variable-length data, and do not use this value.
688 The most mysterious of these is undoubtedly @code{isroot}. This indicates
689 whether types with names (nonzero @code{ctt_name}) are visible to name lookup:
690 if zero, this type is considered a @dfn{non-root type} and you can't look it up
691 by name at all. Multiple types with the same name in the same C namespace
692 (struct, union, enum, other) can exist in a single dictionary, but only one of
693 them may have a nonzero value for @code{isroot}. @code{libctf} validates this
694 at open time and refuses to open dictionaries that violate this constraint.
696 Historically, this feature was introduced for the encoding of bitfields
697 (@pxref{Integer types}): for instance, int bitfields will all be named
698 @code{int} with different widths or offsets, but only the full-width one at
699 offset zero is wanted when you look up the type named @code{int}. With the
700 introduction of slices (@pxref{Slices}) as a more general bitfield encoding
701 mechanism, this is less important, but we still use non-root types to handle
702 conflicts if the linker API is used to fuse multiple translation units into one
703 dictionary and those translation units contain types with the same name and
704 conflicting definitions. (We do not discuss this further here, because the
705 linker never does this: only specialized type mergers do, like that used for the
706 Linux kernel. The libctf documentation will describe this in more detail.)
707 @c XXX update when libctf docs are written.
709 The @code{CTF_TYPE_INFO} macro can be used to compose an info word from
710 a @code{kind}, @code{isroot}, and @code{vlen}; @code{CTF_V2_INFO_KIND},
711 @code{CTF_V2_INFO_ISROOT} and @code{CTF_V2_INFO_VLEN} pick it apart again.
712 @findex CTF_TYPE_INFO
713 @findex CTF_V2_INFO_KIND
714 @findex CTF_V2_INFO_ISROOT
715 @findex CTF_V2_INFO_VLEN
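
To make the packing concrete, here is a sketch of composing and
decomposing an info word. It is an illustration only: the function names
are hypothetical, and the layout is an assumption, namely the kind in
bits 31--26, @code{isroot} in bit 25, and @code{vlen} in the low bits
under a mask of 0xffffff. Check these values against @file{ctf.h} before
relying on them.

```c
#include <stdint.h>

#define SKETCH_MAX_VLEN 0xffffffu /* Assumed vlen mask; see ctf.h.  */

/* Pack a kind, a root-visibility flag and a vlen into an info word.  */
uint32_t sketch_type_info(uint32_t kind, int isroot, uint32_t vlen)
{
  return (kind << 26) | ((isroot ? 1u : 0u) << 25)
         | (vlen & SKETCH_MAX_VLEN);
}

/* Pick the three parts out of an info word again.  */
uint32_t sketch_info_kind(uint32_t info)
{
  return (info & 0xfc000000u) >> 26;
}

uint32_t sketch_info_isroot(uint32_t info)
{
  return (info & 0x02000000u) >> 25;
}

uint32_t sketch_info_vlen(uint32_t info)
{
  return info & SKETCH_MAX_VLEN;
}
```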
717 @node Type indexes and type IDs
718 @subsection Type indexes and type IDs
722 @cindex Type, indexes of
727 @cindex Type IDs, ranges
728 Types are referred to within the CTF file via @dfn{type IDs}. A type ID is a
number from 0 to @math{2^{32}-1}, from a space divided in half. Types @math{2^{31}-1}
and below are in the @dfn{parent range}: these IDs are used for dictionaries
that have not had any other dictionary @code{ctf_import}ed into them as a parent.
Both completely standalone dictionaries and parent dictionaries with children
hanging off them have types in this range. Types @math{2^{31}} and above are in
734 the @dfn{child range}: only types in child dictionaries are in this range.
736 These IDs appear in @code{ctf_type_t.ctt_type} (@pxref{The type section}), but
737 the types themselves have no visible ID: quite intentionally, because adding an
738 ID uses space, and every ID is different so they don't compress well. The IDs
739 are implicit: at open time, the consumer walks through the entire type section
740 and counts the types in the type section. The type section is an array of
741 variable-length elements, so each entry could be considered as having an index,
742 starting from 1. We count these indexes and associate each with its
743 corresponding @code{ctf_type_t} or @code{ctf_stype_t}.
745 Lookups of types with IDs in the parent space look in the parent dictionary if
746 this dictionary has one associated with it; lookups of types with IDs in the
747 child space error out if the dictionary does not have a parent, and otherwise
748 convert the ID into an index by shaving off the top bit and look up the index
751 These properties mean that the same dictionary can be used as a parent of child
752 dictionaries and can also be used directly with no children at all, but a
753 dictionary created as a child dictionary must always be associated with a parent
754 --- usually, the same parent --- because its references to its own types have
755 the high bit turned on and this is only flipped off again if this is a child
756 dictionary. (This is not a problem, because if you @emph{don't} associate the
757 child with a parent, any references within it to its parent types will fail, and
758 there are almost certain to be many such references, or why is it a child at
761 This does mean that consumers should keep a close eye on the distinction between
762 type IDs and type indexes: if you mix them up, everything will appear to work as
763 long as you're only using parent dictionaries or standalone dictionaries, but as
764 soon as you start using children, everything will fail horribly.
766 Type index zero, and type ID zero, are used to indicate that this type cannot be
767 represented in CTF as currently constituted: they are emitted by the compiler,
768 but all type chains that terminate in the unknown type are erased at link time
769 (structure fields that use them just vanish, etc). So you will probably never
770 see a use of type zero outside the symtypetab sections, where they serve as
771 sentinels of sorts, to indicate symbols with no associated type.
773 The macros @code{CTF_V2_TYPE_TO_INDEX} and @code{CTF_V2_INDEX_TO_TYPE} may help
774 in translation between types and indexes: @code{CTF_V2_TYPE_ISPARENT} and
775 @code{CTF_V2_TYPE_ISCHILD} can be used to tell whether a given ID is in the
776 parent or child range.
777 @findex CTF_V2_TYPE_TO_INDEX
778 @findex CTF_V2_INDEX_TO_TYPE
779 @findex CTF_V2_TYPE_ISPARENT
780 @findex CTF_V2_TYPE_ISCHILD
782 It is quite possible and indeed common for type IDs to point forward in the
783 dictionary, as well as backward.
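
The distinction between IDs and indexes can be sketched as follows. The
functions are hypothetical stand-ins for the @code{CTF_V2_TYPE_*} and
@code{CTF_V2_INDEX_TO_TYPE} macros, written from the description in this
section rather than copied from @file{ctf.h}; they assume that the top
bit of a 32-bit ID is what separates the child range from the parent
range.

```c
#include <stdint.h>

#define SKETCH_MAX_PTYPE 0x7fffffffu /* Highest ID in the parent range.  */

/* Tell which half of the ID space an ID lies in.  */
int sketch_type_isparent(uint32_t id)
{
  return id <= SKETCH_MAX_PTYPE;
}

int sketch_type_ischild(uint32_t id)
{
  return id > SKETCH_MAX_PTYPE;
}

/* Convert an ID into an index into the type section of the dictionary
   that holds it, by shaving off the top bit.  */
uint32_t sketch_type_to_index(uint32_t id)
{
  return id & SKETCH_MAX_PTYPE;
}

/* Convert an index back into an ID.  CHILD is nonzero if the index
   belongs to a child dictionary.  */
uint32_t sketch_index_to_type(uint32_t index, int child)
{
  return child ? (index | (SKETCH_MAX_PTYPE + 1u)) : index;
}
```

In a parent or standalone dictionary the two conversions are the
identity, which is why confusing IDs with indexes goes unnoticed until a
child dictionary is involved.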
786 @subsection Type kinds
788 @cindex Type, kinds of
790 Every type in CTF is of some @dfn{kind}. Each kind is some variety of C type:
791 all structures are a single kind, as are all unions, all pointers, all arrays,
792 all integers regardless of their bitfield width, etc. The kind of a type is
793 given in the @code{kind} field of the @code{ctt_info} word (@pxref{The info
796 The space of type kinds is only a quarter full so far, so there is plenty of
797 room for expansion. It is likely that in future versions of the file format,
798 types with smaller kinds will be more efficiently encoded than types with larger
799 kinds, so their numerical value will actually start to matter in future. (So
800 these IDs will probably change their numerical values in a later release of this
801 format, to move more frequently-used kinds like structures and cv-quals towards
802 the top of the space, and move rarely-used kinds like integers downwards. Yes,
803 integers are rare: how many kinds of @code{int} are there in a program? They're
804 just very frequently @emph{referenced}.)
806 Here's the set of kinds so far. Each kind has a @code{#define} associated with
809 @multitable {Kind} {@code{CTF_K_VOLATILE}} {Indicates a type that cannot be represented in CTF, or that} {@xref{Pointers typedefs and cvr-quals}}
810 @headitem Kind @tab Macro @tab Purpose
812 @tab @code{CTF_K_UNKNOWN}
813 @tab Indicates a type that cannot be represented in CTF, or that is being skipped.
814 It is very similar to type ID 0, except that you can have @emph{multiple}, distinct types
815 of kind @code{CTF_K_UNKNOWN}.
816 @tindex CTF_K_UNKNOWN
819 @tab @code{CTF_K_INTEGER}
820 @tab An integer type. @xref{Integer types}.
823 @tab @code{CTF_K_FLOAT}
824 @tab A floating-point type. @xref{Floating-point types}.
827 @tab @code{CTF_K_POINTER}
828 @tab A pointer. @xref{Pointers typedefs and cvr-quals}.
831 @tab @code{CTF_K_ARRAY}
832 @tab An array. @xref{Arrays}.
835 @tab @code{CTF_K_FUNCTION}
836 @tab A function pointer. @xref{Function pointers}.
839 @tab @code{CTF_K_STRUCT}
840 @tab A structure. @xref{Structs and unions}.
843 @tab @code{CTF_K_UNION}
844 @tab A union. @xref{Structs and unions}.
847 @tab @code{CTF_K_ENUM}
848 @tab An enumerated type. @xref{Enums}.
851 @tab @code{CTF_K_FORWARD}
@tab A forward declaration. @xref{Forward declarations}.
855 @tab @code{CTF_K_TYPEDEF}
856 @tab A typedef. @xref{Pointers typedefs and cvr-quals}.
859 @tab @code{CTF_K_VOLATILE}
860 @tab A volatile-qualified type. @xref{Pointers typedefs and cvr-quals}.
863 @tab @code{CTF_K_CONST}
864 @tab A const-qualified type. @xref{Pointers typedefs and cvr-quals}.
867 @tab @code{CTF_K_RESTRICT}
868 @tab A restrict-qualified type. @xref{Pointers typedefs and cvr-quals}.
871 @tab @code{CTF_K_SLICE}
872 @tab A slice, a change of the bit-width or offset of some other type. @xref{Slices}.
875 Now we cover all type kinds in turn. Some are more complicated than others.
878 @subsection Integer types
879 @cindex Integer types
880 @cindex Types, integer
888 @tindex unsigned long
889 @tindex unsigned long long
890 @tindex unsigned short
891 @tindex unsigned char
894 @tindex signed long long
897 @cindex CTF_K_INTEGER
899 Integral types are all represented as types of kind @code{CTF_K_INTEGER}. These
900 types fill out @code{ctt_size} in the @code{ctf_stype_t} with the size in bytes
901 of the integral type in question. They are always represented by
902 @code{ctf_stype_t}, never @code{ctf_type_t}. Their variable-length data is one
903 @code{uint32_t} in length: @code{vlen} in the info word should be disregarded
906 The variable-length data for integers has multiple items packed into it much
907 like the info word does.
909 @multitable {Bit offset} {Encoding} {The integer encoding and desired display representation.}
910 @headitem Bit offset @tab Name @tab Description
913 @tab The desired display representation of this integer. You can extract this
914 field with the @code{CTF_INT_ENCODING} macro. See below.
915 @findex CTF_INT_ENCODING
919 @tab The offset of this integral type in bits from the start of its enclosing
920 structure field, adjusted for endianness: @pxref{Structs and unions}. You can
921 extract this field with the @code{CTF_INT_OFFSET} macro.
922 @findex CTF_INT_OFFSET
926 @tab The width of this integral type in bits. You can extract this field with
927 the @code{CTF_INT_BITS} macro.
931 If you choose, bitfields can be represented using the fields above as a sort of
932 integral type with the @code{isroot} bit flipped off and the offset and bits
933 values set in the vlen word: you can populate it with the @code{CTF_INT_DATA}
934 macro. (But it may be more convenient to represent them using slices of a
935 full-width integer: @pxref{Slices}.)
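
The packing described above can be sketched as follows. This is an
illustration only: the @code{sketch_int_*} names are hypothetical, and the bit
positions (encoding in the top 8 bits, offset in the next 8, width in the low
16) are our reading of the @code{CTF_INT_*} macros in @code{ctf.h}, which
remains authoritative.

```c
#include <stdint.h>

/* Assumed layout of the integer vlen word (check against ctf.h):
   bits 24-31: encoding, bits 16-23: offset, bits 0-15: width in bits.  */

/* Assemble a vlen word, in the spirit of CTF_INT_DATA.  Unlike the real
   macro, this sketch masks each input so stray high bits cannot leak
   into neighbouring fields.  */
uint32_t
sketch_int_data (uint32_t encoding, uint32_t offset, uint32_t bits)
{
  return ((encoding & 0xffu) << 24) | ((offset & 0xffu) << 16)
	 | (bits & 0xffffu);
}

/* Extract the encoding, in the spirit of CTF_INT_ENCODING.  */
uint32_t
sketch_int_encoding (uint32_t data)
{
  return (data & 0xff000000u) >> 24;
}

/* Extract the bit offset, in the spirit of CTF_INT_OFFSET.  */
uint32_t
sketch_int_offset (uint32_t data)
{
  return (data & 0x00ff0000u) >> 16;
}

/* Extract the width in bits, in the spirit of CTF_INT_BITS.  */
uint32_t
sketch_int_bits (uint32_t data)
{
  return data & 0x0000ffffu;
}
```

A 5-bit bitfield starting at bit 3 would then be assembled with
@code{sketch_int_data (encoding, 3, 5)} and taken apart again with the three
extractors.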
938 Integers that are bitfields usually have a @code{ctt_size} rounded up to the
939 nearest power of two in bytes, for natural alignment (e.g. a 17-bit integer
940 would have a @code{ctt_size} of 4). However, not all types are naturally
941 aligned on all architectures: packed structures may in theory use integral
942 bitfields with different @code{ctt_size}, though this is rarely observed.
944 The @dfn{encoding} for integers is a bitfield composed of the values below,
945 which consumers can use to decide how to display values of this type:
947 @multitable {Offset} {@code{CTF_INT_VARARGS}} {If set, this is a char type. It is platform-dependent whether unadorned}
948 @headitem Offset @tab Name @tab Description
950 @tab @code{CTF_INT_SIGNED}
951 @tab If set, this is a signed int: if clear, unsigned.
952 @tindex CTF_INT_SIGNED
955 @tab @code{CTF_INT_CHAR}
956 @tab If set, this is a char type. It is platform-dependent whether unadorned
957 @code{char} is signed or not: the @code{CTF_CHAR} macro produces an integral
958 type suitable for the definition of @code{char} on this platform.
963 @tab @code{CTF_INT_BOOL}
964 @tab If set, this is a boolean type. (It is theoretically possible to turn this
965 and @code{CTF_INT_CHAR} on at the same time, but it is not clear what this would
970 @tab @code{CTF_INT_VARARGS}
971 @tab If set, this is a varargs-promoted value in a K&R function definition.
972 This is not currently produced or consumed by anything that we know of: it is set
973 aside for future use.
976 The GCC ``@code{Complex int}'' and fixed-point extensions are not yet supported:
977 references to such types will be emitted as type 0.
979 @node Floating-point types
980 @subsection Floating-point types
981 @cindex Floating-point types
982 @cindex Types, floating-point
986 @tindex signed double
987 @tindex unsigned float
988 @tindex unsigned double
989 @tindex Complex, float
990 @tindex Complex, double
991 @tindex Complex, signed float
992 @tindex Complex, signed double
993 @tindex Complex, unsigned float
994 @tindex Complex, unsigned double
997 Floating-point types are all represented as types of kind @code{CTF_K_FLOAT}.
998 Like integers, these types fill out @code{ctt_size} in the @code{ctf_stype_t}
999 with the size in bytes of the floating-point type in question. They are always
1000 represented by @code{ctf_stype_t}, never @code{ctf_type_t}.
1002 This part of CTF shows many rough edges in the more obscure corners of
1003 floating-point handling, and is likely to change in format v4.
1005 The variable-length data for floats has multiple items packed into it just like
1008 @multitable {Bit offset} {Encoding} {The floating-point encoding and desired display representation.}
1009 @headitem Bit offset @tab Name @tab Description
1012 @tab The desired display representation of this float. You can extract this
1013 field with the @code{CTF_FP_ENCODING} macro. See below.
1014 @findex CTF_FP_ENCODING
1018 @tab The offset of this floating-point type in bits from the start of its enclosing
1019 structure field, adjusted for endianness: @pxref{Structs and unions}. You can
1020 extract this field with the @code{CTF_FP_OFFSET} macro.
1021 @findex CTF_FP_OFFSET
1025 @tab The width of this floating-point type in bits. You can extract this field with
1026 the @code{CTF_FP_BITS} macro.
1030 The purpose of the floating-point offset and bit-width is somewhat opaque, since
1031 there are no such things as floating-point bitfields in C: the bit-width should
1032 be filled out with the full width of the type in bits, and the offset should
1033 always be zero. It is likely that these fields will go away in the future. As
1034 with integers, you can use @code{CTF_FP_DATA} to assemble one of these vlen
1035 items from its component parts.
1036 @findex CTF_FP_DATA
1038 The @dfn{encoding} for floats is not a bitfield but a simple value indicating
1039 the display representation. Many of these are unused, relate to
1040 Solaris-specific compiler extensions, and will be recycled in future: some are
1041 unused and will become used in future.
1043 @multitable {Offset} {@code{CTF_FP_LDIMAGRY}} {This is a @code{float} interval type, a Solaris-specific extension.}
1044 @headitem Offset @tab Name @tab Description
1046 @tab @code{CTF_FP_SINGLE}
1047 @tab This is a single-precision IEEE 754 @code{float}.
1048 @tindex CTF_FP_SINGLE
1050 @tab @code{CTF_FP_DOUBLE}
1051 @tab This is a double-precision IEEE 754 @code{double}.
1052 @tindex CTF_FP_DOUBLE
1054 @tab @code{CTF_FP_CPLX}
1055 @tab This is a @code{Complex float}.
1058 @tab @code{CTF_FP_DCPLX}
1059 @tab This is a @code{Complex double}.
1060 @tindex CTF_FP_DCPLX
1062 @tab @code{CTF_FP_LDCPLX}
1063 @tab This is a @code{Complex long double}.
1064 @tindex CTF_FP_LDCPLX
1066 @tab @code{CTF_FP_LDOUBLE}
1067 @tab This is a @code{long double}.
1068 @tindex CTF_FP_LDOUBLE
1070 @tab @code{CTF_FP_INTRVL}
1071 @tab This is a @code{float} interval type, a Solaris-specific extension.
1072 Unused: will be recycled.
1073 @tindex CTF_FP_INTRVL
1076 @tab @code{CTF_FP_DINTRVL}
1077 @tab This is a @code{double} interval type, a Solaris-specific extension.
1078 Unused: will be recycled.
1079 @tindex CTF_FP_DINTRVL
1082 @tab @code{CTF_FP_LDINTRVL}
1083 @tab This is a @code{long double} interval type, a Solaris-specific extension.
1084 Unused: will be recycled.
1085 @tindex CTF_FP_LDINTRVL
1088 @tab @code{CTF_FP_IMAGRY}
1089 @tab This is the imaginary part of a @code{Complex float}. Not currently
1090 generated. May change.
1091 @tindex CTF_FP_IMAGRY
1094 @tab @code{CTF_FP_DIMAGRY}
1095 @tab This is the imaginary part of a @code{Complex double}. Not currently
1096 generated. May change.
1097 @tindex CTF_FP_DIMAGRY
1100 @tab @code{CTF_FP_LDIMAGRY}
1101 @tab This is the imaginary part of a @code{Complex long double}. Not currently
1102 generated. May change.
1103 @tindex CTF_FP_LDIMAGRY
1107 The use of the complex floating-point encodings is obscure: it is possible that
1108 @code{CTF_FP_CPLX} is meant to be used for only the real part of complex types,
1109 and @code{CTF_FP_IMAGRY} et al for the imaginary part -- but for now, we are
1110 emitting @code{CTF_FP_CPLX} to cover the entire type, with no way to get at its
1111 constituent parts. There appear to be no uses of these encodings anywhere, so
1112 they are quite likely to change incompatibly in future.
1117 @cindex Types, slices of integral
1120 Slices, with kind @code{CTF_K_SLICE}, are an unusual CTF construct: they do not
1121 directly correspond to any C type, but are a way to model other types in a more
1122 convenient fashion for CTF generators.
1124 Slices are like pointers or other reference types in that they are always
1125 represented by @code{ctf_stype_t}: but unlike pointers and other reference
1126 types, they populate the @code{ctt_size} field just like integral types do, and
1127 come with an attached encoding and transform the encoding of the underlying
1128 type. The underlying type is described in the variable-length data, similarly
1129 to structure and union fields: see below. Requests for the type size should
1130 also chase down to the referenced type.
1132 Slices are always nameless: @code{ctt_name} is always zero for them.
1134 (The @code{libctf} API behaviour is unusual as well, and justifies the existence
1135 of slices: @code{ctf_type_kind} never returns @code{CTF_K_SLICE} but always the
1136 underlying type kind, so that consumers never need to know about slices: they
1137 can tell if an apparent integer is actually a slice if they need to by calling
1138 @code{ctf_type_reference}, which will uniquely return the underlying integral
1139 type rather than erroring out with @code{ECTF_NOTREF} if this is actually a
1140 slice. So slices act just like an integer with an encoding, but more closely
1141 mirror DWARF and other debugging information formats by allowing CTF file
1142 creators to represent a bitfield as a slice of an underlying integral type.)
1143 @findex Slices, effect on ctf_type_kind
1144 @findex Slices, effect on ctf_type_reference
1145 @findex libctf, effect of slices
1147 The vlen in the info word for a slice should be ignored and is always zero. The
1148 variable-length data for a slice is a single @code{ctf_slice_t}:
1151 typedef struct ctf_slice
1154 unsigned short cts_offset;
1155 unsigned short cts_bits;
1159 @tindex struct ctf_slice
1161 @multitable {Offset} {@code{unsigned short cts_offset}} {The type this slice is a slice of. Must be an}
1162 @headitem Offset @tab Name @tab Description
1164 @tab @code{uint32_t cts_type}
1166 @vindex struct ctf_slice, cts_type
1167 @vindex ctf_slice_t, cts_type
1168 @tab The type this slice is a slice of. Must be an integral type (or a
1169 floating-point type, but this nonsensical option will go away in v4.)
1172 @tab @code{unsigned short cts_offset}
1174 @vindex struct ctf_slice, cts_offset
1175 @vindex ctf_slice_t, cts_offset
1176 @tab The offset of this integral type in bits from the start of its enclosing
1177 structure field, adjusted for endianness: @pxref{Structs and unions}. Identical
1178 semantics to the @code{CTF_INT_OFFSET} field: @pxref{Integer types}. This field
1179 is much too long, because the maximum possible offset of an integral type would
1180 easily fit in a char: this field is bigger just for the sake of alignment. This
1184 @tab @code{unsigned short cts_bits}
1186 @vindex struct ctf_slice, cts_bits
1187 @vindex ctf_slice_t, cts_bits
1188 @tab The bit-width of this integral type. Identical semantics to the
1189 @code{CTF_INT_BITS} field: @pxref{Integer types}. As above, this field is
1190 really too large and will shrink in v4.
1193 @node Pointers typedefs and cvr-quals
1194 @subsection Pointers, typedefs, and cvr-quals
1202 @tindex CTF_K_POINTER
1203 @tindex CTF_K_TYPEDEF
1205 @tindex CTF_K_VOLATILE
1206 @tindex CTF_K_RESTRICT
1208 Pointers, @code{typedef}s, and @code{const}, @code{volatile} and @code{restrict}
1209 qualifiers are represented identically except for their type kind (though they
1210 may be treated differently by consuming libraries like @code{libctf}, since
1211 pointers affect assignment-compatibility in ways cvr-quals do not, and they may
1212 have different alignment requirements, etc).
1214 All of these are represented by @code{ctf_stype_t}, have no variable data at
1215 all, and populate @code{ctt_type} with the type ID of the type they point
1216 to. These types can stack: a @code{CTF_K_RESTRICT} can point to a
1217 @code{CTF_K_CONST} which can point to a @code{CTF_K_POINTER} etc.
1219 They are all unnamed: @code{ctt_name} is 0.
1221 The size of @code{CTF_K_POINTER} is derived from the data model (@pxref{Data
1222 models}), i.e. in practice, from the target machine ABI, and is not explicitly
1223 represented. The size of other kinds in this set should be determined by
1224 chasing ctf_types as necessary until a non-typedef/const/volatile/restrict is
1225 found, and using that.
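
The chasing described above can be sketched on a toy type table. Everything
here is hypothetical: the @code{toy_*} names, the array-index type IDs and the
kind values exist only for this illustration and are not the real
@code{CTF_K_*} constants or the @code{libctf} implementation. The pointer size
is passed in by the caller, standing in for the value derived from the data
model.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy kinds for this sketch only: not the real CTF_K_* values.  */
enum toy_kind
{
  TOY_INTEGER, TOY_POINTER, TOY_TYPEDEF, TOY_CONST, TOY_VOLATILE, TOY_RESTRICT
};

/* Toy type record: 'ref' plays the role of ctt_type and 'size' the role
   of ctt_size.  Type IDs are simply indexes into an array.  */
struct toy_type
{
  enum toy_kind kind;
  uint32_t ref;			/* Referenced type, for reference kinds.  */
  uint32_t size;		/* Size in bytes, for kinds that record one.  */
};

/* Return the size in bytes of types[id], or -1 if the chain leaves the
   table or loops.  Typedefs and cvr-quals are chased; pointers take
   their size from POINTER_SIZE.  */
int64_t
toy_type_size (const struct toy_type *types, size_t ntypes, uint32_t id,
	       uint32_t pointer_size)
{
  size_t hops;

  /* A valid chain can visit at most NTYPES types, so more hops than
     that means a cycle.  */
  for (hops = 0; hops <= ntypes; hops++)
    {
      if (id >= ntypes)
	return -1;

      switch (types[id].kind)
	{
	case TOY_TYPEDEF:
	case TOY_CONST:
	case TOY_VOLATILE:
	case TOY_RESTRICT:
	  id = types[id].ref;	/* Keep chasing.  */
	  break;
	case TOY_POINTER:
	  return pointer_size;	/* Not recorded in the type itself.  */
	default:
	  return types[id].size;
	}
    }
  return -1;
}
```

The hop limit is a defensive choice for this sketch: a well-formed dictionary
should never contain a cycle of typedefs and qualifiers, but a corrupted one
might.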
1231 Arrays are encoded as types of kind @code{CTF_K_ARRAY} in a @code{ctf_stype_t}.
1232 Both size and kind for arrays are zero. The variable-length data is a
1233 @code{ctf_array_t}: @code{vlen} in the info word should be disregarded and is
1237 typedef struct ctf_array
1239 uint32_t cta_contents;
1241 uint32_t cta_nelems;
1245 @tindex struct ctf_array
1247 @multitable {Offset} {@code{unsigned short cta_contents}} {The type of the array index: a type ID of an}
1248 @headitem Offset @tab Name @tab Description
1250 @tab @code{uint32_t cta_contents}
1251 @vindex cta_contents
1252 @vindex struct ctf_array, cta_contents
1253 @vindex ctf_array_t, cta_contents
1254 @tab The type of the array elements: a type ID.
1257 @tab @code{uint32_t cta_index}
1259 @vindex struct ctf_array, cta_index
1260 @vindex ctf_array_t, cta_index
1261 @tab The type of the array index: a type ID of an integral type.
1262 If this is a variable-length array, the index type ID will be 0
1263 (but the actual index type of this array is probably @code{int}).
1264 Probably redundant and may be dropped in v4.
1267 @tab @code{uint32_t cta_nelems}
1269 @vindex struct ctf_array, cta_nelems
1270 @vindex ctf_array_t, cta_nelems
1271 @tab The number of array elements. 0 for VLAs, and also for
1272 the historical variety of VLA which has explicit zero dimensions (which will
1273 have a nonzero @code{cta_index}.)
1276 The size of an array can be computed by simple multiplication of the size of the
1277 @code{cta_contents} type by the @code{cta_nelems}.
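
As a minimal sketch of that multiplication (the function name is hypothetical,
and the element size is assumed to have been looked up already), widening to
64 bits before multiplying avoids overflow for very large arrays:

```c
#include <stdint.h>

/* Size in bytes of an array whose elements are ELEM_SIZE bytes long.
   Yields 0 for VLAs, which record a cta_nelems of 0.  */
uint64_t
sketch_array_size (uint32_t elem_size, uint32_t cta_nelems)
{
  /* Widen first: the product of two 32-bit values can need 64 bits.  */
  return (uint64_t) elem_size * cta_nelems;
}
```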
1279 @node Function pointers
1280 @subsection Function pointers
1281 @cindex Function pointers
1282 @cindex Pointers, to functions
1284 Function pointers are explicitly represented in the CTF type section by a type
1285 of kind @code{CTF_K_FUNCTION}, always encoded with a @code{ctf_stype_t}. The
1286 @code{ctt_type} is the function return type ID. The @code{vlen} in the info
1287 word is the number of arguments, each of which is a type ID, a @code{uint32_t}:
1288 if the last argument is 0, this is a varargs function and the number of
1289 arguments is one less than indicated by the vlen.
1291 If the number of arguments is odd, a single @code{uint32_t} of padding is
1292 inserted to maintain alignment.
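
The rules above can be sketched as follows. The names are hypothetical, and
the sketch assumes that the padding decision is made on the vlen itself (that
is, counting the trailing 0 of a varargs function), which is our reading of
the text rather than a statement from it.

```c
#include <stddef.h>
#include <stdint.h>

/* What a consumer typically wants to know about a CTF_K_FUNCTION.  */
struct sketch_func_info
{
  uint32_t nargs;		/* Number of real arguments.  */
  int varargs;			/* Nonzero if this is a varargs function.  */
  size_t vlen_bytes;		/* Bytes of vlen data, including padding.  */
};

/* Interpret the VLEN argument type IDs found in ARGS.  */
struct sketch_func_info
sketch_func_decode (const uint32_t *args, uint32_t vlen)
{
  struct sketch_func_info info;

  /* A trailing type ID of 0 marks a varargs function and is not itself
     an argument.  */
  info.varargs = (vlen > 0 && args[vlen - 1] == 0);
  info.nargs = info.varargs ? vlen - 1 : vlen;

  /* An odd number of uint32_t entries is padded with one more.  */
  info.vlen_bytes = sizeof (uint32_t) * ((size_t) vlen + (vlen & 1));
  return info;
}
```

For example, a function taking two fixed arguments followed by @code{...} has
a vlen of 3, two real arguments, and 16 bytes of variable-length data.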
1300 Enumerated types are represented as types of kind @code{CTF_K_ENUM} in a
1301 @code{ctf_stype_t}. The @code{ctt_size} is always the size of an int from the
1302 data model (enum bitfields are implemented via slices). The @code{vlen} is a
1303 count of enumerations, each of which is represented by a @code{ctf_enum_t} in
1307 typedef struct ctf_enum
1314 @tindex struct ctf_enum
1316 @multitable {Offset} {@code{int32_t cte_value}} {Strtab offset of the enumeration name.}
1317 @headitem Offset @tab Name @tab Description
1319 @tab @code{uint32_t cte_name}
1321 @vindex struct ctf_enum, cte_name
1322 @vindex ctf_enum_t, cte_name
1323 @tab Strtab offset of the enumeration name. Must not be 0.
1326 @tab @code{int32_t cte_value}
1328 @vindex struct ctf_enum, cte_value
1329 @vindex ctf_enum_t, cte_value
1330 @tab The enumeration value.
1334 Enumeration values larger than @math{2^32} are not yet supported and are omitted
1335 from the enumeration. (v4 will lift this restriction by encoding the value
1338 Forward declarations of enums are not implemented with this kind: @pxref{Forward
1341 Enumerated type names, as usual in C, go into their own namespace, and do not
1342 conflict with non-enums, structs, or unions with the same name.
1344 @node Structs and unions
1345 @subsection Structs and unions
1350 @tindex CTF_K_STRUCT
1353 Structures and unions are represented as types of kind @code{CTF_K_STRUCT} and
1354 @code{CTF_K_UNION}: their representation is otherwise identical, and it is
1355 perfectly allowed for ``structs'' to contain overlapping fields etc, so we will
1356 treat them together for the rest of this section.
1358 They fill out @code{ctt_size}, and use @code{ctf_type_t} in preference to
1359 @code{ctf_stype_t} if the structure size is greater than @code{CTF_MAX_SIZE}
1361 @tindex CTF_MAX_LSIZE
1363 The vlen for structures and unions is a count of structure fields, but the type
1364 used to represent a structure field (and thus the size of the variable-length
1365 array element representing the type) depends on the size of the structure: truly
1366 huge structures, greater than @code{CTF_LSTRUCT_THRESH} bytes in length, use a
1367 different type. (@code{CTF_LSTRUCT_THRESH} is 536870912, so such structures are
1368 vanishingly rare: in v4, this representation will change somewhat for greater
1369 compactness. It's inherited from v1, where the limits were much lower.)
1370 @tindex CTF_LSTRUCT_THRESH
1372 Most structures can get away with using @code{ctf_member_t}:
1375 typedef struct ctf_member_v2
1378 uint32_t ctm_offset;
1383 Huge structures that are represented by @code{ctf_type_t} rather than
1384 @code{ctf_stype_t} have to use @code{ctf_lmember_t}, which splits the offset as
1385 @code{ctf_type_t} splits the size:
1388 typedef struct ctf_lmember_v2
1391 uint32_t ctlm_offsethi;
1393 uint32_t ctlm_offsetlo;
1397 Here's what the fields of @code{ctf_member} mean:
1399 @tindex struct ctf_member_v2
1400 @tindex ctf_member_t
1401 @multitable {Offset} {@code{uint32_t ctm_offset}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
1402 @headitem Offset @tab Name @tab Description
1404 @tab @code{uint32_t ctm_name}
1406 @vindex struct ctf_member_v2, ctm_name
1407 @vindex ctf_member_t, ctm_name
1408 @tab Strtab offset of the field name.
1411 @tab @code{uint32_t ctm_offset}
1413 @vindex struct ctf_member_v2, ctm_offset
1414 @vindex ctf_member_t, ctm_offset
1415 @tab The offset of this field @emph{in bits}. (Usually, for bitfields, this is
1416 machine-word-aligned and the individual field has an offset in bits, but
1417 the format allows for the offset to be encoded in bits here.)
1420 @tab @code{uint32_t ctm_type}
1422 @vindex struct ctf_member_v2, ctm_type
1423 @vindex ctf_member_t, ctm_type
1424 @tab The type ID of the type of the field.
1427 Here's what the fields of the very similar @code{ctf_lmember} mean:
1429 @tindex struct ctf_lmember_v2
1430 @tindex ctf_lmember_t
1431 @multitable {Offset} {@code{uint32_t ctlm_offsethi}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
1432 @headitem Offset @tab Name @tab Description
1434 @tab @code{uint32_t ctlm_name}
1436 @vindex struct ctf_lmember_v2, ctlm_name
1437 @vindex ctf_lmember_t, ctlm_name
1438 @tab Strtab offset of the field name.
1441 @tab @code{uint32_t ctlm_offsethi}
1442 @vindex ctlm_offsethi
1443 @vindex struct ctf_lmember_v2, ctlm_offsethi
1444 @vindex ctf_lmember_t, ctlm_offsethi
1445 @tab The high 32 bits of the offset of this field in bits.
1448 @tab @code{uint32_t ctlm_type}
1450 @vindex struct ctf_lmember_v2, ctlm_type
1451 @vindex ctf_lmember_t, ctlm_type
1452 @tab The type ID of the type of the field.
1455 @tab @code{uint32_t ctlm_offsetlo}
1456 @vindex ctlm_offsetlo
1457 @vindex struct ctf_lmember_v2, ctlm_offsetlo
1458 @vindex ctf_lmember_t, ctlm_offsetlo
1459 @tab The low 32 bits of the offset of this field in bits.
1462 Macros @code{CTF_LMEM_OFFSET}, @code{CTF_OFFSET_TO_LMEMHI} and
1463 @code{CTF_OFFSET_TO_LMEMLO} serve to extract and install the values of the
1464 @code{ctlm_offset} fields, much as with the split size fields in
1467 Unnamed structure and union fields are simply implemented by collapsing the
1468 unnamed field's members into the containing structure or union: this does mean
1469 that a structure containing an unnamed union can end up being a ``structure''
1470 with multiple members at the same offset. (A future format revision may
1471 collapse @code{CTF_K_STRUCT} and @code{CTF_K_UNION} into the same kind and
1472 decide among them based on whether their members do in fact overlap.)
1474 Structure and union type names, as usual in C, go into their own namespace,
1475 just as enum type names do.
1477 Forward declarations of structures and unions are not implemented with this
1478 kind: @pxref{Forward declarations}.
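
Returning to the split offset used by @code{ctf_lmember_t}: the following is
a minimal sketch of how the two halves combine into one 64-bit bit offset and
how such an offset is split again. The @code{sketch_*} names and the local
structure are hypothetical stand-ins. Real code should use
@code{CTF_LMEM_OFFSET}, @code{CTF_OFFSET_TO_LMEMHI} and
@code{CTF_OFFSET_TO_LMEMLO} from @code{ctf.h}.

```c
#include <stdint.h>

/* Stand-in for ctf_lmember_t, with fields in the order tabulated above.  */
struct sketch_lmember
{
  uint32_t name;		/* Like ctlm_name.  */
  uint32_t offsethi;		/* Like ctlm_offsethi.  */
  uint32_t type;		/* Like ctlm_type.  */
  uint32_t offsetlo;		/* Like ctlm_offsetlo.  */
};

/* Combine the two halves into one offset, in bits.  */
uint64_t
sketch_lmem_offset (const struct sketch_lmember *m)
{
  return ((uint64_t) m->offsethi << 32) | m->offsetlo;
}

/* Split BIT_OFFSET into its high and low 32-bit halves.  */
void
sketch_lmem_set_offset (struct sketch_lmember *m, uint64_t bit_offset)
{
  m->offsethi = (uint32_t) (bit_offset >> 32);
  m->offsetlo = (uint32_t) bit_offset;
}
```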
1480 @node Forward declarations
1481 @subsection Forward declarations
1486 @tindex CTF_K_FORWARD
1488 When the compiler encounters a forward declaration of a struct, union, or enum,
1489 it emits a type of kind @code{CTF_K_FORWARD}. If it later encounters a
1490 non-forward declaration of the same thing, it marks the forward as non-root-visible:
1491 before link time, therefore, non-root-visible forwards indicate that a
1492 non-forward is coming.
1494 After link time, forwards are fused with their corresponding non-forwards by the
1495 deduplicator where possible. They are kept if there is no non-forward
1496 definition (maybe it's not visible from any TU at all) or if @emph{multiple}
1497 conflicting structures with the same name might match it. Otherwise, all other
1498 forwards are converted to structures, unions, or enums as appropriate, even
1499 across TUs if only one structure could correspond to the forward (after all,
1500 all types across all TUs land in the same dictionary unless they conflict,
1501 so promoting forwards to their concrete type seems most helpful).
1503 A forward has a rather strange representation: it is encoded with a
1504 @code{ctf_stype_t} but the @code{ctt_type} is populated not with a type (if it's
1505 a forward, we don't have an underlying type yet: if we did, we'd have promoted
1506 it and this wouldn't be a forward any more) but with the @code{kind} of the
1507 forward. This means that we can distinguish forwards to structs, enums and
1508 unions reliably and ensure they land in the appropriate namespace even before
1509 the actual struct, union or enum is found.
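
A consumer might therefore pick the namespace for a forward along these lines.
This is only a sketch: the @code{sketch_*} names are hypothetical, and the
numeric kind values are the ones we believe @code{ctf.h} assigns to
@code{CTF_K_STRUCT}, @code{CTF_K_UNION}, @code{CTF_K_ENUM} and
@code{CTF_K_FORWARD}, so check them against the header before relying on them.

```c
#include <stdint.h>

/* Assumed kind values: verify against the CTF_K_* macros in ctf.h.  */
enum
{
  SKETCH_K_STRUCT = 6,
  SKETCH_K_UNION = 7,
  SKETCH_K_ENUM = 8,
  SKETCH_K_FORWARD = 9
};

/* The C tag namespaces a forward can land in.  */
enum sketch_ns
{
  SKETCH_NS_NONE, SKETCH_NS_STRUCT, SKETCH_NS_UNION, SKETCH_NS_ENUM
};

/* Given the kind from the info word and the ctt_type field, say which
   namespace a forward belongs in.  Returns SKETCH_NS_NONE if this is
   not a forward, or if it is a forward to an unexpected kind.  */
enum sketch_ns
sketch_forward_namespace (uint32_t kind, uint32_t ctt_type)
{
  if (kind != SKETCH_K_FORWARD)
    return SKETCH_NS_NONE;

  /* For forwards, ctt_type holds a kind rather than a type ID.  */
  switch (ctt_type)
    {
    case SKETCH_K_STRUCT:
      return SKETCH_NS_STRUCT;
    case SKETCH_K_UNION:
      return SKETCH_NS_UNION;
    case SKETCH_K_ENUM:
      return SKETCH_NS_ENUM;
    default:
      return SKETCH_NS_NONE;
    }
}
```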
1511 @node The symtypetab sections
1512 @section The symtypetab sections
1513 @cindex Symtypetab section
1514 @cindex Sections, symtypetab
1515 @cindex Function info section
1516 @cindex Sections, function info
1517 @cindex Data object section
1518 @cindex Sections, data object
1519 @cindex Function info index section
1520 @cindex Sections, function info index
1521 @cindex Data object index section
1522 @cindex Sections, data object index
1523 @tindex CTF_F_IDXSORTED
1524 @tindex CTF_F_DYNSTR
1525 @cindex Bug workarounds, CTF_F_DYNSTR
1527 These are two very simple sections with identical formats, used by consumers to
1528 map from ELF function and data symbols directly to their types. So they are
1529 usually populated only in CTF sections that are embedded in ELF objects.
1531 Their format is very simple: an array of type IDs. Which symbol each type ID
1532 corresponds to depends on whether the optional @emph{index section} associated
1533 with this symtypetab section has any content.
1535 If the index section is nonempty, it is an array of @code{uint32_t} string table
1536 offsets, each giving the name of the symbol whose type is at the same offset in
1537 the corresponding non-index section: users can look up symbols in such a table
1538 by name. The index section and corresponding symtypetab section are usually
1539 ASCIIbetically sorted (indicated by the @code{CTF_F_IDXSORTED} flag in the
1540 header): if it's sorted, it can be bsearched for a symbol name rather than
1541 having to use a slower linear search.
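
A lookup in a sorted, indexed symtypetab might look like the sketch below.
The function name and argument layout are hypothetical. It assumes the index
really is sorted in @code{strcmp} order and that every index entry refers to
the internal string table, so it does no external-strtab handling and no
bounds checking. A caller that finds @code{CTF_F_IDXSORTED} unset would fall
back to a linear scan instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Look up NAME in a sorted index section of N entries.  IDX holds the
   strtab offsets of the symbol names, TYPES the type IDs at the same
   positions, and STRTAB the string table the offsets refer to.
   Returns the type ID, or 0 (no type) if NAME is absent.  */
uint32_t
sketch_symtypetab_lookup (const char *strtab, const uint32_t *idx,
			  const uint32_t *types, size_t n, const char *name)
{
  size_t lo = 0, hi = n;	/* Search the half-open range [lo, hi).  */

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      int cmp = strcmp (name, strtab + idx[mid]);

      if (cmp == 0)
	return types[mid];
      if (cmp < 0)
	hi = mid;
      else
	lo = mid + 1;
    }
  return 0;
}
```

Returning 0 for a missing name matches the convention that type ID 0 means
``no type'', but a real consumer may prefer to distinguish ``symbol not
present'' from ``symbol present with no type''.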
1543 If the data object index section is empty, the entries in the data object and
1544 function info sections are associated 1:1 with ELF symbols of type
1545 @code{STT_OBJECT} (for data object) or @code{STT_FUNC} (for function info) with
1546 a nonzero value: the linker shuffles the symtypetab sections to correspond with
1547 the order of the symbols in the ELF file. Symbols with no name, undefined
1548 symbols and symbols named ``@code{_START_}'' and ``@code{_END_}'' are skipped
1549 and never appear in either section. Symbols that have no corresponding type are
1550 represented by type ID 0. The section may have fewer entries than the symbol
1551 table, in which case no later entries have associated types. This format is
1552 more compact than an indexed form if most entries have types (since there is no
1553 need to record any symbol names), but if the producer and consumer disagree even
1554 slightly about which symbols are omitted, the types of all further symbols will
1557 The compiler always emits indexed symtypetab tables, because there is no symbol
1558 table yet. The linker will always have to read them all in and always works
1559 through them from start to end, so there is no benefit in having the compiler sort
1560 them either. The linker (actually, @code{libctf}'s linking machinery) will
1561 automatically sort unsorted indexed sections, and convert indexed sections that
1562 contain a lot of pads into the more compact, unindexed form.
1564 If child dicts are in use, only symbols that use types actually mentioned in the
1565 child appear in the child's symtypetab: symbols that use only types in the
1566 parent appear in the parent's symtypetab instead. So the child's symtypetab will
1567 almost always be very sparse, and thus will usually use the indexed form even in
1568 fully linked objects. (It is, of course, impossible for symbols to exist that
1569 use types from multiple child dicts at once, since it's impossible to declare a
1570 function in C that uses types that are only visible in two different, disjoint
1573 @node The variable section
1574 @section The variable section
1575 @cindex Variable section
1576 @cindex Sections, variable
1578 The variable section is a simple array mapping names (strtab entries) to type
1579 IDs, intended to provide a replacement for the data object section in dynamic
1580 situations in which there is no static ELF strtab but the consumer instead hands
1581 back names. The section is sorted into ASCIIbetical order by name for rapid
1582 lookup, like the CTF archive name table.
1584 The section is an array of these structures:
1587 typedef struct ctf_varent
1594 @tindex struct ctf_varent
1595 @tindex ctf_varent_t
1596 @multitable {Offset} {@code{uint32_t ctv_name}} {Strtab offset of the name}
1597 @headitem Offset @tab Name @tab Description
1599 @tab @code{uint32_t ctv_name}
1601 @vindex struct ctf_varent, ctv_name
1602 @vindex ctf_varent_t, ctv_name
1603 @tab Strtab offset of the name
1606 @tab @code{uint32_t ctv_type}
1608 @vindex struct ctf_varent, ctv_type
1609 @vindex ctf_varent_t, ctv_type
1610 @tab Type ID of this type
1613 There is no analogue of the function info section yet: v4 will probably drop
1614 this section in favour of a way to put both indexed (thus, named) and nonindexed
1615 symbols into the symtypetab sections at the same time.
1617 @node The label section
1618 @section The label section
1619 @cindex Label section
1620 @cindex Sections, label
1622 The label section is a currently-unused facility allowing the tiling of the type
1623 space with names taken from the strtab. The section is an array of these
1627 typedef struct ctf_lblent
1634 @tindex struct ctf_lblent
1635 @tindex ctf_lblent_t
1636 @multitable {Offset} {@code{uint32_t ctl_label}} {Strtab offset of the label}
1637 @headitem Offset @tab Name @tab Description
1639 @tab @code{uint32_t ctl_label}
1641 @vindex struct ctf_lblent, ctl_label
1642 @vindex ctf_lblent_t, ctl_label
1643 @tab Strtab offset of the label
1646 @tab @code{uint32_t ctl_type}
1648 @vindex struct ctf_lblent, ctl_type
1649 @vindex ctf_lblent_t, ctl_type
1650 @tab Type ID of the last type covered by this label
1653 Semantics will be attached to labels soon, probably in v4 (the plan is to use
1654 them to allow multiple disjoint namespaces in a single CTF file, removing many
1655 uses of CTF archives, in particular in the @code{.ctf} section in ELF objects).
1657 @node The string section
1658 @section The string section
1659 @cindex String section
1660 @cindex Sections, string
1662 This section is a simple ELF-format strtab, starting with a zero byte (thus
1663 ensuring that the string with offset 0 is the null string, as assumed elsewhere
1664 in this spec). The strtab is usually ASCIIbetically sorted to somewhat improve
1665 compression efficiency.
1667 Where the strtab is unusual is the @emph{references} to it. CTF has two
1668 string tables, the internal strtab and an external strtab associated
1669 with the CTF dictionary at open time: usually, this is the ELF dynamic
1670 strtab (@code{.dynstr}) of a CTF dictionary embedded in an ELF file. We
1671 distinguish between these strtabs by the most significant bit, bit 31,
1672 of the 32-bit strtab references: if it is 0, the offset is in the
1673 internal strtab: if 1, the offset is in the external strtab.
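
Decoding such a reference can be sketched like this. The @code{sketch_*}
names are hypothetical, and the sketch does no bounds checking against the
length of either string table, which real code must do.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if REF refers to the external strtab (bit 31 set).  */
int
sketch_name_is_external (uint32_t ref)
{
  return (ref >> 31) != 0;
}

/* The offset within whichever strtab REF refers to (bits 0-30).  */
uint32_t
sketch_name_offset (uint32_t ref)
{
  return ref & 0x7fffffffu;
}

/* Resolve REF to a string, or NULL if the strtab it needs is absent
   (for instance, no external strtab was supplied at open time).  */
const char *
sketch_name_resolve (uint32_t ref, const char *internal, const char *external)
{
  const char *tab = sketch_name_is_external (ref) ? external : internal;

  return tab != NULL ? tab + sketch_name_offset (ref) : NULL;
}
```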
1675 @tindex CTF_F_DYNSTR
1676 @cindex Bug workarounds, CTF_F_DYNSTR
1677 There is a bug workaround in this area: in format v3 (the first version
1678 to have working support for external strtabs), the external strtab is
1679 @code{.strtab} unless the @code{CTF_F_DYNSTR} flag is set on the
1680 dictionary (@pxref{CTF file-wide flags}). Format v4 will introduce a
1681 header field that explicitly names the external strtab, making this flag
1685 @section Data models
1688 The data model is a simple integer which indicates the ABI in use on this
1689 platform. Right now, it is very simple, distinguishing only between 32- and
1690 64-bit types: a model of 1 indicates ILP32, 2 indicates LP64. The mapping from
1691 ABI integer to type sizes is hardwired into @code{libctf}: currently, we use
1692 this to hardwire the size of pointers, function pointers, and enumerated types,
1694 This is a very kludgy corner of CTF and will probably be replaced with explicit
1695 header fields to record this sort of thing in future.
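
As an illustration of the kind of mapping involved (not the actual table in
@code{libctf}), the sketch below derives pointer and @code{int} sizes from
the data model. The names are hypothetical. The sizes follow from the usual
definitions of ILP32 (4-byte @code{int}, @code{long} and pointers) and LP64
(4-byte @code{int}, 8-byte @code{long} and pointers).

```c
#include <stdint.h>

/* Data model values as given in the text above.  */
enum
{
  SKETCH_MODEL_ILP32 = 1,
  SKETCH_MODEL_LP64 = 2
};

/* Size of a pointer in bytes under MODEL, or 0 if MODEL is unknown.  */
uint32_t
sketch_pointer_size (uint32_t model)
{
  switch (model)
    {
    case SKETCH_MODEL_ILP32:
      return 4;
    case SKETCH_MODEL_LP64:
      return 8;
    default:
      return 0;
    }
}

/* Size of an int (and so of an enumerated type) in bytes under MODEL,
   or 0 if MODEL is unknown.  Both known models use a 4-byte int.  */
uint32_t
sketch_int_size (uint32_t model)
{
  switch (model)
    {
    case SKETCH_MODEL_ILP32:
    case SKETCH_MODEL_LP64:
      return 4;
    default:
      return 0;
    }
}
```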
1698 @section Limits of CTF
1701 The following limits are imposed by various aspects of CTF version 3:
1705 Maximum type identifier (maximum number of types accessible with parent and
1706 child containers in use): 0xfffffffe
1708 Maximum type identifier in a parent dictionary: maximum number of types in any
1709 one dictionary: 0x7fffffff
1711 Maximum offset into a string table: 0x7fffffff
1713 Maximum number of members in a struct, union, or enum: maximum number of
1714 function args: 0xffffff
1716 Maximum size of a @code{ctf_stype_t} in bytes before we fall back to
1717 @code{ctf_type_t}: 0xfffffffe bytes
1720 Other maxima without associated macros:
1723 Maximum value of an enumerated type: 2^32
1725 Maximum size of an array element: 2^32
1728 These maxima are generally considered to be too low, because C programs can and
1729 do exceed them: they will be lifted in format v4.