2 .\" This file and its contents are supplied under the terms of the
3 .\" Common Development and Distribution License ("CDDL"), version 1.0.
4 .\" You may only use this file in accordance with the terms of version
7 .\" A full copy of the text of the CDDL should have accompanied this
8 .\" source. A copy of the CDDL is also available via the Internet at
9 .\" http://www.illumos.org/license/CDDL.
12 .\" Copyright (c) 2014 Joyent, Inc.
19 .Nd Compact C Type Format
24 is designed to be a compact representation of the C programming
25 language's type information focused on serving the needs of dynamic
26 tracing, debuggers, and other in-situ and post-mortem introspection
29 data is generally included in
31 objects and is tagged as
33 to ensure that the data is accessible in a running process and in subsequent
34 core dumps, if generated.
38 data contained in each file has information about the layout and
39 sizes of C types, including intrinsic types, enumerations, structures,
40 typedefs, and unions, that are used by the corresponding
45 data may also include information about the types of global objects and
46 the return type and arguments of functions in the symbol table.
50 file is often embedded inside a file, rather than being a standalone
51 file itself, it may also be referred to as a
57 data is consumed by multiple programs.
58 It can be used by the modular debugger,
62 Programmatic access to
64 data can be obtained through
69 file format is broken down into seven different sections.
70 The first section is the
74 which describes the version of the
76 file, links it has to other
78 files, and the sizes of the other sections.
79 The next section is the
82 which provides a way of identifying similar groups of
84 data across multiple files.
85 This is followed by the
87 information section, which describes the type of global
89 The subsequent section is the
91 information section, which describes the return
92 types and arguments of functions.
93 The next section is the
95 information section, which describes
96 the format and layout of the C types themselves, and finally the last
99 section, which contains the names of types, enumerations, members, and
102 While strictly speaking, only the
106 are required, to be actually useful, both the type and string
107 sections are necessary.
111 file may contain all of the type information that it requires, or it
112 may optionally refer to another
114 file which holds the remaining types.
117 file refers to another file, it is called the
119 and the file it refers to is called the
121 A given file may only refer to one parent.
122 This process is called
124 because it ensures each child only has type information that is
126 A common example of this is that most kernel modules in illumos are uniquified
127 against the kernel module
129 and the type information that comes from the
132 This means that a module only has types that are unique to itself and the most
133 common types in the kernel are not duplicated.
135 This documents version
140 All applications and tools currently produce and operate on this version.
The file format can be summarized with the following image; the
following sections cover this in more detail.
147 +--------| Preamble |
148 | +-------------+ 0t4
150 || +-------------+ 0t36 + cth_lbloff
152 ||| +-------------+ 0t36 + cth_objtoff
154 |||| +-------------+ 0t36 + cth_funcoff
155 ||||+----| Functions |
156 ||||| +-------------+ 0t36 + cth_typeoff
158 |||||| +-------------+ 0t36 + cth_stroff
160 ||||||| +-------------+ 0t36 + cth_stroff + cth_strlen
164 ||||||| +-- magic - vers flags
166 ||||||| +------+------+------+------+
167 +---------| 0xcf | 0xf1 | 0x02 | 0x00 |
168 |||||| +------+------+------+------+
171 |||||| + parent label + objects
172 |||||| | + parent name | + functions + strings
173 |||||| | | + label | | + types | + strlen
174 |||||| | | | | | | | |
175 |||||| +------+------+------+------+------+-------+-------+-------+
176 +--------| 0x00 | 0x00 | 0x00 | 0x08 | 0x36 | 0x110 | 0x5f4 | 0x611 |
177 ||||| +------+------+------+------+------+-------+-------+-------+
178 ||||| 0x04 0x08 0x0c 0x10 0x14 0x18 0x1c 0x20 0x24
182 ||||| | | + Next label
184 ||||| +-------+------+-----+
185 +-----------| 0x01 | 0x42 | ... |
186 |||| +-------+------+-----+
187 |||| cth_lbloff +0x4 +0x8 cth_objtoff
190 |||| Symidx 0t15 0t43 0t44
191 |||| +------+------+------+-----+
192 +----------| 0x00 | 0x42 | 0x36 | ... |
193 ||| +------+------+------+-----+
194 ||| cth_objtoff +0x2 +0x4 +0x6 cth_funcoff
196 ||| + CTF_TYPE_INFO + CTF_TYPE_INFO
197 ||| | + Return type |
199 ||| +--------+------+------+-----+
200 +---------| 0x2c10 | 0x08 | 0x0c | ... |
201 || +--------+------+------+-----+
|| cth_funcoff +0x2 +0x4 +0x6 cth_typeoff
204 || + ctf_stype_t for type 1
205 || | integer + integer encoding
206 || | | + ctf_stype_t for type 2
208 || +--------------------+-----------+-----+
209 +--------| 0x19 * 0xc01 * 0x0 | 0x1000000 | ... |
210 | +--------------------+-----------+-----+
211 | cth_typeoff +0x08 +0x0c cth_stroff
214 | | +--- str 1 + str 2
217 | +----+---+---+---+----+---+---+---+---+---+----+
218 +---| \\0 | i | n | t | \\0 | f | o | o | _ | t | \\0 |
219 +----+---+---+---+----+---+---+---+---+---+----+
220 0 1 2 3 4 5 6 7 8 9 10 11
231 is defined as follows:
typedef struct ctf_preamble {
	ushort_t ctp_magic;	/* magic number (CTF_MAGIC) */
	uchar_t ctp_version;	/* data format version number (CTF_VERSION) */
	uchar_t ctp_flags;	/* flags (see below) */
} ctf_preamble_t;
242 is four bytes long and must be four byte aligned.
245 defines the version of the
247 file which defines the format of the rest of the header.
248 While the header may change in subsequent versions, the preamble will not change
249 across versions, though the interpretation of its flags may change from
253 member defines the magic number for the
258 If another value is encountered, then the file should not be treated as
264 member defines the version of the
267 The current version is
269 It is possible to encounter an unsupported version.
270 In that case, software should not try to parse the format, as it may have
274 member describes aspects of the file which modify its interpretation.
275 The following flags are currently defined:
277 #define CTF_F_COMPRESS 0x01
282 indicates that the body of the
284 file, all the data following the
286 has been compressed through the
291 If this flag is not present, then the body has not been compressed and no
292 special action is needed to interpret it.
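As a rough illustration of the checks described above, the following sketch
validates a preamble before any further parsing is attempted.
It assumes that the magic number is 0xcff1 and the version is 2, matching the
bytes shown in the diagram earlier in this document.
The fixed-width types, the EX_ prefixed names, and the function
check_preamble() are stand-ins invented for this example; they are not part of
the format definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	EX_CTF_MAGIC		0xcff1	/* assumed value of CTF_MAGIC */
#define	EX_CTF_VERSION		2	/* the version this document covers */
#define	EX_CTF_F_COMPRESS	0x01

typedef struct ex_preamble {
	uint16_t ctp_magic;
	uint8_t ctp_version;
	uint8_t ctp_flags;
} ex_preamble_t;

/*
 * Returns -1 if the buffer does not look like CTF data that this code
 * understands, 1 if the body is compressed, and 0 if it is not. The buffer
 * is assumed to already be in the byte order of the host.
 */
int
check_preamble(const void *buf, size_t len)
{
	ex_preamble_t pre;

	if (len < sizeof (pre))
		return (-1);
	(void) memcpy(&pre, buf, sizeof (pre));
	if (pre.ctp_magic != EX_CTF_MAGIC)
		return (-1);
	if (pre.ctp_version != EX_CTF_VERSION)
		return (-1);
	return ((pre.ctp_flags & EX_CTF_F_COMPRESS) != 0 ? 1 : 0);
}
```

The magic number and version are checked before the flags because the
interpretation of the flags may differ between versions.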
293 All offsets into the data as described by
299 In version two of the
denotes whether or not this
305 file is the child of another
307 file and also indicates the size of the remaining sections.
308 The structure for the
310 logically contains a copy of the
312 and the two have a combined size of 36 bytes.
typedef struct ctf_header {
	ctf_preamble_t cth_preamble;
	uint_t cth_parlabel;	/* ref to name of parent lbl uniq'd against */
	uint_t cth_parname;	/* ref to basename of parent */
	uint_t cth_lbloff;	/* offset of label section */
	uint_t cth_objtoff;	/* offset of object section */
	uint_t cth_funcoff;	/* offset of function section */
	uint_t cth_typeoff;	/* offset of type section */
	uint_t cth_stroff;	/* offset of string section */
	uint_t cth_strlen;	/* length of string section in bytes */
} ctf_header_t;
333 are used to identify the parent.
The values of both members are offsets into the
336 section which point to the start of a null-terminated string.
337 For more information on the encoding of strings, see the subsection on
338 .Sx String Identifiers .
339 If the value of either is zero, then there is no entry for that
345 member must be set, otherwise it will not be possible to find the
349 is set, it is not necessary to define
351 as the parent may not have a label.
352 For more information on labels and their interpretation, see
353 .Sx The Label Section .
355 The remaining members (excepting
357 describe the beginning of the corresponding sections.
358 These offsets are relative to the end of the
360 Therefore, something with an offset of 0 is at an offset of thirty-six
361 bytes relative to the start of the
364 The difference between members indicates the size of the section itself.
365 Different offsets have different alignment requirements.
370 must be two byte aligned, while the sections
374 must be four-byte aligned.
377 has no alignment requirements.
378 To calculate the size of a given section, excepting the
380 section, one should subtract the offset of the section from the following one.
381 For example, the size of the
383 section can be calculated by subtracting
390 describes the length of the string section itself.
From it, you can also calculate the size of the entire
.Nm
file by adding together the size of the
.Vt ctf_header_t ,
the offset of the string section in
.Fa cth_stroff ,
and the size of the string section in
.Fa cth_strlen .
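The size arithmetic described above can be sketched as follows.
This is only an illustration: the header is re-declared with fixed-width types
so that the example stands alone, the preamble is flattened into the first
three members, and the ex_header_t type and the three function names are
hypothetical.

```c
#include <stdint.h>

typedef struct ex_header {
	uint16_t cth_magic;	/* preamble: magic number */
	uint8_t cth_version;	/* preamble: version */
	uint8_t cth_flags;	/* preamble: flags */
	uint32_t cth_parlabel;
	uint32_t cth_parname;
	uint32_t cth_lbloff;
	uint32_t cth_objtoff;
	uint32_t cth_funcoff;
	uint32_t cth_typeoff;
	uint32_t cth_stroff;
	uint32_t cth_strlen;
} ex_header_t;

/* The label section ends where the object section begins. */
uint32_t
label_section_size(const ex_header_t *hp)
{
	return (hp->cth_objtoff - hp->cth_lbloff);
}

/* The type section ends where the string section begins. */
uint32_t
type_section_size(const ex_header_t *hp)
{
	return (hp->cth_stroff - hp->cth_typeoff);
}

/* Size of the whole uncompressed file, including the 36 byte header. */
uint64_t
ctf_file_size(const ex_header_t *hp)
{
	return ((uint64_t)sizeof (ex_header_t) + hp->cth_stroff +
	    hp->cth_strlen);
}
```

With the header values from the diagram earlier in this document (a string
section offset of 0x5f4 and length of 0x611), ctf_file_size() would report
36 + 0x5f4 + 0x611 = 3113 bytes.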
402 data, types are referred to by identifiers.
405 file supports up to 32767 (0x7fff) types.
406 The first valid type identifier is 0x1.
409 file is a child, indicated by a non-zero entry for the
412 then the first valid type identifier is 0x8000 and the last is 0xffff.
413 In this case, type identifiers 0x1 through 0x7fff are references to the
416 The type identifier zero is a sentinel value used to indicate that there
417 is no type information available or it is an unknown type.
419 Throughout the file format, the identifier is stored in different sized
420 values; however, the minimum size to represent a given identifier is a
424 information may use larger or opaque identifiers.
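The identifier ranges above can be expressed as a small classification routine.
The sketch below is an illustration under the rules stated in this section; the
ex_owner_t type and the type_owner() function are names made up for the
example, and the caller is assumed to already know whether the file is a child.

```c
#include <stdint.h>

typedef enum ex_owner {
	EX_OWNER_NONE,		/* identifier zero: no type information */
	EX_OWNER_SELF,		/* the type is defined in this file */
	EX_OWNER_PARENT,	/* the type is defined in the parent */
	EX_OWNER_INVALID	/* out of range for a file with no parent */
} ex_owner_t;

/*
 * Decide which file a type identifier refers to. is_child should be non-zero
 * when the header names a parent.
 */
ex_owner_t
type_owner(uint16_t id, int is_child)
{
	if (id == 0)
		return (EX_OWNER_NONE);
	if (is_child)
		return (id >= 0x8000 ? EX_OWNER_SELF : EX_OWNER_PARENT);
	return (id <= 0x7fff ? EX_OWNER_SELF : EX_OWNER_INVALID);
}
```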
425 .Ss String Identifiers
426 String identifiers are always encoded as four byte unsigned integers
427 which are an offset into a string table.
430 format supports two different string tables which have an identifier of
432 This identifier is stored in the high-order bit of the unsigned four byte
Therefore, the maximum supported offset into one of these tables is 0x7fffffff.
Table identifier zero always refers to the
438 section in the CTF file itself.
String table identifier one refers to an external string table which is the ELF
string table for the ELF symbol table associated with the
.Nm
container.
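To make the split between the table identifier and the offset concrete, the
following sketch decodes a string identifier and resolves it against the
internal string table only.
The function names are hypothetical, and resolving identifiers that name the
external ELF string table is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* The high-order bit selects the string table. */
uint32_t
str_table_id(uint32_t name)
{
	return (name >> 31);
}

/* The remaining 31 bits are the offset into that table. */
uint32_t
str_table_offset(uint32_t name)
{
	return (name & 0x7fffffff);
}

/*
 * Resolve an identifier against the internal string table. Returns NULL for
 * external identifiers, for offsets beyond the table, and for tables that
 * are not null-terminated.
 */
const char *
internal_string(const char *strtab, uint32_t len, uint32_t name)
{
	if (str_table_id(name) != 0)
		return (NULL);
	if (len == 0 || strtab[len - 1] != '\0')
		return (NULL);
	if (str_table_offset(name) >= len)
		return (NULL);
	return (strtab + str_table_offset(name));
}
```

Checking that the table ends in a null terminator up front means that any
in-range offset yields a terminated C string.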
446 type begins with metadata encoded into a
448 This encoded information tells us three different pieces of information:
449 .Bl -bullet -offset indent -compact
453 Whether this type is a root type or not
455 The length of the variable data
458 The 16 bits that make up the encoding are broken down such that you have
459 five bits for the kind, one bit for indicating whether or not it is a
460 root type, and 10 bits for the variable length.
461 This is laid out as follows:
462 .Bd -literal -offset indent
463 +--------------------+
464 | kind | root | vlen |
465 +--------------------+
469 The current version of the file format defines 14 different kinds.
470 The interpretation of these different kinds will be discussed in the section
471 .Sx The Type Section .
472 If a kind is encountered that is not listed below, then it is not a valid
475 The kinds are defined as follows:
476 .Bd -literal -offset indent
477 #define CTF_K_UNKNOWN 0
478 #define CTF_K_INTEGER 1
479 #define CTF_K_FLOAT 2
480 #define CTF_K_POINTER 3
481 #define CTF_K_ARRAY 4
482 #define CTF_K_FUNCTION 5
483 #define CTF_K_STRUCT 6
484 #define CTF_K_UNION 7
486 #define CTF_K_FORWARD 9
487 #define CTF_K_TYPEDEF 10
488 #define CTF_K_VOLATILE 11
489 #define CTF_K_CONST 12
490 #define CTF_K_RESTRICT 13
493 Programs directly reference many types; however, other types are referenced
494 indirectly because they are part of some other structure.
495 These types that are referenced directly and used are called
498 Other types may be used indirectly, for example, a program may reference
499 a structure directly, but not one of its members which has a type.
500 That type is not considered a
505 type, then it will have bit 10 set.
507 The variable length section is specific to each kind and is discussed in the
509 .Sx The Type Section .
511 The following macros are useful for constructing and deconstructing the encoded
513 .Bd -literal -offset indent
515 #define CTF_MAX_VLEN 0x3ff
516 #define CTF_INFO_KIND(info) (((info) & 0xf800) >> 11)
517 #define CTF_INFO_ISROOT(info) (((info) & 0x0400) >> 10)
518 #define CTF_INFO_VLEN(info) (((info) & CTF_MAX_VLEN))
520 #define CTF_TYPE_INFO(kind, isroot, vlen) \\
521 (((kind) << 11) | (((isroot) ? 1 : 0) << 10) | ((vlen) & CTF_MAX_VLEN))
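As a worked example of these macros, the sketch below decodes an encoded value
into its three parts.
The macros are copied from the definitions above; the ex_info_t structure and
the decode_info() function are hypothetical helpers added for illustration.

```c
#include <stdint.h>

#define	CTF_MAX_VLEN	0x3ff
#define	CTF_INFO_KIND(info)	(((info) & 0xf800) >> 11)
#define	CTF_INFO_ISROOT(info)	(((info) & 0x0400) >> 10)
#define	CTF_INFO_VLEN(info)	(((info) & CTF_MAX_VLEN))

#define	CTF_TYPE_INFO(kind, isroot, vlen) \
	(((kind) << 11) | (((isroot) ? 1 : 0) << 10) | ((vlen) & CTF_MAX_VLEN))

typedef struct ex_info {
	unsigned kind;		/* one of the CTF_K_* values */
	unsigned isroot;	/* 1 if this is a root type */
	unsigned vlen;		/* length of the variable data */
} ex_info_t;

/* Split a 16-bit encoded value into kind, root flag, and variable length. */
ex_info_t
decode_info(uint16_t info)
{
	ex_info_t out;

	out.kind = CTF_INFO_KIND(info);
	out.isroot = CTF_INFO_ISROOT(info);
	out.vlen = CTF_INFO_VLEN(info);
	return (out);
}
```

For instance, the value 0x2c10 from the function section of the diagram decodes
to kind 5 (CTF_K_FUNCTION) with the root bit set and a variable length of 16,
while CTF_TYPE_INFO(CTF_K_STRUCT, 1, 3) produces 0x3403.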
523 .Ss The Label Section
526 data, it is often useful to know whether two different
528 containers come from the same source base and version.
529 For example, when building illumos, there are many kernel modules that are built
530 against a single collection of source code.
531 A label is encoded into the
533 files that corresponds with the particular build.
This ensures that if files on the system were to become mixed up from multiple
releases, they are not used together by tools, particularly when a child
needs to refer to a type in the parent.
Because they are linked using the type identifiers, if the wrong parent is used
then the wrong type will be encountered.
540 Each label is encoded in the file format using the following eight byte
typedef struct ctf_lblent {
	uint_t ctl_label;	/* ref to name of label */
	uint_t ctl_typeidx;	/* last type associated with this label */
} ctf_lblent_t;
549 Each label has two different components, a name and a type identifier.
550 The name is encoded in the
552 member which is in the format defined in the section
553 .Sx String Identifiers .
554 Generally, the names of all labels are found in the internal string
557 The type identifier encoded in the member
559 refers to the last type identifier that a label refers to in the current
Labels only refer to types in the current file.
If the
.Nm
file is a child, then it will have the same label as its parent;
however, its label will only refer to its types, not its parent's.
566 It is also possible, though rather uncommon, for a
568 file to have multiple labels.
569 Labels are placed one after another, every eight bytes.
570 When multiple labels are present, types may only belong to a single label.
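A consumer might walk the label section along the following lines.
This is a sketch rather than a reference implementation: it assumes that labels
are stored in ascending order of their last type identifier, which this
document does not state, and the ex_lblent_t type and both function names are
invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct ex_lblent {
	uint32_t ctl_label;	/* string identifier of the label name */
	uint32_t ctl_typeidx;	/* last type identifier covered */
} ex_lblent_t;

/* Labels are packed every eight bytes between the two section offsets. */
size_t
label_count(uint32_t lbloff, uint32_t objtoff)
{
	return ((objtoff - lbloff) / sizeof (ex_lblent_t));
}

/*
 * Return the index of the label that covers the given type identifier, or
 * -1 if no label does. Assumes labels are sorted by ctl_typeidx.
 */
long
label_for_type(const ex_lblent_t *labels, size_t nlabels, uint32_t id)
{
	size_t i;

	for (i = 0; i < nlabels; i++) {
		if (id <= labels[i].ctl_typeidx)
			return ((long)i);
	}
	return (-1);
}
```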
571 .Ss The Object Section
572 The object section provides a mapping from ELF symbols of type
574 in the symbol table to a type identifier.
575 Every entry in this section is a
577 which contains a type identifier as described in the section
578 .Sx Type Identifiers .
579 If there is no information for an object, then the type identifier 0x0
580 is stored for that entry.
582 To walk the object section, you need to have a corresponding
584 in the ELF object that contains the
Not every object is included in this section.
Specifically, when walking the symbol table, an entry is skipped if it matches
any of the following conditions:
591 .Bl -bullet -offset indent -compact
593 The symbol type is not
596 The symbol's section index is
599 The symbol's name offset is zero
601 The symbol's section index is
603 and the value of the symbol is zero.
609 These are skipped because they are used for scoping local symbols in
613 The following sample code shows an example of iterating the object
614 section and skipping the correct symbols:
/*
 * Given the start of the object section in the CTF file, the number of symbols,
 * and the ELF Data sections for the symbol table and the string table, this
 * prints the type identifiers that correspond to objects. Note, a more robust
 * implementation should ensure that it doesn't walk beyond the end of the CTF
 * section.
 */
void
walk_symbols(uint16_t *objtoff, Elf_Data *symdata, Elf_Data *strdata,
    long nsyms)
{
	long i;
	uintptr_t strbase = (uintptr_t)strdata->d_buf;

	/* Skipped symbols have no entry, so objtoff advances only on a match. */
	for (i = 1; i < nsyms; i++) {
		const char *name;
		GElf_Sym sym;

		if (gelf_getsym(symdata, i, &sym) == NULL)
			continue;
		if (GELF_ST_TYPE(sym.st_info) != STT_OBJECT)
			continue;
		if (sym.st_shndx == SHN_UNDEF || sym.st_name == 0)
			continue;
		if (sym.st_shndx == SHN_ABS && sym.st_value == 0)
			continue;
		name = (const char *)(strbase + sym.st_name);
		if (strcmp(name, "_START_") == 0 || strcmp(name, "_END_") == 0)
			continue;

		(void) printf("Symbol %ld has type %d\n", i, *objtoff);
		objtoff++;
	}
}
656 .Ss The Function Section
657 The function section of the
659 file encodes the types of both the function's arguments and the function's
662 .Sx The Object Section ,
663 the function section encodes information for all symbols of type
665 excepting those that fit specific criteria.
666 Unlike with objects, because functions have a variable number of arguments, they
667 start with a type encoding as defined in
669 which is the size of a
671 For functions which have no type information available, they are encoded as
672 .Li CTF_TYPE_INFO(CTF_K_UNKNOWN, 0, 0) .
673 Functions with arguments are encoded differently.
674 Here, the variable length is turned into the number of arguments in the
678 type function, then the number of arguments is increased by one.
679 Functions with type information are encoded as:
680 .Li CTF_TYPE_INFO(CTF_K_FUNCTION, 0, nargs) .
682 For functions that have no type information, nothing else is encoded, and the
683 next function is encoded.
684 For functions with type information, the next
686 is encoded with the type identifier of the return type of the function.
687 It is followed by each of the type identifiers of the arguments, if any exist,
688 in the order that they appear in the function.
689 Therefore, argument 0 is the first type identifier and so on.
When a function has a final varargs argument, that is encoded with the type
identifier of zero.
694 .Sx The Object Section ,
695 the function section is encoded in the order of the symbol table.
696 It has similar, but slightly different considerations from objects.
697 While iterating the symbol table, if any of the following conditions are true,
698 then the entry is skipped and no corresponding entry is written:
700 .Bl -bullet -offset indent -compact
702 The symbol type is not
705 The symbol's section index is
708 The symbol's name offset is zero
714 These are skipped because they are used for scoping local symbols in
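The variable-sized layout of function entries can be sketched as a walk over
the raw section.
The example below is an illustration only: it treats the section as an array
of 16-bit words in host byte order, ignores the symbol table entirely, and
uses a hypothetical function name.

```c
#include <stddef.h>
#include <stdint.h>

#define	EX_INFO_KIND(info)	(((info) & 0xf800) >> 11)
#define	EX_INFO_VLEN(info)	((info) & 0x3ff)

/*
 * Count the entries in a function section that carry type information.
 * Returns -1 if an entry claims more words than remain in the section.
 */
long
count_typed_functions(const uint16_t *fp, size_t nwords)
{
	size_t i = 0;
	long typed = 0;

	while (i < nwords) {
		uint16_t info = fp[i++];
		size_t nargs = EX_INFO_VLEN(info);

		/* CTF_K_UNKNOWN with no arguments: nothing else follows. */
		if (EX_INFO_KIND(info) == 0 && nargs == 0)
			continue;

		/* One word for the return type, then one per argument. */
		if (nwords - i < 1 + nargs)
			return (-1);
		i += 1 + nargs;
		typed++;
	}
	return (typed);
}
```

A real consumer would pair each entry with the corresponding function symbol
while iterating the symbol table, rather than counting entries in isolation.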
718 The type section is the heart of the
721 It encodes all of the information about the types themselves.
722 The base of the type information comes in two forms, a short form and a long
723 form, each of which may be followed by a variable number of arguments.
724 The following definitions describe the short and long forms:
726 #define CTF_MAX_SIZE 0xfffe /* max size of a type in bytes */
727 #define CTF_LSIZE_SENT 0xffff /* sentinel for ctt_size */
728 #define CTF_MAX_LSIZE UINT64_MAX
typedef struct ctf_stype {
	uint_t ctt_name;	/* reference to name in string table */
	ushort_t ctt_info;	/* encoded kind, variant length */
	union {
		ushort_t _size;	/* size of entire type in bytes */
		ushort_t _type;	/* reference to another type */
	} _u;
} ctf_stype_t;

typedef struct ctf_type {
	uint_t ctt_name;	/* reference to name in string table */
	ushort_t ctt_info;	/* encoded kind, variant length */
	union {
		ushort_t _size;	/* always CTF_LSIZE_SENT */
		ushort_t _type;	/* do not use */
	} _u;
	uint_t ctt_lsizehi;	/* high 32 bits of type size in bytes */
	uint_t ctt_lsizelo;	/* low 32 bits of type size in bytes */
} ctf_type_t;

#define ctt_size _u._size	/* for fundamental types that have a size */
#define ctt_type _u._type	/* for types that reference another type */
754 Type sizes are stored in
756 The basic small form uses a
758 to store the number of bytes.
759 If the number of bytes in a structure would exceed 0xfffe, then the alternate
763 To indicate that the larger form is being used, the member
768 In general, when going through the type section, consumers use the
770 structure, but pay attention to the value of the member
772 to determine whether they should increment their scan by the size of the
776 Not all kinds of types use
Those which do not will always use the
781 The individual sections for each kind have more information.
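The choice between the short and long forms can be sketched as follows.
The structures are flattened copies of the definitions above using fixed-width
types, and ex_type_size() is a hypothetical helper; it is only meaningful for
kinds that store a size rather than a reference to another type.

```c
#include <stddef.h>
#include <stdint.h>

#define	EX_LSIZE_SENT	0xffff	/* mirrors CTF_LSIZE_SENT */

typedef struct ex_stype {
	uint32_t ctt_name;
	uint16_t ctt_info;
	uint16_t ctt_size;
} ex_stype_t;

typedef struct ex_type {
	uint32_t ctt_name;
	uint16_t ctt_info;
	uint16_t ctt_size;
	uint32_t ctt_lsizehi;
	uint32_t ctt_lsizelo;
} ex_type_t;

/*
 * Return the size of the type in bytes and store in *incp how many bytes the
 * fixed part of the entry occupies. The long members are only read when the
 * sentinel says that they are present.
 */
uint64_t
ex_type_size(const ex_type_t *tp, size_t *incp)
{
	if (tp->ctt_size == EX_LSIZE_SENT) {
		*incp = sizeof (ex_type_t);
		return (((uint64_t)tp->ctt_lsizehi << 32) | tp->ctt_lsizelo);
	}
	*incp = sizeof (ex_stype_t);
	return (tp->ctt_size);
}
```

After advancing by the returned increment, a consumer still has to skip any
variable length data that belongs to the kind before reaching the next type.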
783 Types are written out in order.
784 Therefore the first entry encountered has a type id of 0x1, or 0x8000 if a
788 is encoded as described in the section
789 .Sx String Identifiers .
790 The string that it points to is the name of the type.
If the identifier points to an empty string (one that consists solely of a null
terminator) then the type does not have a name.
This is common with anonymous structures and unions that only have a typedef to
name them, as well as pointers and qualifiers.
798 is encoded as described in the section
The type's kind tells us how to interpret the remaining data in the
802 and any variable length data that may exist.
803 The rest of this section will be broken down into the interpretation of the
805 .Ss Encoding of Integers
806 Integers, which are of type
808 have no variable length arguments.
809 Instead, they are followed by a four byte
811 which describes their encoding.
812 All integers must be encoded with a variable length of zero.
815 member describes the length of the integer in bytes.
816 In general, integer sizes will be rounded up to the closest power of two.
818 The integer encoding contains three different pieces of information:
819 .Bl -bullet -offset indent -compact
821 The encoding of the integer
832 This encoding can be expressed through the following macros:
833 .Bd -literal -offset indent
834 #define CTF_INT_ENCODING(data) (((data) & 0xff000000) >> 24)
835 #define CTF_INT_OFFSET(data) (((data) & 0x00ff0000) >> 16)
836 #define CTF_INT_BITS(data) (((data) & 0x0000ffff))
838 #define CTF_INT_DATA(encoding, offset, bits) \\
839 (((encoding) << 24) | ((offset) << 16) | (bits))
842 The following flags are defined for the encoding at this time:
843 .Bd -literal -offset indent
844 #define CTF_INT_SIGNED 0x01
845 #define CTF_INT_CHAR 0x02
846 #define CTF_INT_BOOL 0x04
847 #define CTF_INT_VARARGS 0x08
850 By default, an integer is considered to be unsigned, unless it has the
855 is set, that indicates that the integer is of a type that stores character
856 data, for example the intrinsic C type
863 is set, that indicates that the integer represents a boolean type.
864 For example, the intrinsic C type
871 indicates that the integer is used as part of a variable number of arguments.
872 This encoding is rather uncommon.
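The integer encoding can be unpacked as shown in the sketch below.
The macros and flags are copied from the definitions above, while the ex_int_t
structure and the decode_int() function are hypothetical helpers; the offset
and bits fields are taken to be the bit offset and the size in bits of the
integer.

```c
#include <stdint.h>

#define	CTF_INT_ENCODING(data)	(((data) & 0xff000000) >> 24)
#define	CTF_INT_OFFSET(data)	(((data) & 0x00ff0000) >> 16)
#define	CTF_INT_BITS(data)	(((data) & 0x0000ffff))

#define	CTF_INT_DATA(encoding, offset, bits) \
	(((encoding) << 24) | ((offset) << 16) | (bits))

#define	CTF_INT_SIGNED	0x01
#define	CTF_INT_CHAR	0x02
#define	CTF_INT_BOOL	0x04
#define	CTF_INT_VARARGS	0x08

typedef struct ex_int {
	int is_signed;
	int is_char;
	int is_bool;
	unsigned offset;	/* offset of the value in bits */
	unsigned bits;		/* size of the value in bits */
} ex_int_t;

/* Unpack the four byte value that follows an integer type. */
ex_int_t
decode_int(uint32_t data)
{
	ex_int_t out;
	uint32_t enc = CTF_INT_ENCODING(data);

	out.is_signed = (enc & CTF_INT_SIGNED) != 0;
	out.is_char = (enc & CTF_INT_CHAR) != 0;
	out.is_bool = (enc & CTF_INT_BOOL) != 0;
	out.offset = CTF_INT_OFFSET(data);
	out.bits = CTF_INT_BITS(data);
	return (out);
}
```

Under these definitions, a signed 32-bit integer would be described by
CTF_INT_DATA(CTF_INT_SIGNED, 0, 32), which is 0x01000020.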
873 .Ss Encoding of Floats
874 Floats, which are of type
876 are similar to their integer counterparts.
877 They have no variable length arguments and are followed by a four byte encoding
878 which describes the kind of float that exists.
881 member is the size, in bytes, of the float.
882 The float encoding has three different pieces of information inside of it:
884 .Bl -bullet -offset indent -compact
886 The specific kind of float that exists
897 This encoding can be expressed through the following macros:
898 .Bd -literal -offset indent
899 #define CTF_FP_ENCODING(data) (((data) & 0xff000000) >> 24)
900 #define CTF_FP_OFFSET(data) (((data) & 0x00ff0000) >> 16)
901 #define CTF_FP_BITS(data) (((data) & 0x0000ffff))
903 #define CTF_FP_DATA(encoding, offset, bits) \\
904 (((encoding) << 24) | ((offset) << 16) | (bits))
Whereas the encoding for integers is a series of flags, the encoding for
floats maps to a specific kind of float.
It is not a flag-based value.
The kinds of floats correspond to both their size and their encoding.
911 This covers all of the basic C intrinsic floating point types.
912 The following are the different kinds of floats represented in the encoding:
913 .Bd -literal -offset indent
914 #define CTF_FP_SINGLE 1 /* IEEE 32-bit float encoding */
915 #define CTF_FP_DOUBLE 2 /* IEEE 64-bit float encoding */
916 #define CTF_FP_CPLX 3 /* Complex encoding */
917 #define CTF_FP_DCPLX 4 /* Double complex encoding */
918 #define CTF_FP_LDCPLX 5 /* Long double complex encoding */
919 #define CTF_FP_LDOUBLE 6 /* Long double encoding */
920 #define CTF_FP_INTRVL 7 /* Interval (2x32-bit) encoding */
921 #define CTF_FP_DINTRVL 8 /* Double interval (2x64-bit) encoding */
922 #define CTF_FP_LDINTRVL 9 /* Long double interval (2x128-bit) encoding */
923 #define CTF_FP_IMAGRY 10 /* Imaginary (32-bit) encoding */
#define CTF_FP_DIMAGRY 11 /* Double imaginary (64-bit) encoding */
925 #define CTF_FP_LDIMAGRY 12 /* Long double imaginary (128-bit) encoding */
927 .Ss Encoding of Arrays
928 Arrays, which are of type
930 have no variable length arguments.
931 They are followed by a structure which describes the number of elements in the
932 array, the type identifier of the elements in the array, and the type identifier
933 of the index of the array.
936 member is set to zero.
937 The structure that follows an array is defined as:
typedef struct ctf_array {
	ushort_t cta_contents;	/* reference to type of array contents */
	ushort_t cta_index;	/* reference to type of array index */
	uint_t cta_nelems;	/* number of elements */
} ctf_array_t;
952 are type identifiers which are encoded as per the section
953 .Sx Type Identifiers .
956 is a simple four byte unsigned count of the number of elements.
957 This count may be zero when encountering C99's flexible array members.
958 .Ss Encoding of Functions
959 Function types, which are of type
961 use the variable length list to be the number of arguments in the function.
962 When the function has a final member which is a varargs, then the argument count
963 is incremented by one to account for the variable argument.
966 member is encoded with the type identifier of the return type of the function.
969 member is not used here.
971 The variable argument list contains the type identifiers for the arguments of
972 the function, if any.
973 Each one is represented by a
975 and encoded according to the
978 If the function's last argument is of type varargs, then it is also written out,
979 but the type identifier is zero.
980 This is included in the count of the function's arguments.
981 .Ss Encoding of Structures and Unions
982 Structures and Unions, which are encoded with
986 respectively, are very similar constructs in C.
The main difference between them is the fact that every member of a structure
follows one another, whereas in a union, all members share the same memory.
989 They are also very similar in terms of their encoding in
991 The variable length argument for structures and unions represents the number of
992 members that they have.
993 The value of the member
is the size of the structure or union.
996 There are two different structures which are used to encode members in the
When the size of a structure or union is greater than or equal to the large
member threshold, 8192, then a different structure is used to encode the members.
All members of a given structure or union are encoded using the same structure.
1001 The structure for members is as follows:
typedef struct ctf_member {
	uint_t ctm_name;	/* reference to name in string table */
	ushort_t ctm_type;	/* reference to type of member */
	ushort_t ctm_offset;	/* offset of this member in bits */
} ctf_member_t;

typedef struct ctf_lmember {
	uint_t ctlm_name;	/* reference to name in string table */
	ushort_t ctlm_type;	/* reference to type of member */
	ushort_t ctlm_pad;	/* padding */
	uint_t ctlm_offsethi;	/* high 32 bits of member offset in bits */
	uint_t ctlm_offsetlo;	/* low 32 bits of member offset in bits */
} ctf_lmember_t;
1022 refer to the name of the member.
1023 The name is encoded as an offset into the string table as described by the
1025 .Sx String Identifiers .
1030 both refer to the type of the member.
1031 They are encoded as per the section
1032 .Sx Type Identifiers .
1034 The last piece of information that is present is the offset which describes the
1035 offset in memory that the member begins at.
1036 For unions, this value will always be zero because the start of unions in memory
1038 For structures, this is the offset in
1040 that the member begins at.
1041 Note that a compiler may lay out a type with padding.
1042 This means that the difference in offset between two consecutive members may be
1043 larger than the size of the member.
1044 When the size of the overall structure is strictly less than 8192 bytes, the
1047 is used and the offset in bits is stored in the member
1049 However, when the size of the structure is greater than or equal to 8192 bytes,
1050 then the number of bits is split into two 32-bit quantities.
1053 represents the upper 32 bits of the offset, while the other member,
1055 represents the lower 32 bits of the offset.
These can be joined together to get a 64-bit sized offset in bits by shifting
.Fa ctlm_offsethi
to the left by thirty-two and then doing a bitwise or of
.Fa ctlm_offsetlo .
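The selection between the two member structures and the joining of the split
offset can be sketched as follows.
The structures are copies of the definitions above with fixed-width types, and
the EX_LSTRUCT_THRESH constant and the three function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define	EX_LSTRUCT_THRESH	8192	/* the large member threshold */

typedef struct ex_member {
	uint32_t ctm_name;
	uint16_t ctm_type;
	uint16_t ctm_offset;
} ex_member_t;

typedef struct ex_lmember {
	uint32_t ctlm_name;
	uint16_t ctlm_type;
	uint16_t ctlm_pad;
	uint32_t ctlm_offsethi;
	uint32_t ctlm_offsetlo;
} ex_lmember_t;

/* Non-zero if a structure or union of this size uses the large form. */
int
uses_large_members(uint64_t size)
{
	return (size >= EX_LSTRUCT_THRESH);
}

/* Number of bytes that each member entry occupies for a given type size. */
size_t
member_entry_size(uint64_t size)
{
	return (uses_large_members(size) ? sizeof (ex_lmember_t) :
	    sizeof (ex_member_t));
}

/* Join the two 32-bit halves into a single 64-bit offset in bits. */
uint64_t
lmember_bit_offset(const ex_lmember_t *mp)
{
	return (((uint64_t)mp->ctlm_offsethi << 32) | mp->ctlm_offsetlo);
}
```

The high half is widened to 64 bits before the shift; shifting a 32-bit value
by thirty-two would otherwise be undefined in C.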
1061 .Ss Encoding of Enumerations
1062 Enumerations, noted by the type
1064 are similar to structures.
1065 Enumerations use the variable list to note the number of values that the
1066 enumeration contains, which we'll term enumerators.
1067 In C, an enumeration is always equivalent to the intrinsic type
1069 thus the value of the member
1071 is always the size of an integer which is determined based on the current model.
1072 For illumos systems, this will always be 4, as an integer is always defined to
1073 be 4 bytes large in both
1077 regardless of the architecture.
1079 The enumerators encoded in an enumeration have the following structure in the
typedef struct ctf_enum {
	uint_t cte_name;	/* reference to name in string table */
	int cte_value;	/* value associated with this name */
} ctf_enum_t;
refers to the name of the enumerator's value; it is encoded according to the
1091 rules in the section
1092 .Sx String Identifiers .
1095 contains the integer value of this enumerator.
1096 .Ss Encoding of Forward References
1097 Forward references, types of kind
1101 file refer to types which may not have a definition at all, only a name.
1104 file is a child, then it may be that the forward is resolved to an
1105 actual type in the parent, otherwise the definition may be in another
1107 container or may not be known at all.
1108 The only member of the
1110 that matters for a forward declaration is the
1112 which points to the name of the forward reference in the string table as
1114 There is no other information recorded for forward references.
1115 .Ss Encoding of Pointers, Typedefs, Volatile, Const, and Restrict
1116 Pointers, typedefs, volatile, const, and restrict are all similar in
1118 They all refer to another type.
1119 In the case of typedefs, they provide an alternate name, while volatile, const,
1120 and restrict change how the type is interpreted in the C programming language.
1126 .Sy CTF_K_VOLATILE ,
1127 .Sy CTF_K_RESTRICT ,
1131 These types have no variable list entries and use the member
1133 to refer to the base type that they modify.
1134 .Ss Encoding of Unknown Types
1137 are used to indicate gaps in the type identifier space.
1138 These entries consume an identifier, but do not define anything.
1139 Nothing should refer to these gap identifiers.
1140 .Ss Dependencies Between Types
C types can be imagined as a directed, possibly cyclic graph.
1142 Structures and unions may refer to each other in a way that creates a cyclic
1144 In cases such as these, the entire type section must be read in and processed.
1145 Consumers must not assume that every type can be laid out in dependency order;
1147 .Ss The String Section
1148 The last section of the
1153 This section encodes all of the strings that appear throughout the other
1155 It is laid out as a series of characters followed by a null terminator.
Generally, all names are written out in ASCII, as most C compilers do not allow
any characters to appear in identifiers outside of a subset of ASCII.
However, any extended character sets should be written out as a series of UTF-8
bytes.
1161 The first entry in the section, at offset zero, is a single null
1162 terminator to reference the empty string.
1163 Following that, each C string should be written out, including the null
1165 Offsets that refer to something in this section should refer to the first byte
1166 which begins a string.
1167 Beyond the first byte in the section being the null terminator, the order of
1168 strings is unimportant.
1169 .Sh Data Encoding and ELF Considerations
1171 data is generally included in ELF objects which specify information to
1172 identify the architecture and endianness of the file.
1175 container inside such an object must match the endianness of the ELF object.
1176 Aside from the question of the endian encoding of data, there should be no other
1177 differences between architectures.
1178 While many of the types in this document refer to non-fixed size C integral
1179 types, they are equivalent in the models
1183 If any other model is being used with
1185 data that has different sizes, then it must not use the model's sizes for
1186 those integral types and instead use the fixed size equivalents based on an
1192 container inside of an ELF object, there are certain conventions that are
1193 expected for the purposes of tooling being able to find the
1196 In particular, a given ELF object should only contain a single
1199 Multiple containers should be merged together into a single one.
1203 file should be included in its own ELF section.
1204 The section's name must be
1206 The type of the section must be
1208 The section should have a link set to the symbol table and its address
1209 alignment must be 4.