========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*. A pattern may
represent a single architectural instruction or a group of such
instructions, depending on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

  (insn & fixedmask) == fixedbits
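
The match condition can be sketched in C as follows. This is only an
illustration: the helper name ``pattern_matches`` and the 32-bit instruction
width are assumptions, not part of the generated decoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when insn satisfies a pattern's match condition. */
static bool pattern_matches(uint32_t insn, uint32_t fixedmask,
                            uint32_t fixedbits)
{
    /* A well-formed pattern never fixes a bit outside its own mask. */
    assert((fixedbits & ~fixedmask) == 0);
    return (insn & fixedmask) == fixedbits;
}
```

Bits outside ``fixedmask`` never influence the result, which is what leaves
them free to carry fields.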

Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator. Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Field syntax::

  field_def     := '%' identifier ( unnamed_field )* ( !function=identifier )?
  unnamed_field := number ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field. If the 's' is
present, the field is considered signed. If multiple ``unnamed_fields`` are
present, they are concatenated, with the first one listed occupying the most
significant bits. In this way one can define disjoint fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.

One may use ``!function`` with zero ``unnamed_fields``. This case is called
a *parameter*, and the named function is only passed the ``DisasContext``
and returns an integral value extracted from there.

A field with no ``unnamed_fields`` and no ``!function`` is in error.

FIXME: the fields of the structure into which this result will be stored
are restricted to ``int``, which means that we cannot expand 64-bit items.

Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+
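
The expressions in the table can be tried out with a small stand-alone
sketch. The ``extract`` and ``sextract`` helpers below are illustrative
stand-ins written for this example (assuming a 32-bit insn), and the
``field_imm9`` and ``field_disp12`` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned field: len bits of insn, with the least-significant bit at pos. */
static uint32_t extract(uint32_t insn, int pos, int len)
{
    assert(pos >= 0 && len > 0 && len <= 32 - pos);
    return (insn >> pos) & (0xffffffffu >> (32 - len));
}

/* Signed field: as above, then sign-extended from bit len - 1. */
static int32_t sextract(uint32_t insn, int pos, int len)
{
    uint32_t v = extract(insn, pos, len);
    uint32_t sign = 1u << (len - 1);
    return (int32_t)((v ^ sign) - sign);
}

/* %imm9 16:6 10:3 -- the first-listed field supplies the high bits. */
static int field_imm9(uint32_t i)
{
    return (int)(extract(i, 16, 6) << 3 | extract(i, 10, 3));
}

/*
 * %disp12 0:s1 1:1 2:10 -- only the leading sub-field is signed.
 * Multiplication stands in for "<< 11" so that a negative value is
 * never left-shifted, which C leaves undefined.
 */
static int field_disp12(uint32_t i)
{
    return sextract(i, 0, 1) * (1 << 11)
           | (int)(extract(i, 1, 1) << 10)
           | (int)extract(i, 2, 10);
}
```

In ``%disp12``, bit 0 of the insn ends up as the sign bit of the 12-bit
result, so an insn with only bit 0 set decodes to the most negative
displacement.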

Argument set syntax::

  args_def    := '&' identifier ( args_elt )+ ( !extern )?
  args_elt    := identifier

Each *args_elt* defines an argument within the argument set.
Each argument set will be rendered as a C structure ``arg_$name``
with each of the fields being one of the member arguments.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments. This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.

Argument set examples::

  &loadstore  reg base offset
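
As a rough sketch, the ``&loadstore`` set could be rendered as the structure
below, with ``int`` members as noted in the FIXME above. The helper
``loadstore_address`` and its register-array parameter are hypothetical; they
only illustrate how one function can serve every pattern that shares the
argument set.

```c
#include <assert.h>

/* Sketch of the structure rendered for "&loadstore reg base offset". */
typedef struct {
    int reg;
    int base;
    int offset;
} arg_loadstore;

/*
 * Hypothetical shared helper: any load or store pattern that uses
 * &loadstore can compute its effective address the same way.
 */
static int loadstore_address(const int *regs, int nregs,
                             const arg_loadstore *a)
{
    assert(a->base >= 0 && a->base < nregs);
    return regs[a->base] + a->offset;
}
```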

Format syntax::

  fmt_def      := '@' identifier ( fmt_elt )+
  fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
  fixedbit_elt := [01.-]+
  field_elt    := identifier ':' 's'? number
  field_ref    := '%' identifier | identifier '=' '%' identifier
  args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care. The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the CPU and will not be specified.

A *field_elt* describes a simple field only given a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.

If any *fixedbit_elt* or *field_elt* appear, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference. This is the only way to
add a complex field to a format. A field may be renamed in the process
via assignment to another identifier. This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set. If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

  @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
  @opi    ...... ra:5 lit:8    1 ....... rc:5
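
Reading the two formats from the most significant bit down, ``ra`` occupies
bits 25..21, ``rb`` bits 20..16, ``lit`` bits 20..13, and ``rc`` bits 4..0.
The sketch below spells out those implied positions. The structure layouts,
the ``bits`` helper, and the function names are assumptions made for this
example, not the generator's actual output.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative argument structures inferred from the fields of each format. */
typedef struct { int ra, rb, rc; } arg_opr;
typedef struct { int ra, lit, rc; } arg_opi;

/* len bits of insn, with the least-significant bit at pos. */
static uint32_t bits(uint32_t insn, int pos, int len)
{
    assert(pos >= 0 && len > 0 && len < 32 && pos + len <= 32);
    return (insn >> pos) & ((1u << len) - 1);
}

/* @opr  ...... ra:5 rb:5 ... 0 ....... rc:5 */
static void extract_opr(arg_opr *a, uint32_t insn)
{
    a->ra = (int)bits(insn, 21, 5);
    a->rb = (int)bits(insn, 16, 5);
    a->rc = (int)bits(insn, 0, 5);
}

/* @opi  ...... ra:5 lit:8 1 ....... rc:5 */
static void extract_opi(arg_opi *a, uint32_t insn)
{
    a->ra  = (int)bits(insn, 21, 5);
    a->lit = (int)bits(insn, 13, 8);
    a->rc  = (int)bits(insn, 0, 5);
}
```

Both formats account for all 32 bits, and they differ in the fixed value of
bit 12, which is what later distinguishes the register and immediate forms.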

Pattern syntax::

  pat_def      := identifier ( pat_elt )+
  pat_elt      := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
  fmt_ref      := '@' identifier
  const_elt    := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value. This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

  addl_r   010000 ..... ..... .... 0000000 ..... @opr
  addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

  trans_addl_r(ctx, &arg_opr, insn)

and::

  trans_addl_i(ctx, &arg_opi, insn)
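
Combining each pattern with its format gives the complete set of fixed bits:
the opcode in bits 31..26, the function code in bits 11..5, and bit 12, which
the format fixes to 0 for ``@opr`` and 1 for ``@opi``. The sketch below
derives the resulting mask and values by hand. The macro names, the
``decode_addl`` function, and its enum result are hypothetical; a real
decoder would call the translator functions instead of returning a tag.

```c
#include <assert.h>
#include <stdint.h>

enum matched { MATCH_NONE, MATCH_ADDL_R, MATCH_ADDL_I };

/* Bits 31..26, bit 12, and bits 11..5 are fixed in both patterns. */
#define ADDL_FIXEDMASK   0xfc001fe0u
/* Opcode 010000 with bit 12 clear (from @opr). */
#define ADDL_R_FIXEDBITS 0x40000000u
/* Opcode 010000 with bit 12 set (from @opi). */
#define ADDL_I_FIXEDBITS 0x40001000u

static enum matched decode_addl(uint32_t insn)
{
    /* Neither pattern may fix a bit outside the shared mask. */
    assert((ADDL_R_FIXEDBITS & ~ADDL_FIXEDMASK) == 0);
    assert((ADDL_I_FIXEDBITS & ~ADDL_FIXEDMASK) == 0);

    switch (insn & ADDL_FIXEDMASK) {
    case ADDL_R_FIXEDBITS:
        return MATCH_ADDL_R;    /* a real decoder would call trans_addl_r */
    case ADDL_I_FIXEDBITS:
        return MATCH_ADDL_I;    /* a real decoder would call trans_addl_i */
    default:
        return MATCH_NONE;
    }
}
```

Because the two patterns differ in a fixed bit, they do not overlap and need
no grouping.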

Pattern group syntax::

  group        := '{' ( pat_def | group )+ '}'

A *group* begins with a lone open-brace, has all subsequent lines
indented two spaces, and ends with a lone close-brace. Groups
may be nested, increasing the required indentation of the lines
within the nested group to two spaces per nesting level.

Unlike ungrouped patterns, grouped patterns are allowed to overlap.
Conflicts are resolved by selecting the patterns in order. If all
of the fixedbits for a pattern match, its translate function will
be called. If the translate function returns false, then subsequent
patterns within the group will be matched.

The following example from PA-RISC shows specialization of the *or*
instruction::

  {
    {
      nop   000010 ----- ----- 0000 001001 0 00000
      copy  000010 00000 r1:5  0000 001001 0 rt:5
    }
    or      000010 rt2:5 r1:5  cf:4 001001 0 rt:5
  }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized. When the *rt* field is zero, the output
is discarded and so the instruction has no effect. When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

  switch (insn & 0xfc000fe0) {
  case 0x08000240:
      /* 000010.. ........ ....0010 010..... */
      if ((insn & 0x0000f000) == 0x00000000) {
          /* 000010.. ........ 00000010 010..... */
          if ((insn & 0x0000001f) == 0x00000000) {
              /* 000010.. ........ 00000010 01000000 */
              extract_decode_Fmt_0(&u.f_decode0, insn);
              if (trans_nop(ctx, &u.f_decode0)) return true;
          }
          if ((insn & 0x03e00000) == 0x00000000) {
              /* 00001000 000..... 00000010 010..... */
              extract_decode_Fmt_1(&u.f_decode1, insn);
              if (trans_copy(ctx, &u.f_decode1)) return true;
          }
      }
      extract_decode_Fmt_2(&u.f_decode2, insn);
      if (trans_or(ctx, &u.f_decode2)) return true;
  }