The following text is a brief overview of those key
principles which are useful to know when generating code
with SLJIT. Further details can be found in sljitLir.h.

----------------------------------------------------------------
----------------------------------------------------------------
SLJIT is a platform-independent assembler which
  - provides access to common CPU features
  - can be easily ported to widespread CPU
    architectures (e.g. x86, ARM, POWER, MIPS, SPARC)

The key challenge of this project is finding a common
subset of CPU features which
  - covers traditional assembly level programming
  - can be translated to machine code efficiently

This aim is achieved by selecting those instructions / CPU
features which are either available on all platforms or
can be simulated with a low performance overhead.

For example, some SLJIT instructions support base register
pre-update when the [base+offs] memory addressing mode is used.
Although this feature is only available on ARM and POWER
CPUs, the simulation overhead is low on other CPUs.
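
The low simulation cost can be illustrated with a plain C model. This is only a sketch and not SLJIT code: the function name load_with_pre_update is hypothetical, and the offset is counted in array elements rather than bytes to keep the example short. It assumes that a CPU without the feature performs the access as one add followed by an ordinary load.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a [base+offs] load with base register
   pre-update: the base is advanced first, then the load uses the
   updated base. A CPU without this addressing mode can do the same
   with one extra add instruction. */
static int32_t load_with_pre_update(const int32_t **base, intptr_t offs)
{
    *base += offs;   /* step 1: base = base + offs (in elements here) */
    return **base;   /* step 2: load from the updated base */
}

/* Small walk over an array using the model above. */
void pre_update_demo(void)
{
    const int32_t data[3] = { 1, 2, 3 };
    const int32_t *p = data;

    assert(load_with_pre_update(&p, 1) == 2);
    assert(p == data + 1);   /* the base register keeps the new address */
}
```

The point of the model is that the simulated form costs a single additional instruction per access, which is why the feature can be offered on every target.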

----------------------------------------------------------------
The generic CPU model of SLJIT
----------------------------------------------------------------

  - integer registers, which can store either an
    int32_t (4 byte) or intptr_t (4 or 8 byte) value
  - floating point registers, which can store either a
    single (4 byte) or double (8 byte) precision value
  - boolean status flags

*** Integer registers:

The most important rule is: when a source operand of
an instruction is a register, the data type of the
register must match the data type expected by the
instruction.

For example, the following code snippet
is a valid instruction sequence:

    sljit_emit_op1(compiler, SLJIT_IMOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An int32_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_INEG,
        SLJIT_R0, 0, SLJIT_R0, 0);
    // the int32_t value in SLJIT_R0 is negated
    // and the type of the result is still int32_t

The next code snippet is not allowed:

    sljit_emit_op1(compiler, SLJIT_MOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An intptr_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_INEG,
        SLJIT_R0, 0, SLJIT_R0, 0);
    // The result of the SLJIT_INEG instruction
    // is undefined. Even a crash is possible.

However, it is always allowed to overwrite a
register regardless of its previous value:

    sljit_emit_op1(compiler, SLJIT_MOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R1), 0);
    // An intptr_t value is loaded into SLJIT_R0
    sljit_emit_op1(compiler, SLJIT_IMOV,
        SLJIT_R0, 0, SLJIT_MEM1(SLJIT_R2), 0);
    // From now on SLJIT_R0 contains an int32_t
    // value. The previous value is discarded.
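
The register typing rule from the three snippets above can be summarized as a tiny bookkeeping model in plain C. This is a sketch for illustration only. SLJIT itself is not shown here, and every name in the block (reg_type, reg_state, reg_write, reg_read_is_valid) is hypothetical.

```c
#include <assert.h>

/* Hypothetical bookkeeping, not part of SLJIT: the type of the value
   that currently lives in a register. */
enum reg_type { REG_UNSET, REG_INT32, REG_INTPTR };

struct reg_state { enum reg_type type; };

/* Writing a register is always allowed and replaces the old type. */
static void reg_write(struct reg_state *reg, enum reg_type new_type)
{
    reg->type = new_type;
}

/* Reading a register as a source operand is only valid when the
   stored type matches the type the instruction expects. */
static int reg_read_is_valid(const struct reg_state *reg,
                             enum reg_type expected)
{
    return reg->type == expected;
}

/* Replays the three snippets: valid, not allowed, overwrite. */
void register_rule_demo(void)
{
    struct reg_state r0 = { REG_UNSET };

    reg_write(&r0, REG_INT32);                     /* 32 bit load   */
    assert(reg_read_is_valid(&r0, REG_INT32));     /* 32 bit negate */

    reg_write(&r0, REG_INTPTR);                    /* word load     */
    assert(!reg_read_is_valid(&r0, REG_INT32));    /* not allowed   */

    reg_write(&r0, REG_INT32);                     /* overwrite     */
    assert(reg_read_is_valid(&r0, REG_INT32));     /* valid again   */
}
```

A code generator built on SLJIT could keep similar per-register state in debug builds to catch type mismatches early, although whether that is worthwhile depends on the generator.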

Type conversion instructions are provided to convert an
int32_t value to an intptr_t value and vice versa. On
certain architectures these conversions are nops (no
instructions are emitted).
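
In plain C terms the two conversions look roughly as follows. This is a sketch of the arithmetic only, not of the SLJIT instructions that perform it, and the helper names widen_i32 and narrow_to_i32 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* int32_t -> intptr_t: sign extension. When intptr_t is 4 bytes wide
   this is an identity operation, so no instruction is needed. */
static intptr_t widen_i32(int32_t value)
{
    return (intptr_t)value;
}

/* intptr_t -> int32_t: keep the low 32 bits. The bits are copied with
   memcpy so the result does not rely on implementation-defined
   signed narrowing. */
static int32_t narrow_to_i32(intptr_t value)
{
    uint32_t low = (uint32_t)(uintptr_t)value;
    int32_t result;

    memcpy(&result, &low, sizeof result);
    return result;
}

/* Widening followed by narrowing returns the original value. */
void conversion_demo(void)
{
    assert(widen_i32(-5) == -5);
    assert(narrow_to_i32(widen_i32(INT32_MIN)) == INT32_MIN);
}
```

On a target where sizeof(intptr_t) is 4, both helpers reduce to plain copies, which matches the remark that the conversions can be nops.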

Register arguments of SLJIT_MEM1 / SLJIT_MEM2 addressing
modes must contain intptr_t data.

Signed / unsigned values:

Most operations are executed in the same way regardless
of whether the value is signed or unsigned. These operations
have only one instruction form (e.g. SLJIT_ADD / SLJIT_MUL).
Instructions where the result depends on the sign have
two forms (e.g. integer division, long multiply).
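
The difference can be checked in plain C with 32 bit values. The sketch below does not use SLJIT, and the helper names add_bits, div_unsigned and div_signed are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Addition: the same bit pattern results whether the operands are
   viewed as signed or unsigned, so one instruction form is enough. */
static uint32_t add_bits(uint32_t a, uint32_t b)
{
    return a + b;   /* the result wraps modulo 2^32 */
}

/* Division: the result depends on how the bits are interpreted,
   so separate signed and unsigned forms are needed. */
static uint32_t div_unsigned(uint32_t a, uint32_t b) { return a / b; }
static int32_t  div_signed(int32_t a, int32_t b)     { return a / b; }

void sign_demo(void)
{
    /* 0xFFFFFFFF is -1 when viewed as signed: -1 + 2 == 1 */
    assert(add_bits(0xFFFFFFFFu, 2u) == 1u);

    /* 0xFFFFFFFE is -2 when viewed as signed */
    assert(div_unsigned(0xFFFFFFFEu, 2u) == 0x7FFFFFFFu);
    assert(div_signed(-2, 2) == -1);
}
```

The addition produces one bit pattern that is correct under both interpretations, while the two divisions of the same input bits give different results.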

*** Floating point registers:

Floating point registers can either contain a single
or double precision value. Similar to integer registers,
the data type of the value stored in a source register
must match the data type expected by the instruction.
Otherwise the result is undefined (even a crash is possible).

Similar to standard C, floating point computation
results are rounded toward zero.

*** Boolean status flags:

Conditional branches usually depend on the value
of CPU status flags. These status flags are boolean
values and can be set by certain instructions.

To achieve maximum efficiency and portability, the
following rules were introduced:
  - Most instructions can freely modify these status
    flags unless SLJIT_KEEP_FLAGS is passed.
  - The SLJIT_KEEP_FLAGS option may have a performance
    overhead, so it should only be used when necessary.
  - The SLJIT_SET_E, SLJIT_SET_U, etc. options can
    force an instruction to correctly set the
    specified status flags. However, all other
    status flags are undefined. This rule must
    always be kept in mind!
  - Status flags cannot be controlled directly
    (there are no set/clear/invert operations).

The last two rules allow efficient mapping of status flags.
For example, the arithmetic and multiply overflow flags are
mapped to the same overflow flag bit on x86. This is allowed,
since no instruction can set both of these flags. When
either of them is set by an instruction, the other can
have any value (this satisfies the "all other flags are
undefined" rule). Therefore mapping two SLJIT flags to the
same CPU flag is possible. Even though SLJIT supports
a dozen status flags, they can be efficiently mapped
to CPUs with only 4 status flags (e.g. ARM or SPARC).
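
The flag sharing argument can be sketched as a small lookup table in plain C. This is a toy model under stated assumptions, not the mapping SLJIT actually uses: the abstract flag names, the CPU flag bits and the function read_flag are all hypothetical.

```c
#include <assert.h>

/* Hypothetical abstract flags (NOT the real SLJIT constants). */
enum abstract_flag {
    FLAG_ZERO,
    FLAG_ADD_OVERFLOW,
    FLAG_MUL_OVERFLOW,
    FLAG_COUNT
};

/* Hypothetical hardware flag bits of a target CPU. */
enum cpu_flag { CPU_Z = 1 << 0, CPU_O = 1 << 1 };

/* Both overflow flags share the single CPU overflow bit. */
static const int flag_to_cpu_bit[FLAG_COUNT] = {
    CPU_Z,   /* FLAG_ZERO */
    CPU_O,   /* FLAG_ADD_OVERFLOW */
    CPU_O    /* FLAG_MUL_OVERFLOW */
};

/* Reads the abstract flag which the most recent instruction was asked
   to set. All other abstract flags are undefined at that point, so a
   correct program never reads them and the sharing stays invisible. */
static int read_flag(int cpu_flags, enum abstract_flag requested)
{
    return (cpu_flags & flag_to_cpu_bit[requested]) != 0;
}

void flag_mapping_demo(void)
{
    /* A multiply that overflowed sets the shared CPU overflow bit. */
    assert(read_flag(CPU_O, FLAG_MUL_OVERFLOW) == 1);
    /* An add that did not overflow leaves the bit clear. */
    assert(read_flag(0, FLAG_ADD_OVERFLOW) == 0);
}
```

Because only the requested flag may be read after an instruction, two abstract flags that are never requested together can safely occupy the same hardware bit.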

----------------------------------------------------------------
----------------------------------------------------------------

We noticed that introducing complex instructions for common
tasks can improve performance. For example, compare and
branch instruction sequences can be optimized if certain
conditions apply, but these conditions depend on the target
CPU. SLJIT can do these optimizations, but it needs to
understand the "purpose" of the generated code. Static
instruction analysis has a large performance overhead,
however, so we chose another approach: we introduced
complex instruction forms for certain non-atomic tasks.
SLJIT can optimize these "instructions" more efficiently
since the "purpose" is known to the compiler. These complex
instruction forms can often be assembled from other SLJIT
instructions, but we recommend using them since the
compiler can optimize them on certain CPUs.
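
The benefit of a complex compare and branch form can be illustrated with a toy emitter in plain C. This is a sketch under invented assumptions: the target description, the fused "compare with zero and branch" instruction and both emit functions are hypothetical and do not describe how SLJIT selects instructions.

```c
#include <assert.h>

/* Hypothetical target description: some CPUs have a single
   "compare with zero and branch" instruction, others do not. */
struct target { int has_fused_cmp_zero_branch; };

/* Two separate simple instructions: compare, then conditional branch.
   The emitter sees them one at a time and cannot fuse them without
   analyzing the instruction stream. */
static int emit_separate_cmp_and_branch(const struct target *t, long rhs)
{
    (void)t;
    (void)rhs;
    return 2;   /* number of machine instructions emitted */
}

/* One complex instruction: the purpose (compare AND branch) is known,
   so the fused form can be picked when the target offers one. */
static int emit_complex_cmp_branch(const struct target *t, long rhs)
{
    if (rhs == 0 && t->has_fused_cmp_zero_branch)
        return 1;
    return 2;
}

void complex_form_demo(void)
{
    struct target fused = { 1 };

    assert(emit_separate_cmp_and_branch(&fused, 0) == 2);
    assert(emit_complex_cmp_branch(&fused, 0) == 1);
}
```

On a target without the fused instruction both paths emit the same code, so using the complex form costs nothing there and helps wherever the optimization applies.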

----------------------------------------------------------------
----------------------------------------------------------------

SLJIT is often used for generating function bodies which are
called from C. SLJIT provides two complex instructions for
generating function entry and return: sljit_emit_enter and
sljit_emit_return. sljit_emit_enter also initializes the
"compiling context", which specifies the current register
mapping, local space size, and similar settings.
sljit_set_context can also set this context without emitting
any machine code.

This context is important since it affects the compiler, so
the first instruction after a compiler is created must be
either sljit_emit_enter or sljit_set_context. The context can
be changed by calling sljit_emit_enter or sljit_set_context.

----------------------------------------------------------------
----------------------------------------------------------------

Instead of using a separate library, the whole SLJIT
compiler infrastructure can be directly included:

    #define SLJIT_CONFIG_STATIC 1
    #include "sljitLir.c"

This approach is useful for single file compilers.

  - Everything provided by SLJIT is available
    (no need to include anything else).
  - Configuring SLJIT is easy
    (e.g. redefining SLJIT_MALLOC / SLJIT_FREE).
  - The SLJIT compiler API is hidden from the
    world, which improves security.
  - The C compiler can optimize the SLJIT code
    generator (e.g. removing unused functions).

----------------------------------------------------------------
----------------------------------------------------------------

sljitConfig.h contains the defines which control
the compiler. The beginning of sljitConfigInternal.h
lists architecture specific types and macros provided
by SLJIT. Some of these macros:

SLJIT_DEBUG : enabled by default
  Enables assertions. Should be disabled in release mode.

SLJIT_VERBOSE : enabled by default
  When this macro is enabled, the sljit_compiler_verbose
  function can be used to dump SLJIT instructions.
  Otherwise this function is not available. Should be
  disabled in release mode.

SLJIT_SINGLE_THREADED : disabled by default
  Single threaded programs can define this flag which
  eliminates the pthread dependency.

sljit_sw, sljit_uw, etc. :
  It is recommended to use these types instead of long,
  intptr_t, etc. Improves readability / portability of