cosmetix
[k8flk.git] / fth / flkasm.fs
blob0dcc23e0b3e4bacdb5790a72a875e1b52f170f42
1 \ FLK simple postfix assembler
3 \ Copyright (C) 1998 Lars Krueger
5 \ This file is part of FLK.
7 \ FLK is free software; you can redistribute it and/or
8 \ modify it under the terms of the GNU General Public License
9 \ as published by the Free Software Foundation; either version 2
10 \ of the License, or (at your option) any later version.
12 \ This program is distributed in the hope that it will be useful,
13 \ but WITHOUT ANY WARRANTY; without even the implied warranty of
14 \ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 \ GNU General Public License for more details.
17 \ You should have received a copy of the GNU General Public License
18 \ along with this program; if not, write to the Free Software
19 \ Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
21 \ $Id: flkasm.fs,v 1.14 1998/09/20 09:34:14 root Exp $
22 \ $Log: flkasm.fs,v $
23 \ Revision 1.14 1998/09/20 09:34:14 root
24 \ changed ebp offset to byte (saving 3 bytes per stack flush)
26 \ Revision 1.13 1998/09/13 15:42:04 root
27 \ introduced separate control flow stack
29 \ Revision 1.12 1998/07/16 19:31:37 root
30 \ added conditional near jumps
32 \ Revision 1.11 1998/07/13 18:08:54 root
33 \ added [esp]
35 \ Revision 1.10 1998/07/03 20:57:50 root
36 \ ?+relocate added
38 \ Revision 1.9 1998/07/03 09:09:28 root
39 \ support for level 2 optimizer
41 \ Revision 1.8 1998/06/08 22:14:51 root
42 \ added SIB-addressing
44 \ Revision 1.7 1998/06/03 16:31:37 root
45 \ added int,
47 \ Revision 1.6 1998/05/01 18:11:25 root
48 \ GNU license text added
49 \ comments checked
51 \ Revision 1.5 1998/04/30 09:42:25 root
52 \ Comments added.
54 \ Revision 1.4 1998/04/24 16:47:39 root
55 \ added some instructions (fpu)
56 \ corrected allocator-store
58 \ Revision 1.3 1998/04/16 18:43:49 root
59 \ flow control improved
61 \ Revision 1.2 1998/04/15 18:15:30 root
62 \ float support
64 \ Revision 1.1 1998/04/07 20:10:33 root
65 \ Initial revision
68 \ This file implements a classical postfix assembler for Intel 386+ and based
69 \ on that a register allocator for the optimizer. All operations are
70 \ "src dest op," or "dest op," . A postfix assembler is controlled by a few
71 \ variables that contain information about the type of operand, its size,
72 \ register number and offset. One set of variables is required per possible
73 \ operand i.e. a CPU supports commands like "add r0,r1,r17" meaning "add the
74 \ contents of r1 and r17 and store the result in r0", three sets of variables
75 \ are used. Apart from some stranger multiplication operations the Intel 386
76 \ never uses more than 2 operands, further referred to as source and
77 \ destination.
79 \ At the begin of the operation and after each stored operation a pointer has
80 \ to be re-set that tells the system whether the source or the destination set
81 \ of variables is meant with the given operand.
83 \ For a detailed description of the 386 and above see Intels manuals.
85 \ The next few constants are here for readability.
87 \ The size of the operand.
88 0 CONSTANT SZ-8 \ al through dh
89 1 CONSTANT SZ-32 \ eax through esi
90 2 CONSTANT SZ-UNKNOWN \ for memory references
92 \ The type of the operand.
93 0 CONSTANT RT-REG \ register
94 1 CONSTANT RT-INDEX \ [register]
95 2 CONSTANT RT-IMMED \ literal
96 3 CONSTANT RT-ABS \ [literal]
98 \ The scale of a sib-byte
99 0 CONSTANT SC-1
100 1 CONSTANT SC-2
101 2 CONSTANT SC-4
102 3 CONSTANT SC-8
104 \ Some error messages are nessesary too.
105 \ When this messages occurs something really went wrong.
106 : (internal-error) ( -- )
107 CR ." This is an internal assembler error." CR BYE ;
108 \ This message is printed whenever an invalid combination of operands is
109 \ found, i.e. after 0 [ebp] 0 [ecx] mov, .
110 : (unknown-combination) ( -- )
111 CR ." Illegal or unimplemented combination of operands." CR ABORT ;
112 \ If called with a true flag in TOS (top of stack) the register cache is full.
113 : TooManyRegs IF ." Too many registers requested." ABORT THEN ;
114 \ Called with TRUE when the sizes of the operands are different.
115 : (sz-mismatch) IF ." Mismatching sizes of operands." ABORT THEN ;
117 \ Called to complain about unknown size.
118 : (unknown-size) ( size -- )
119 SZ-UNKNOWN = IF ." Unknown operand size." ABORT THEN ;
121 \ Called to complain about wrong size.
122 : (wrong-size) ( wrong? -- )
123 IF ." Wrong operand size." ABORT THEN ;
126 \ These words check a type value for the given property.
127 : isreg? ( type -- flag )
128 RT-REG = ;
129 : ismem? ( type -- flag )
130 DUP RT-INDEX = \ type flag2
131 SWAP RT-ABS = OR ;
132 : isr/m? ( type -- flag )
133 DUP isreg? \ type flag1
134 SWAP ismem? OR ;
135 : isimm? ( type -- flag )
136 RT-IMMED = ;
138 \ The sets of variables containing the details of the operands.
139 VARIABLE (ts) VARIABLE (td) \ type
140 VARIABLE (rs) VARIABLE (rd) \ register number
141 VARIABLE (ss) VARIABLE (sd) \ size
142 VARIABLE (os) VARIABLE (od) \ offset/immediate ...
144 VARIABLE (sb) VARIABLE (db) \ sib base
145 VARIABLE (sc) VARIABLE (dc) \ sib scale
146 VARIABLE (si) VARIABLE (di) \ sib index
148 \ Word operations in the 386+ are distinguished from byte operations by
149 \ setting Bit 0 in the opcode. The following VALUE contains this Bit 0
150 0 VALUE (wrd)
152 \ To distinguish between the "normal" (non-SIB) addressing and the
153 \ SIB-addressing mode the following word is used. Setting if to TRUE makes the
154 \ mod/rm byte compiler switch to SIB-mode. It is reset by (asm-reset).
155 FALSE VALUE (sib)
157 \ To generate byte-offsets on demand the next VALUE is used. Does it contain
158 \ TRUE, byte offsets are generated.
159 FALSE VALUE (byte-offs)
161 \ Prefix to switch to byte offsets.
162 : BOFFS TRUE TO (byte-offs) ;
164 \ This word simply sets the word/byte bit in the opcode depending on the state
165 \ of (wrd).
166 : (w+) ( opcode -- opcode )
167 (wrd) + ;
169 \ This VALUE is the pointer telling whether source or destination VARIABLEs
170 \ are meant.
171 0 VALUE (#operands)
173 \ Since no syntax checking is performed (How? Why?) this check gives a minimum
174 \ security that the VARIABLEs contain valid numbers.
175 : (#operands?) ( n -- )
176 2 > IF ." Too many operands." ABORT THEN ;
178 \ Before giving the first operand to an operation the source/destination
179 \ pointer must be set to its initial state. This can be done by the user who
180 \ will forget that now and then or by the system that never forgets this.
181 \ Therefore this word must be executed before assembling the first instruction
182 \ and after the assembly of each instruction.
183 : (0operands) ( -- )
184 0 TO (#operands) ;
186 \ To advance the pointer from source to destination call this word. A check on
187 \ valid number of operands is placed here because this word is called by all
188 \ words that advance the pointer.
189 : (+operand) ( -- )
190 (#operands) DUP (#operands?)
191 1+ TO (#operands) ;
193 \ The next word delivers the address of a source or destination VARIABLE
194 \ depending on the flag given. TRUE means the destination set, FALSE the
195 \ source.
196 : {s/d} ( flag a-src a-dst -- addr )
197 ROT IF
199 ELSE
200 DROP
201 THEN ;
203 \ For safety another check of valid operand number is performed before the
204 \ word decideds between source and destination depending on the state of the
205 \ source/destination pointer.
206 : (s/d) ( addr-source addr-dest -- addr )
207 (#operands) DUP (#operands?)
208 0<> -ROT {s/d} ;
210 \ These words return the address of the variable depending on the state of the
211 \ source/destination pointer.
212 : (type) (ts) (td) (s/d) ;
213 : (size) (ss) (sd) (s/d) ;
214 : (regnum) (rs) (rd) (s/d) ;
215 : (offs) (os) (od) (s/d) ;
217 : (sib-scale) (sc) (dc) (s/d) ;
218 : (sib-index) (si) (di) (s/d) ;
219 : (sib-base) (sb) (db) (s/d) ;
221 \ These words return the address of the variable depending on the flag given.
222 \ FALSE means source, TRUE destination. All words have the stack effect
223 \ ( flag -- addr ) .
224 : {type} (ts) (td) {s/d} ;
225 : {size} (ss) (sd) {s/d} ;
226 : {regnum} (rs) (rd) {s/d} ;
227 : {offs} (os) (od) {s/d} ;
229 : {sib-scale} (sc) (dc) {s/d} ;
230 : {sib-index} (si) (di) {s/d} ;
231 : {sib-base} (sb) (db) {s/d} ;
233 \ Since every instructions requires a different number of operands these
234 \ words perform the check.
235 : need0op ( addr u -- )
236 (#operands) 0 <> IF
237 ." Operation " TYPE ." needs no operands." CR BYE
238 ELSE 2DROP THEN ;
240 : need1op ( addr u -- )
241 (#operands) 1 <> IF
242 ." Operation " TYPE ." needs one operand." CR BYE
243 ELSE 2DROP THEN ;
245 : need2op ( addr u -- )
246 (#operands) 2 <> IF
247 ." Operation " TYPE ." needs two operands." CR BYE
248 ELSE 2DROP THEN ;
250 \ When dealing with relocation tables one needs to know whether a value has to
251 \ be relocated or not. This VALUE does the job.
252 FALSE VALUE (relocate?)
254 \ In some cases the user of the assembler needs to override the default of not
255 \ relocating, i.e. when loading registers with a value that is an address, so
256 \ the next word turns relocation on for the the next call of ar?, .
257 : +relocate ( -- ) TRUE TO (relocate?) ;
259 \ Relocate the next compiled constant depending on the flag.
260 : ?+relocate ( flag -- ) IF +relocate THEN ;
262 \ This word chooses the right version of a the comma-operator depending on the
263 \ relocation flag, which is reset after use.
264 : ar?, ( val -- )
265 (relocate?) IF asm-r, ELSE asm-, THEN
266 FALSE TO (relocate?) ;
268 \ Access to the source and destination operand variables is quite often used,
269 \ so make the job easier and define a few shortcuts.
270 : ts@ (ts) @ ;
271 : td@ (td) @ ;
272 : ss@ (ss) @ ;
273 : sd@ (sd) @ ;
274 : rs@ (rs) @ ;
275 : rd@ (rd) @ ;
276 : os@ (os) @ ;
277 : od@ (od) @ ;
279 \ Whenever the current operand should be a register, this word is called. It
280 \ is used both by the words for using a specific register and the meta
281 \ registers. Remember to give the size, because the number of the register
282 \ alone doesn't tell which register is meant.
283 : (#reg) ( num size -- )
284 RT-REG (type) !
285 (size) !
286 (regnum) !
287 (+operand) ;
289 \ Refer to an indexed address. The base (or offset) is given by the user, the
290 \ number comes either from the words below or from the allocator.
291 : (#[reg]) ( offset num -- )
292 RT-INDEX (type) !
293 (regnum) !
294 (offs) ! (+operand) ;
296 \ Refer to an indexed address. Base, index, scale and offset are given by the
297 \ user.
298 : (#[sib]) ( offs scale index base -- )
299 RT-INDEX (type) !
300 TRUE TO (sib)
301 (sib-base) !
302 (sib-index) !
303 (sib-scale) !
304 (offs) ! (+operand) ;
306 : [esp] ( offs -- )
307 SC-1 4 4 (#[sib]) ;
309 \ Refer to an absolute address. It is nessesary to relocate this address.
310 : #[] ( addr -- )
311 RT-ABS (type) !
312 (offs) ! (+operand) +relocate ;
314 \ The value in TOS is an immediate value. Even though it is allowed only as a
315 \ source, this is not checked here. If the value is an address, +relocate must
316 \ be called before calling the operation assembling word.
317 : ## ( val -- )
318 RT-IMMED (type) !
319 (offs) ! (+operand) ;
321 : r## +relocate ## ;
323 \ Mark the current operand as a double-word (32 bit) or byte (8 bit).
324 \ Nessesary before xx ## [eax] mov,
325 : DWORD SZ-32 (size) ! ;
326 : BYTE SZ-8 (size) ! ;
328 \ Symbolic names for the registers are easier to remember, so provide words
329 \ for both 32 bit and 8 bit registers. Because of the register bl and the
330 \ constant BL collision, the 8 bit registers begin with "reg-" .
331 : eax 0 SZ-32 (#reg) ;
332 : ecx 1 SZ-32 (#reg) ;
333 : edx 2 SZ-32 (#reg) ;
334 : ebx 3 SZ-32 (#reg) ;
335 : esp 4 SZ-32 (#reg) ;
336 : ebp 5 SZ-32 (#reg) ;
337 : esi 6 SZ-32 (#reg) ;
338 : edi 7 SZ-32 (#reg) ;
340 : reg-al 0 SZ-8 (#reg) ;
341 : reg-cl 1 SZ-8 (#reg) ;
342 : reg-dl 2 SZ-8 (#reg) ;
343 : reg-bl 3 SZ-8 (#reg) ;
344 : reg-ah 4 SZ-8 (#reg) ;
345 : reg-ch 5 SZ-8 (#reg) ;
346 : reg-dh 6 SZ-8 (#reg) ;
347 : reg-bh 7 SZ-8 (#reg) ;
349 : [eax] 0 (#[reg]) ;
350 : [ecx] 1 (#[reg]) ;
351 : [edx] 2 (#[reg]) ;
352 : [ebx] 3 (#[reg]) ;
353 : [ebp] 5 (#[reg]) ;
354 : [esi] 6 (#[reg]) ;
355 : [edi] 7 (#[reg]) ;
357 \ This word performs three task. (i) it adjustes the sizes of operands, if one
358 \ size is unknown, (ii) it stops the execution if the sizes of both operands
359 \ and (iii) sets the word operation flag.
360 : (check-sizes) ( -- )
361 ss@ SZ-UNKNOWN = IF sd@ (ss) ! ELSE
362 sd@ SZ-UNKNOWN = IF ss@ (sd) ! THEN THEN
363 ss@ SZ-UNKNOWN = sd@ SZ-UNKNOWN = AND (sz-mismatch)
364 ss@ sd@ <> (sz-mismatch)
365 ss@ SZ-32 = IF 1 ELSE 0 THEN
366 TO (wrd) ;
368 \ Writing an offset or not depends on the type of the memory operand. mop is
369 \ TRUE when the memory operand is the destination operand, rop when the
370 \ register operand is.
371 : (offs/rm), ( mop rop -- )
372 OVER DUP {offs} @ \ mop rop mop offs
373 SWAP {regnum} @ 5 <> SWAP 0=
374 AND IF ( [reg] ) \ mop rop
375 {regnum} @ 8 * SWAP
376 {regnum} @ + asm-c,
377 ELSE ( n [reg] ) \ mop rop
378 OVER SWAP \ mop mop rop
379 {regnum} @ 8 * SWAP
380 {regnum} @ + SWAP \ hmod/rm mop
381 {offs} @ \ hmod/rm offs
382 SWAP
383 (byte-offs) IF
384 64 + asm-c, asm-c,
385 ELSE
386 128 + asm-c, ar?,
387 THEN
388 THEN ;
390 \ This word compiles a mod/rm byte from the settings of operand types,
391 \ register names etc. if no SIB-byte is required.
392 : (mod/rm-no-sib) ( mop rop -- )
393 OVER \ mop rop mop
394 {type} @ \ mop rop mtype
395 RT-ABS OVER = IF DROP \ mop rop
396 {regnum} @ 8 * 5 + asm-c, \ mop
397 {offs} @ ar?, \
398 ELSE \ mop rop mop
399 RT-REG = IF \ mop rop
400 192 SWAP {regnum} @ 8 * + \ mop mod/rm
401 SWAP {regnum} @ + asm-c, \
402 ELSE ( index ) \ mop rop
403 (offs/rm), \
404 THEN
405 THEN ;
407 \ This word compiles a mod/rm byte from the settings of operand types,
408 \ register names etc. if a SIB-byte is required.
409 : (mod/rm-sib) ( mop rop -- )
410 {regnum} @ 8 * OVER {offs} @
411 IF \ mop opcocde
412 128 +
413 THEN
414 4 + asm-C, ( mod/rm ) \ mop
415 DUP 2DUP \ mop mop mop mop
416 {sib-scale} @ 6 LSHIFT \ mop mop mop sib
417 SWAP {sib-index} @ 3 LSHIFT + \ mop mop sib
418 SWAP {sib-base} @ + \ mop sib
419 asm-C,
420 {offs} @ ?DUP IF ar?, THEN ;
422 \ This word is the main work horse of the assembler. It decides whether or not
423 \ to use sib addressing and calls the special compiler words for these cases.
424 : ((mod/rm)), ( mop rop -- )
425 (sib) IF
426 (mod/rm-sib)
427 ELSE
428 (mod/rm-no-sib)
429 THEN ;
431 \ The 386 can handle combinations with at least one memory operand. The
432 \ decision whether this memory operand is the source or the destination of an
433 \ operation is done by the opcode. This word is called with the value TRUE
434 \ when the memory operand is the source operand after the decision
435 \ produced this flag and compiled the opcode.
436 : (mod/rm), ( source-is-rm -- )
437 DUP INVERT SWAP ((mod/rm)), ;
439 \ In the reset-state, the sizes of both operands are unknown and no operands
440 \ have been accepted yet.
441 : (asm-reset) ( -- )
442 (0operands)
443 FALSE TO (sib)
444 SZ-UNKNOWN (ss) !
445 SZ-UNKNOWN (sd) !
446 FALSE TO (byte-offs) ;
448 \ All tools for creating assembler operation words are ready so we can start
449 \ with the actual work.
451 \ The mov operation is a good example how such operation words are written. At
452 \ first the checks for valid number of operands and valid sizes are performed.
453 \ Then all supported register/memory combinations are compared with the
454 \ parameters and the right version is assembled then. If no combination can be
455 \ found, complain about it and leave. After successful assembly reset the
456 \ assembler.
457 : mov, ( -- )
458 S" mov," need2op
459 (check-sizes)
460 ts@ isreg? td@ isr/m? AND IF ( mov r/m, r )
461 $$ 88 (w+) asm-c, FALSE (mod/rm),
462 ELSE
463 ts@ isr/m? td@ isreg? AND IF ( mov r, r/m )
464 $$ 8A (w+) asm-c, TRUE (mod/rm),
465 ELSE
466 td@ isreg? ts@ isimm? AND IF
467 (wrd) 0<> IF $$ B8 ELSE $$ B0 THEN
468 rd@ + asm-c, os@
469 (wrd) 0<> IF ar?, ELSE asm-c, THEN
470 ELSE
471 ts@ isimm? td@ isr/m? AND IF
472 $$ C6 (w+) asm-c,
473 td@ CASE
474 RT-REG OF FALSE $$ C0 rd@ + ENDOF
475 RT-ABS OF TRUE 5 ENDOF
476 RT-INDEX OF TRUE $$ 80 rd@ + ENDOF
477 ." Can't address by ##. Use #[]. " ABORT
478 ENDCASE
479 asm-c, IF od@ asm-, THEN
480 os@ ar?,
481 ELSE
482 (unknown-combination)
483 THEN THEN THEN THEN
484 (asm-reset) ;
486 \ Compile a jmp operation
487 : jmp, ( -- )
488 S" jmp," need1op
489 ts@ RT-IMMED = IF ( jmp 42 )
490 os@ asm-here 5 + -
491 $$ E9 asm-c, asm-,
492 ELSE
493 ts@ isr/m? ss@ SZ-8 <> AND IF
494 ss@ SZ-32 <> (wrong-size)
495 $$ FF asm-c, 4 ss@ (#reg) TRUE (mod/rm),
496 ELSE
497 (unknown-combination)
498 THEN THEN
499 (asm-reset) ;
501 \ Compile a conditional near (32 bit relative) jump
502 : (n-jcc,) ( opcode addr len -- )
503 need1op \ opcode
504 ts@ RT-IMMED = IF
505 $$ 0F asm-c, asm-c, os@ asm-here 4 + -
506 asm-,
507 ELSE
508 (unknown-combination)
509 THEN (asm-reset) ;
511 : n-ja, $$ 87 S" n-ja," (n-jcc,) ;
512 : n-jae, $$ 83 S" n-jae," (n-jcc,) ;
513 : n-jb, $$ 82 S" n-jb," (n-jcc,) ;
514 : n-jbe, $$ 86 S" n-jbe," (n-jcc,) ;
515 : n-jc, $$ 82 S" n-jc," (n-jcc,) ;
516 : n-je, $$ 84 S" n-je," (n-jcc,) ;
517 : n-jg, $$ 8F S" n-jg," (n-jcc,) ;
518 : n-jge, $$ 8D S" n-jge," (n-jcc,) ;
519 : n-jl, $$ 8C S" n-jl," (n-jcc,) ;
520 : n-jle, $$ 8E S" n-jle," (n-jcc,) ;
521 : n-jna, $$ 86 S" n-jna," (n-jcc,) ;
522 : n-jnae, $$ 82 S" n-jnae," (n-jcc,) ;
523 : n-jnb, $$ 83 S" n-jnb," (n-jcc,) ;
524 : n-jnbe, $$ 87 S" n-jnbe," (n-jcc,) ;
525 : n-jnc, $$ 83 S" n-jnc," (n-jcc,) ;
526 : n-jne, $$ 85 S" n-jne," (n-jcc,) ;
527 : n-jng, $$ 8E S" n-jng," (n-jcc,) ;
528 : n-jnge, $$ 8C S" n-jnge," (n-jcc,) ;
529 : n-jnl, $$ 8D S" n-jnl," (n-jcc,) ;
530 : n-jnle, $$ 8F S" n-jnle," (n-jcc,) ;
531 : n-jno, $$ 81 S" n-jno," (n-jcc,) ;
532 : n-jnp, $$ 8B S" n-jnp," (n-jcc,) ;
533 : n-jns, $$ 89 S" n-jns," (n-jcc,) ;
534 : n-jnz, $$ 85 S" n-jnz," (n-jcc,) ;
535 : n-jo, $$ 80 S" n-jo," (n-jcc,) ;
536 : n-jp, $$ 8A S" n-jp," (n-jcc,) ;
537 : n-jpe, $$ 8A S" n-jpe," (n-jcc,) ;
538 : n-jpo, $$ 8B S" n-jpo," (n-jcc,) ;
539 : n-js, $$ 88 S" n-js," (n-jcc,) ;
540 : n-jz, $$ 84 S" n-jz," (n-jcc,) ;
542 \ Compile a call
543 : call, ( -- )
544 S" call," need1op
545 ss@ SZ-UNKNOWN = IF
546 SZ-32 (ss) !
547 THEN
548 ss@ SZ-32 <>
549 IF ." Call address must be 32 bit" ABORT THEN
550 ts@ RT-IMMED = IF ( call 42 )
551 os@ asm-here 5 + -
552 $$ E8 asm-c, asm-,
553 ELSE
554 ts@ isr/m? IF
555 $$ FF asm-c, 2 SZ-32 (#reg) TRUE (mod/rm),
556 ELSE
557 (unknown-combination)
558 THEN THEN
559 (asm-reset) ;
561 \ Compile a single byte instruction
562 : <single-byte> ( byte -- )
563 need0op asm-c, ;
565 : aaa, $$ 37 S" aaa," <single-byte> ;
566 : aas, $$ 3F S" aas," <single-byte> ;
567 : clc, $$ F8 S" clc," <single-byte> ;
568 : cld, $$ FC S" cld," <single-byte> ;
569 : cmc, $$ F5 S" cmc," <single-byte> ;
570 : cdq, $$ 99 S" cdq," <single-byte> ;
571 : cmpsb, $$ A6 S" cmpsb," <single-byte> ;
572 : cmpsd, $$ A7 S" cmpsd," <single-byte> ;
573 : daa, $$ 27 S" daa," <single-byte> ;
574 : das, $$ 2F S" das," <single-byte> ;
575 : movsb, $$ A4 S" movsb," <single-byte> ;
576 : movsd, $$ A5 S" movsd," <single-byte> ;
577 : movs, $$ A5 S" movs," <single-byte> ;
578 : lodsb, $$ AC S" lodsb," <single-byte> ;
579 : lodsd, $$ AD S" lodsd," <single-byte> ;
580 : nop, $$ 90 S" nop," <single-byte> ;
581 : repne, $$ F2 S" repne," <single-byte> ;
582 : repnz, $$ F2 S" repnz," <single-byte> ;
583 : repz, $$ F3 S" repz," <single-byte> ;
584 : popf, $$ 9D S" popf," <single-byte> ;
585 : ret, $$ C3 S" ret," <single-byte> ;
586 : pushf, $$ 9C S" pushf," <single-byte> ;
587 : rep, $$ F3 S" rep," <single-byte> ;
588 : repe, $$ F3 S" repe," <single-byte> ;
589 : scasb, $$ AE S" scasb," <single-byte> ;
590 : scasd, $$ AF S" scasd," <single-byte> ;
591 : stosd, $$ AB S" stosd," <single-byte> ;
592 : stc, $$ F9 S" stc," <single-byte> ;
593 : std, $$ FD S" std," <single-byte> ;
594 : xlat, $$ D7 S" xlat," <single-byte> ;
595 : stosb, $$ AA S" stosb," <single-byte> ;
596 : sahf, $$ E9 S" sahf," <single-byte> ;
597 : WORD: $$ 66 S" WORD:" <single-byte> ;
598 : wait, $$ 9B S" wait," <single-byte> ;
600 \ compile an alu-operation
601 : <alu> ( eax,i32 r32,i32 col r,rm -- )
602 ( OK )
603 need2op
604 (check-sizes)
605 ts@ isimm? IF \ eax,i32 r32,i32 col r,rm
606 DROP \ eax,i32 r32,i32 col
607 td@ isreg? rd@ 0= AND IF
608 2DROP (w+) asm-c, os@ ar?,
609 ELSE
610 td@ isr/m? IF \ eax,i32 r32,i32 col
611 ROT DROP \ r32,i32 col
612 SWAP (w+) asm-c, \ col
613 (rs) !
614 RT-REG (ts) !
615 FALSE (mod/rm),
616 os@ ar?,
617 ELSE
618 (unknown-combination)
619 THEN THEN
620 ELSE \ eax,i32 r32,i32 col r,rm
621 NIP NIP NIP \ r,m
622 td@ isr/m? ts@ isreg? AND IF
623 (w+) asm-c,
624 FALSE (mod/rm),
625 ELSE
626 td@ isreg? ts@ isr/m? AND IF
627 2 + (w+) asm-c,
628 TRUE (mod/rm),
629 ELSE
630 (unknown-combination)
631 THEN THEN THEN
632 (asm-reset) ;
634 : adc, $$ 14 $$ 80 $$ 02 $$ 10 S" adc," <alu> ;
635 : add, $$ 04 $$ 80 $$ 00 $$ 00 S" add," <alu> ;
636 : and, $$ 24 $$ 80 $$ 04 $$ 20 S" and," <alu> ;
637 : cmp, $$ 3C $$ 80 $$ 07 $$ 38 S" cmp," <alu> ;
638 : or, $$ 0C $$ 80 $$ 01 $$ 08 S" or," <alu> ;
639 : sbb, $$ 1C $$ 80 $$ 03 $$ 18 S" sbb," <alu> ;
640 : sub, $$ 2C $$ 80 $$ 05 $$ 28 S" sub," <alu> ;
641 : test, $$ A8 $$ F6 $$ 00 $$ 84 S" test," <alu> ;
642 : xor, $$ 34 $$ 80 $$ 06 $$ 30 S" xor," <alu> ;
644 \ produce a inc/dec
645 : <inc>, ( column -- )
646 S" inc/dec" need1op
647 ts@ isreg? ss@ SZ-32 = AND IF
648 8 * $$ 40 + rs@ + asm-c,
649 ELSE
650 ts@ isr/m? ts@ RT-ABS = OR IF
651 $$ FE (w+) asm-c, SZ-32 (#reg)
652 TRUE (mod/rm),
653 ELSE
654 (unknown-combination)
655 THEN THEN
656 (asm-reset) ;
658 : inc, 0 <inc>, ;
659 : dec, 1 <inc>, ;
661 : sreg=ecx/cl? ( -- flag )
662 ts@ isreg? rs@ 1 = AND ;
664 \ produce a shift
665 : <shift>, ( column -- )
666 ( OK )
667 S" shift" need2op
668 sd@ SZ-32 = IF 1 ELSE 0 THEN TO (wrd)
669 ts@ isimm? td@ isr/m? AND os@ 1 = AND IF
670 $$ D0 (w+) asm-c,
671 RT-REG (ts) !
672 (rs) !
673 FALSE (mod/rm),
674 ELSE
675 ts@ isimm? td@ isr/m? AND IF
676 $$ C0 (w+) asm-c,
677 RT-REG (ts) !
678 (rs) !
679 FALSE (mod/rm),
680 os@ asm-c,
681 ELSE
682 sreg=ecx/cl? td@ isr/m? AND IF
683 $$ D2 (w+) asm-c,
684 RT-REG (ts) !
685 (rs) !
686 FALSE (mod/rm),
687 ELSE
688 (unknown-combination)
689 THEN THEN THEN
690 (asm-reset) ;
692 : rol, 0 <shift>, ;
693 : ror, 1 <shift>, ;
694 : rcl, 2 <shift>, ;
695 : rcr, 3 <shift>, ;
696 : sal, 4 <shift>, ;
697 : shl, 4 <shift>, ;
698 : shr, 5 <shift>, ;
699 : sar, 7 <shift>, ;
701 \ compile a mul, div, neg or not
702 : <mul/neg>, ( column -- )
703 S" mul/neg/div" need1op
704 ts@ isr/m? ss@ SZ-UNKNOWN <> AND IF
705 $$ F6 (w+) asm-c, SZ-32 (#reg) TRUE (mod/rm),
706 ELSE
707 (unknown-combination)
708 THEN
709 (asm-reset) ;
711 : not, 2 <mul/neg>, ;
712 : neg, 3 <mul/neg>, ;
713 : mul, 4 <mul/neg>, ;
714 : imul, 5 <mul/neg>, ;
715 : div, 6 <mul/neg>, ;
716 : idiv, 7 <mul/neg>, ;
718 \ produce an exchange operation
719 : xchg, ( -- )
720 S" xchg," need2op (check-sizes)
721 ts@ isreg? rs@ 0= AND ss@ SZ-32 = AND
722 td@ isreg? AND IF ( xchg eax, reg)
723 rd@ $$ 90 + asm-c,
724 ELSE
725 $$ 86 (w+) asm-c, ts@ ismem? (mod/rm),
726 THEN
727 (asm-reset) ;
729 \ produce a push/pop
730 : <push/pop> ( m32 col rd -- )
731 S" push/pop" need1op
732 ss@ SZ-32 <>
733 IF ." push, and pop, only work on DWORDs." ABORT THEN
734 ts@ ismem? IF \ m32 col rd
735 DROP \ m32 col
736 SZ-32 (#reg) asm-c, TRUE (mod/rm),
737 ELSE
738 ts@ isreg? IF \ m32 col
739 rs@ + asm-c,
740 2DROP
741 ELSE
742 (unknown-combination)
743 THEN THEN (asm-reset) ;
745 : push, $$ FF 6 $$ 50 <push/pop> ;
746 : pop, ( -- )
747 S" pop," need1op
748 ts@ isimm? IF
749 $$ 68 asm-c,
750 os@ ar?,
751 (asm-reset)
752 ELSE
753 $$ 8F 0 $$ 58 <push/pop>
754 THEN ;
756 \ produce a setcc
757 : <setcc> ( cc -- )
758 S" setcc," need1op
759 ss@ SZ-8 <>
760 IF ." setcc, requires a byte operand." ABORT THEN
761 ts@ isr/m? IF
762 $$ 0f asm-c, asm-c, 0 SZ-32 (#reg) TRUE (mod/rm),
763 ELSE
764 (unknown-combination)
765 THEN
766 (asm-reset) ;
768 : seta, $$ 97 <setcc> ;
769 : setae, $$ 93 <setcc> ;
770 : setb, $$ 92 <setcc> ;
771 : setbe, $$ 96 <setcc> ;
772 : setc, $$ 92 <setcc> ;
773 : sete, $$ 94 <setcc> ;
774 : setg, $$ 9F <setcc> ;
775 : setge, $$ 9D <setcc> ;
776 : setl, $$ 9C <setcc> ;
777 : setle, $$ 9E <setcc> ;
778 : setna, $$ 96 <setcc> ;
779 : setnae, $$ 92 <setcc> ;
780 : setnb, $$ 93 <setcc> ;
781 : setnbe, $$ 97 <setcc> ;
782 : setnc, $$ 93 <setcc> ;
783 : setne, $$ 95 <setcc> ;
784 : setng, $$ 9E <setcc> ;
785 : setnge, $$ 9C <setcc> ;
786 : setnl, $$ 9D <setcc> ;
787 : setnle, $$ 9F <setcc> ;
788 : setno, $$ 91 <setcc> ;
789 : setnp, $$ 9B <setcc> ;
790 : setns, $$ 99 <setcc> ;
791 : setnz, $$ 95 <setcc> ;
792 : seto, $$ 90 <setcc> ;
793 : setp, $$ 9A <setcc> ;
794 : setpe, $$ 9A <setcc> ;
795 : setpo, $$ 9B <setcc> ;
796 : sets, $$ 98 <setcc> ;
797 : setz, $$ 94 <setcc> ;
799 \ The local label mechanism is quite simple but useful. The number passed to
800 \ jcond, is the number of the local label which can be used either as a
801 \ forward or backward jump label. Due to space constraints only one forward
802 \ jump can be used for one label, but an unlimited number of backward jumps to
803 \ this label. It is possible to use a label for a fwd jump first and then for
804 \ backward jumps. If you need more than one forward jump to the same place,
805 \ use different labels.
806 CREATE loclabel-tab MAXLOCALLABEL CELLS ALLOT
808 \ The given label number is checked and complained about if wrong.
809 : (chk-label-ind) ( ind -- )
810 MAXLOCALLABEL < INVERT
811 IF ." Label number too high." ABORT THEN ;
813 \ Provide simple access to the labels.
814 : >label ( label -- addr )
815 DUP (chk-label-ind)
816 CELLS loclabel-tab + ;
818 \ An address of 0 for a label means that the label is not used yet. At the
819 \ start of each local label scope this state has to be set.
820 : reset-labels ( -- )
821 MAXLOCALLABEL 0 DO
822 0 I >label !
823 LOOP ;
825 : save-labels ( -- labels n )
826 MAXLOCALLABEL 0 DO
827 I >label @
828 LOOP MAXLOCALLABEL ;
830 : restore-labels ( labels n -- )
831 DROP
832 MAXLOCALLABEL 0 DO
833 MAXLOCALLABEL 1- I - >label !
834 LOOP ;
836 \ Declare a local label. If the label is a forward jump calculate the offset,
837 \ check it and store it in the appropiate place.
838 : $: ( label -- )
839 DUP >label @ 0<> IF ( fwd jmp )
840 DUP >label @ \ label dst-addr
841 asm-here OVER 1+ - \ label dst-addr abs-dist
842 DUP 127 < INVERT
843 IF ." Jump out of bounds." ABORT THEN \ label dst-addr abs-dist
844 SWAP asm-c!
845 THEN
846 asm-here SWAP >label ! ;
848 \ Change the short-branch-target to the give address.
849 : change-$: ( addr label -- )
850 >label ! ;
852 \ Compile a conditional jump.
853 : <jcc>, ( label opcode -- )
854 asm-c, \ label
856 >label @
857 0= IF ( fwd jmp ) \ label
858 asm-here SWAP \ here label
859 >label !
860 0 asm-c,
861 ELSE ( bwd jmp) \ label
862 asm-here 1+ SWAP >label @ \ dst orig
863 - DUP 127 < INVERT
864 IF ." Jump out of bounds." ABORT THEN \ abs-dist
865 NEGATE asm-c,
866 THEN ;
868 : jae, $$ 73 <jcc>, ;
869 : jb, $$ 72 <jcc>, ;
870 : jbe, $$ 76 <jcc>, ;
871 : jc, $$ 72 <jcc>, ;
872 : jcxz, $$ E3 <jcc>, ;
873 : je, $$ 74 <jcc>, ;
874 : jg, $$ 7F <jcc>, ;
875 : jge, $$ 7D <jcc>, ;
876 : jl, $$ 7C <jcc>, ;
877 : ja, $$ 77 <jcc>, ;
878 : jle, $$ 7E <jcc>, ;
879 : jnle, $$ 7F <jcc>, ;
880 : jna, $$ 76 <jcc>, ;
881 : jno, $$ 71 <jcc>, ;
882 : jnae, $$ 72 <jcc>, ;
883 : jnp, $$ 7B <jcc>, ;
884 : jnb, $$ 73 <jcc>, ;
885 : jns, $$ 79 <jcc>, ;
886 : jnbe, $$ 77 <jcc>, ;
887 : jnz, $$ 75 <jcc>, ;
888 : jnc, $$ 73 <jcc>, ;
889 : jo, $$ 70 <jcc>, ;
890 : jne, $$ 75 <jcc>, ;
891 : jp, $$ 7A <jcc>, ;
892 : jng, $$ 7E <jcc>, ;
893 : jpe, $$ 7A <jcc>, ;
894 : jnge, $$ 7C <jcc>, ;
895 : jpo, $$ 7B <jcc>, ;
896 : jnl, $$ 7D <jcc>, ;
897 : js, $$ 78 <jcc>, ;
898 : jz, $$ 74 <jcc>, ;
899 : loopne, $$ E0 <jcc>, ;
900 : loopnz, $$ E0 <jcc>, ;
901 : loopz, $$ E1 <jcc>, ;
902 : loop, $$ E2 <jcc>, ;
903 : loope, $$ E1 <jcc>, ;
904 \ Uncondition short jump.
905 : jmpn, $$ EB <jcc>, ;
907 \ ------------------------------------------------------------------------------
908 \ ---------------------------- floating point words ----------------------------
909 \ ------------------------------------------------------------------------------
911 \ FPU stack items
912 0 CONSTANT st0
913 1 CONSTANT st1
914 2 CONSTANT st2
915 3 CONSTANT st3
916 4 CONSTANT st4
917 5 CONSTANT st5
918 6 CONSTANT st6
919 7 CONSTANT st7
921 \ Check the given FPU register for valid index.
922 : st-range ( st -- )
923 0 8 WITHIN INVERT IF ." Invalid FP-Stack register." ABORT THEN ;
925 \ Compile an FPU operation requiring a mod/rm parameter.
926 : <fop-mod/rm>, ( col op addr len -- )
927 need1op asm-c,
928 SZ-32 (#reg) TRUE (mod/rm),
929 (asm-reset) ;
931 \ Compile an FPU operation without or with implicit parameters.
932 : <fop>, ( oc1 oc2 addr len -- )
933 need0op
934 SWAP asm-c, asm-c,
935 (asm-reset) ;
937 \ Compile an FPU operation with one FPU register as parameter.
938 : <fopst>, ( st oc1 oc2 addr len -- )
939 need0op PLUCK st-range
940 SWAP asm-c, + asm-c,
941 (asm-reset) ;
943 \ Store operations in different formats.
944 : fst32, 2 $$ D9 S" fst32," <fop-mod/rm>, ;
945 : fst64, 2 $$ DD S" fst64," <fop-mod/rm>, ;
946 : fstp32, 3 $$ D9 S" fstp32," <fop-mod/rm>, ;
947 : fstp64, 3 $$ DD S" fstp64," <fop-mod/rm>, ;
948 : fstp80, 7 $$ DB S" fstp80," <fop-mod/rm>, ;
949 : fist16, 2 $$ DF S" fist16," <fop-mod/rm>, ;
950 : fist32, 2 $$ DB S" fist32," <fop-mod/rm>, ;
951 : fistp16, 3 $$ DF S" fistp16," <fop-mod/rm>, ;
952 : fistp32, 3 $$ DB S" fistp32," <fop-mod/rm>, ;
953 : fistp64, 7 $$ DF S" fistp64," <fop-mod/rm>, ;
954 : fbstp, 6 $$ DF S" fbstp," <fop-mod/rm>, ;
956 \ Loads in different formats.
957 : fld32, 0 $$ D9 S" fld32," <fop-mod/rm>, ;
958 : fld64, 0 $$ DD S" fld64," <fop-mod/rm>, ;
959 : fld80, 5 $$ DB S" fld80," <fop-mod/rm>, ;
960 : fild16, 0 $$ DF S" fild16," <fop-mod/rm>, ;
961 : fild32, 0 $$ DB S" fild32," <fop-mod/rm>, ;
962 : fild64, 5 $$ DF S" fild64," <fop-mod/rm>, ;
963 : fbld, 4 $$ DF S" fbld," <fop-mod/rm>, ;
965 \ Other operations to memory.
966 : frstor, 4 $$ DD S" frstor," <fop-mod/rm>, ;
967 : fnsave, 6 $$ DD S" fnsave," <fop-mod/rm>, ;
968 : fnstcw, 7 $$ D9 S" fnstcw," <fop-mod/rm>, ;
969 : fldcw, 5 $$ D9 S" fldcw," <fop-mod/rm>, ;
971 \ Calculations, comparing ops, etc.
972 : fchs, $$ D9 $$ E0 S" fchs," <fop>, ;
973 : fabs, $$ D9 $$ E1 S" fabs," <fop>, ;
974 : f2xm1, $$ D9 $$ F0 S" f2xm1," <fop>, ;
975 : fcos, $$ D9 $$ FF S" fcos," <fop>, ;
976 : fscale, $$ D9 $$ FD S" fscale," <fop>, ;
977 : fsin, $$ D9 $$ FE S" fsin," <fop>, ;
978 : fsincos, $$ D9 $$ FB S" fsincos," <fop>, ;
979 : fsqrt, $$ D9 $$ FA S" fsqrt," <fop>, ;
980 : ftst, $$ D9 $$ E4 S" ftst," <fop>, ;
981 : fxtract, $$ D9 $$ F4 S" fxtract," <fop>, ;
982 : fyl2x, $$ D9 $$ F1 S" fyl2x," <fop>, ;
983 : fyl2xp1, $$ D9 $$ F9 S" fyl2xp1," <fop>, ;
984 : fnstswax, $$ DF $$ E0 S" fnstswax," <fop>, ;
985 : fcompp, $$ DE $$ D9 S" fcompp," <fop>, ;
986 : fld1, $$ D9 $$ E8 S" fld1," <fop>, ;
987 : fldl2t, $$ D9 $$ E9 S" fldl2t," <fop>, ;
988 : fldl2e, $$ D9 $$ EA S" fldl2e," <fop>, ;
989 : fldpi, $$ D9 $$ EB S" fldpi," <fop>, ;
990 : fldlg2, $$ D9 $$ EC S" fldlg2," <fop>, ;
991 : fldln2, $$ D9 $$ ED S" fldln2," <fop>, ;
992 : fldz, $$ D9 $$ EE S" fldz," <fop>, ;
993 : fincstp, $$ D9 $$ F7 S" fincstp," <fop>, ;
994 : frndint, $$ D9 $$ FC S" frndint," <fop>, ;
995 : fxam, $$ D9 $$ E5 S" fxam," <fop>, ;
996 : fninit, $$ DB $$ E3 S" fninit," <fop>, ;
997 : fpatan, $$ D9 $$ F3 S" fpatan," <fop>, ;
998 : fprem, $$ D9 $$ F8 S" fprem," <fop>, ;
999 : fprem1, $$ D9 $$ F5 S" fprem1," <fop>, ;
1000 : fptan, $$ D9 $$ F2 S" fptan," <fop>, ;
1001 : ftst, $$ D9 $$ E4 S" ftst," <fop>, ;
1003 : fld, $$ D9 $$ C0 S" fld," <fopst>, ;
1004 : fmulp, $$ DE $$ C8 S" fmulp," <fopst>, ;
1005 : faddp, $$ DE $$ C0 S" faddp," <fopst>, ;
1006 : fsubp, $$ DE $$ E8 S" fsubp," <fopst>, ;
1007 : fdivp, $$ DE $$ F8 S" fdivp," <fopst>, ;
1008 : ffree, $$ DD $$ C0 S" ffree," <fopst>, ;
1009 : fxch, $$ D9 $$ C8 S" fxch," <fopst>, ;
1010 : fstp, $$ DD $$ D8 S" fstp," <fopst>, ;
1011 : fmul, $$ D8 $$ C8 S" fmul," <fopst>, ;
1012 : fmulr, $$ DC $$ C8 S" fmulr," <fopst>, ;
1013 : fcomp, $$ D8 $$ D8 S" fcomp," <fopst>, ;
1014 : fcom, $$ D8 $$ D0 S" fcom," <fopst>, ;
1015 \ Since I am much too lazy to write a special kind of assembler word for fsub
1016 \ I invented a mnemonic: fssub. It means: float-swap-subtract and is written
1017 \ as: fsub st(i), st.
1018 : fssub, $$ DC $$ E8 S" fssub," <fopst>, ;
1019 \ This is the normal subtraction operator: fsub st, st(i)
1020 : fsub, $$ D8 $$ E0 S" fsub," <fopst>, ;
1021 \ The reverse subtraction with pop.
1022 : fsubrp, $$ DE $$ E0 S" fsubrp," <fopst>, ;
1024 \ Compile a software interrupt.
1025 : int, ( nr -- )
1026 S" int," need0op
1027 $$ CD asm-c, asm-c, (asm-reset) ;
1029 \ The load-effective-address operation. It is most useful with sib-addressing.
1030 : lea, ( -- )
1031 S" lea," need2op
1032 $$ 8D asm-c, TRUE (mod/rm), (asm-reset) ;
1034 \ ==============================================================================
1035 \ =============================== meta assembler ===============================
1036 \ ==============================================================================
1038 \ The register allocator uses the registers eax, ebx, ecx, edx, esi and esi to
1039 \ cache the upper 6 items on the data stack. It is very time expensive to
1040 \ restore the correct possitions of all 6 registers especially if we assume
1041 \ that an average of 3 registers is cached. But caching only one register by
1042 \ default is never a bad idea.
1044 \ At the begin of each word, EAX caches TOS. Any other register is available.
1045 \ Using 32-bit-flat-memory-mode, only near calls are used. EBP is used as the
1046 \ data stack pointer. Therefore an offset is always nessesary. This offset
1047 \ is accumulated when moving data to and from the stack. It is assumed to be
1048 \ zero at the begin and end of each word, therefore EBP points to TOS.
1049 \ In conjunction with the (future) use
1050 \ of inlineable words, this saves one ADD/SUB EBP, 4 in every word using the
1051 \ stack, but introduces on ADD EBP,x in the return. This can be paired with the
1052 \ last store of the register, that have to be flushed to stack or the move
1053 \ to make EAX TOS or the RET itself.
1055 \ Using the registers as virtual TOS+x "stackrobatics" with less than seven
1056 \ registers do not cost any cycle, if no register has to be loaded before or
1057 \ flushed after. Only in case the op. request a special register in a special
1058 \ place, some mov, or xchg, is produced.
1060 \ An example:
1061 \ : test ( a b c -- e )
1062 \ (1) ROT \ b c a 2 cycles for loading ebx, ecx
1063 \ (2) ROT \ c a b no cycle, just changing the
1064 \ the order while compiling
1065 \ (3) + \ c a+b one cycle for the add
1066 \ (4) SWAP \ a+b c no cycle
1067 \ (5) 2* 2* \ a+b 4*c 2 cycles for the shifts
1068 \ (6) + \ a+b+4*c 1 cycle for the add
1069 \ (7) ; 2 cycles for flush + bp ofs
1071 \ In line (1) EBX and ECX are loaded, because ROT asks for 3 used registers.
1072 \ In line (2) only the tables in the compiler are changed, but no code is
1073 \ produced.
1074 \ Lines (3) and (5) perform their calculations by producing one ADD and
1075 \ two SHL's.
1076 \ Line (4) produces no code, just the tables are changed.
1077 \ In Line (7) the consistent state must be accomplished, therefore
1078 \ one register has to be flushed (and EAX may be loaded from an other register, I
1079 \ haven't tracked this) and EBP has to be increased by 8.
1081 \ The registers need numbers to identify them.
1082 0 CONSTANT VREG-EAX \ virtual register numbers for
1083 1 CONSTANT VREG-ECX \ register allocator
1084 2 CONSTANT VREG-EDX
1085 3 CONSTANT VREG-EBX
1086 4 CONSTANT VREG-ESI
1087 5 CONSTANT VREG-EDI
1088 6 CONSTANT #USEREGS
1089 -1 CONSTANT REG-NONE
1091 \ Translate vreg to reg using a look-up table.
1092 CREATE ((vreg>reg)) 0 , 1 , 2 , 3 , 6 , 7 ,
1093 : (vreg>reg) ( vreg -- reg )
1094 DUP REG-NONE =
1095 IF ." Called (vreg>reg) with invalid register." ABORT THEN
1096 CELLS ((vreg>reg)) + @ ;
1098 \ This array contains the state of the register allocator. The cell i contains
1099 \ the number of the register caching TOS+i.
1100 CREATE tos-cache #USEREGS CHARS ALLOT
1101 : cache! ( reg ind -- )
1102 CHARS tos-cache + C! ;
1103 : cache@ ( ind -- reg )
1104 CHARS tos-cache + C@ ;
1106 \ This array contains the numbers of registers that were marked free.
1107 CREATE free-cache #USEREGS CHARS ALLOT
1108 : free! ( reg ind -- )
1109 CHARS free-cache + C! ;
1110 : free@ ( ind -- reg )
1111 CHARS free-cache + C@ ;
1113 \ Number of items in tos-cache
1114 0 VALUE #tos-cache
1116 \ Increase #tos-cache
1117 : (#tc++) #tos-cache 1+ TO #tos-cache ;
1119 \ Number of items in free-cache
1120 0 VALUE #free-req
1122 \ Who are you?
1123 : vreg>name ( vreg -- addr len )
1124 CASE
1125 VREG-EAX OF S" eax" ENDOF
1126 VREG-EBX OF S" ebx" ENDOF
1127 VREG-ECX OF S" ecx" ENDOF
1128 VREG-EDX OF S" edx" ENDOF
1129 VREG-ESI OF S" esi" ENDOF
1130 VREG-EDI OF S" edi" ENDOF
1131 >R S" unknown" R>
1132 ENDCASE ;
1134 \ Mark a register as requested with free.
1135 : (mark-free) ( vreg -- )
1136 #free-req free! #free-req 1+ TO #free-req ;
1138 \ Check if vreg is marked free.
1139 : (#marked) ( vreg -- flag )
1140 #free-req 0 ?DO \ vreg
1141 I free@ OVER = IF
1142 DROP TRUE UNLOOP EXIT
1143 THEN
1144 LOOP
1145 DROP FALSE ;
1147 \ Number of consequtively requested registers in compiler
1148 0 VALUE #reg-req
1150 \ Increase #reg-req .
1151 : (#rr++) #reg-req 1+ TO #reg-req ;
1153 \ Accumulated offset to ebp
1154 0 VALUE offs-ebp
1156 \ Maintain a offs-ebp to be within +/- 124 to fit into a byte.
1157 : (add-ebp) ( n -- )
1158 offs-ebp + \ no
1159 DUP ABS 124 >= IF \ no
1160 ## ebp add,
1162 THEN
1163 TO offs-ebp ;
1165 \ Save the state of the allocator on the stack. No code is produced.
1166 : save-allocator ( -- allocator )
1167 #USEREGS 0 DO I cache@ LOOP
1168 #USEREGS 0 DO I free@ LOOP
1169 #tos-cache #free-req offs-ebp ;
1171 \ Restore the state of the allocator from the stack. No code is produced.
1172 : restore-allocator ( allocator -- )
1173 TO offs-ebp
1174 TO #free-req TO #tos-cache
1175 #USEREGS 0 DO #USEREGS 1- I - free! LOOP
1176 #USEREGS 0 DO #USEREGS 1- I - cache! LOOP ;
1178 \ Print the state of the allocator.
1179 : .regalloc ( -- )
1180 ." Used: "
1181 #tos-cache 0= IF
1182 ." none"
1183 ELSE
1184 #tos-cache 0 DO
1185 #tos-cache 1- I - cache@ vreg>name TYPE ." "
1186 LOOP
1187 THEN
1188 ." Free: "
1189 #free-req 0= IF
1190 ." none"
1191 ELSE
1192 #free-req 0 DO
1193 #free-req 1- I - free@ vreg>name TYPE ." "
1194 LOOP
1195 THEN ." offs: " offs-ebp . CR ;
1197 \ Check whether vreg is used
1198 : (#used) ( vreg -- flag )
1199 #tos-cache 0 ?DO
1200 I cache@ OVER = IF
1201 DROP UNLOOP TRUE EXIT
1202 THEN
1203 LOOP
1204 DROP FALSE ;
1206 \ find the vreg in current requested cache slot
1207 : tc(#rr) ( -- vreg )
1208 #reg-req cache@ ;
1210 \ check whether enough registers are in use
1211 : (#enough) ( -- flag )
1212 #reg-req #tos-cache < ;
1214 \ load the register vreg into current requested slot
1215 : (#load) ( vreg -- )
1216 BOFFS offs-ebp #tos-cache CELLS + [ebp]
1217 DUP (vreg>reg) SZ-32 (#reg) mov,
1218 #tos-cache cache!
1219 #tos-cache 1+ TO #tos-cache ;
1221 \ find the cache-slot vreg is in. Return -1 for non-cached register
1222 : (#find) ( vreg -- nr )
1223 #tos-cache 0 ?DO
1224 I cache@ OVER = IF
1225 DROP I UNLOOP EXIT
1226 THEN
1227 LOOP
1228 DROP -1 ;
1230 \ exchange the meanings and the contents of the regs
1231 \ free(vreg1) used(vreg2) -> mov vreg1, vreg2
1232 \ used(vreg1) free(vreg2) -> mov vreg2, vreg1
1233 \ else -> xchg vreg1, vreg2
1234 : tc-xchg ( vreg1 vreg2 -- )
1235 2DUP = IF 2DROP EXIT THEN
1236 DUP (#used) INVERT IF ( vreg2 free ) \ vreg1 vreg2
1237 2DUP SWAP
1238 (vreg>reg) SZ-32 (#reg)
1239 (vreg>reg) SZ-32 (#reg) mov, \ vreg1 vreg2
1240 SWAP (#find) \ vreg2 tos1
1241 cache!
1242 ELSE
1243 OVER (#used) INVERT IF ( vreg1 free ) \ vreg1 vreg2
1244 2DUP
1245 (vreg>reg) SZ-32 (#reg)
1246 (vreg>reg) SZ-32 (#reg) mov, \ vreg1 vreg2
1247 (#find) \ vreg1 tos2
1248 cache!
1249 ELSE
1250 2DUP
1251 (vreg>reg) SZ-32 (#reg)
1252 (vreg>reg) SZ-32 (#reg) xchg, \ vreg1 vreg2
1253 2DUP (#find) SWAP (#find) \ vreg1 vreg2 tos2 tos1
1254 ROT SWAP \ vreg1 tos2 vreg2 tos1
1255 cache! cache!
1256 THEN
1257 THEN ;
1259 \ Set the bit vreg in mask, if vreg is cached.
1260 \ This word has an environmental dependency. It assumes, that one cell
1261 \ has more than #USEREGS bits.
1262 : (cached-mask) ( -- mask )
1263 0 #tos-cache 0 ?DO
1264 1 I cache@ LSHIFT
1266 LOOP ;
1268 \ Find the first unrequested register in cache or REG-NONE if all are used.
1269 : (unrequested) ( -- vreg )
1270 (cached-mask)
1271 #free-req 0 ?DO
1272 1 I free@ LSHIFT OR
1273 LOOP
1274 #USEREGS 0 DO
1275 DUP 1 I LSHIFT AND 0= IF
1276 DROP I UNLOOP EXIT
1277 THEN
1278 LOOP
1279 DROP REG-NONE ;
1281 \ flush the lowest cache slot to memory and return the free register
1282 : (flushreg) ( -- vreg )
1283 #tos-cache 1- cache@ DUP \ vreg vreg
1284 ( mov [ebp + offs + {#tc-1}*4], vreg )
1285 (vreg>reg) SZ-32 (#reg)
1286 BOFFS #tos-cache 1- CELLS offs-ebp + [ebp] mov,
1287 #tos-cache 1- TO #tos-cache ;
1289 \ request a virtual register by number
1290 \ n is tos + #reg-req
1291 \ condition action
1292 \ --------------------------------------------------------
1293 \ free(vreg) & #rr<#tc xchg(vreg,tc(#rr))
1294 \ free(vreg) & #rr=#tc load(vreg)
1295 \ used(vreg) & #rr<#tc xchg(vreg,tc(#rr))
1296 \ used(vreg) & #rr=#tc & free(vreg2) load(vreg2), xchg(vreg2,vreg)
1297 \ used(vreg) & #rr=#tc & used(vreg2) flush(#ur-1), xchg( #ur-1, vreg)
1298 : (#req) ( vreg -- )
1299 ( OK )
1300 DUP (#used) IF \ vreg
1301 (#enough) IF
1302 tc(#rr) tc-xchg
1303 ELSE ( req. reg in use ) \ vreg
1304 (unrequested) \ vreg vreg2
1305 DUP REG-NONE = IF ( none free )
1306 DROP (flushreg) \ vreg vreg2
1307 ELSE ( 1 free found )
1308 DUP (#load)
1309 THEN
1310 tc-xchg
1311 THEN
1312 ELSE \ vreg
1313 (#enough) IF
1314 tc(#rr) tc-xchg
1315 ELSE
1316 (unrequested) \ vreg vreg2
1317 DUP REG-NONE <> IF
1318 DUP (#load)
1319 tc-xchg
1320 ELSE
1321 DROP
1322 (flushreg) tc-xchg
1323 THEN
1324 THEN
1325 THEN (#rr++) ;
1327 : req-eax VREG-EAX (#req) ;
1328 : req-ebx VREG-EBX (#req) ;
1329 : req-ecx VREG-ECX (#req) ;
1330 : req-edx VREG-EDX (#req) ;
1331 : req-edi VREG-EDI (#req) ;
1332 : req-esi VREG-ESI (#req) ;
1334 \ Is register eax, ebx, ecx or edx?
1335 : (is-a-d) ( vreg -- flag )
1336 DUP VREG-EAX =
1337 OVER VREG-EBX = OR
1338 OVER VREG-ECX = OR
1339 SWAP VREG-EDX = OR ;
1341 \ Request the register if unused.
1342 : (#req-unused) ( vreg -- ok? )
1343 DUP (#used) INVERT DUP \ vreg ok? ok?
1344 IF SWAP (#req) ELSE NIP THEN ;
1346 \ Request any one of eax-edx. Check if one of them is not used. If so request
1347 \ it else check all other requested below if they are eax-edx. If so request
1348 \ it. Else error.
1349 : req-a-d ( -- )
1350 (#enough) IF
1351 #reg-req cache@ (is-a-d) IF
1352 (#rr++) EXIT
1353 THEN
1354 THEN
1355 VREG-EAX (#req-unused) IF EXIT THEN
1356 VREG-ECX (#req-unused) IF EXIT THEN
1357 VREG-EDX (#req-unused) IF EXIT THEN
1358 VREG-EBX (#req-unused) IF EXIT THEN
1359 #tos-cache #reg-req ?DO
1360 I cache@ DUP (is-a-d) \ vreg general?
1361 IF (#req) UNLOOP EXIT ELSE DROP THEN
1362 LOOP
1363 ." Can't request that many general registers." ABORT ;
1365 \ request any virtual register
1366 : req-any ( -- )
1367 (#enough) INVERT IF
1368 (unrequested) \ vreg
1369 \ DUP ." req-any: " . CR
1370 (#load)
1371 THEN (#rr++) ;
1373 \ request any BUT the register vreg
1374 \ n is tos + #reg-req
1375 \ condition action
1376 \ --------------------------------------------------------
1377 \ enough, n<>vreg ---
1378 \ enough, n=vreg, free(vreg2) xchg(vreg,vreg2)
1379 \ enough, n=vreg, none free, #rr<#ur-1 xchg(vreg,#ur-1)
1380 \ enough, n=vreg, none free, #rr=#ur-1 xchg(vreg,0)
1381 \ not enough, used(vreg) load(vreg2)
1382 \ not enough, free(vreg), free(vreg2) load(vreg2)
1383 \ not enough, free(vreg), used(vreg2) xchg(vreg,vreg2), load(vreg2)
1384 : (xchg-not) ( vreg -- )
1385 (unrequested) \ vreg vreg2
1386 DUP REG-NONE <> IF
1387 tc-xchg
1388 ELSE
1389 #reg-req #USEREGS 1- = IF
1391 ELSE
1392 #USEREGS 1-
1393 THEN
1394 cache@ tc-xchg
1395 THEN ;
1397 : (#req-not) ( vreg -- )
1398 (#enough) IF
1399 tc(#rr) \ vreg n
1400 OVER = IF \ vreg
1401 (xchg-not)
1402 ELSE
1403 DROP
1404 THEN
1405 ELSE \ vreg
1406 DUP (#used) IF \ vreg
1408 (unrequested) DUP REG-NONE =
1409 TooManyRegs \ vreg2
1410 (#load)
1411 ELSE \ vreg
1412 (unrequested) DUP REG-NONE <>
1413 IF \ vreg vreg2
1414 (#load) DROP
1415 ELSE
1416 TUCK \ vreg2 vreg vreg2
1417 tc-xchg (#load)
1418 THEN
1419 THEN
1420 THEN (#rr++) ;
1422 : req-not-eax VREG-EAX (#req-not) ;
1424 CREATE a-d-table VREG-EAX , VREG-EBX , VREG-ECX , VREG-EDX ,
1425 : forall-a-d 4 0 ;
1427 \ find the first unmarked register eax, ebx, ecx or edx
1428 \ return nr or -1 if all marked
1429 : (unmarked-a-d) ( -- vreg )
1430 a-d-table
1431 forall-a-d DO \ addr
1432 DUP @ DUP (#marked) INVERT \ addr vreg mark
1434 NIP UNLOOP EXIT
1435 THEN
1436 DROP CELL+
1437 LOOP DROP REG-NONE ;
1439 \ all eax-edx marked
1440 : (a-d-marked) ( -- flag )
1441 (unmarked-a-d) REG-NONE = ;
1443 \ request a free register, but only eax, ebx, ecx or edx
1444 \ cond action
1445 \ ----------------------------------------------
1446 \ a-d marked error
1447 \ a-d req, s-d req flush
1448 \ a-d unreq. mark
1449 \ a-d req, s-d unreq swap, mark
1450 : a-d-free ( -- )
1451 ( OK )
1452 (a-d-marked)
1453 IF ." Can't request this many general registers." ABORT THEN
1454 (unrequested)
1455 DUP REG-NONE = IF \ vreg
1456 DROP (flushreg) \ vreg
1457 THEN
1458 DUP (is-a-d) IF \ vreg
1459 (mark-free)
1460 ELSE \ vreg-s-d
1461 (unmarked-a-d) \ vreg-s-d vreg-a-d
1462 TUCK tc-xchg (mark-free)
1463 THEN ;
1465 \ request a free register
1466 \ cond action
1467 \ ----------------------------------------------
1468 \ all marked error
1469 \ all requested error
1470 \ all cached flush, mark
1471 \ uncached(vreg) mark
1472 : req-free ( -- ) ( OK )
1473 #free-req #reg-req +
1474 #USEREGS = IF ." All registers requested." ABORT THEN
1475 (unrequested) \ vreg
1476 DUP REG-NONE = IF
1477 DROP (flushreg) \ vreg
1478 THEN
1479 (mark-free) ;
1481 \ request a free register by number vreg TODO more efficient
1482 \ cond action
1483 \ ----------------------------------------------------
1484 \ marked(vreg) error
1485 \ unused(vreg) mark
1486 \ unmarked(vreg), cached(vreg), unused(v2) swap, mark
1487 \ unmarked(vreg), all cached, vreg=#ur-1 flush, mark
1488 \ unmarked(vreg), all cached, vreg<>#ur-1 flush, swap, mark
1489 : (#req-free) ( vreg -- )
1490 DUP (#marked) IF ." Register is already marked." ABORT THEN
1491 DUP (#used) INVERT IF
1492 (mark-free)
1493 ELSE \ vreg
1494 (unrequested) \ vreg vreg2
1495 2DUP = IF (internal-error) THEN \ vreg vreg2
1496 DUP REG-NONE <> IF \ vreg vreg2
1497 SWAP TUCK tc-xchg \ vreg
1498 (mark-free)
1499 ELSE \ vreg vreg2
1500 DROP
1501 DUP (#find) \ vreg nr
1502 DUP -1 = IF (internal-error) THEN
1503 #USEREGS 1- OVER = IF
1504 (flushreg) \ vreg vreg
1505 2DUP <> IF (internal-error) THEN
1506 DROP (mark-free)
1507 ELSE
1508 (flushreg) \ vreg vreg2
1509 SWAP TUCK tc-xchg (mark-free)
1510 THEN
1511 THEN
1512 THEN ;
1514 : free-eax VREG-EAX (#req-free) ;
1515 : free-ecx VREG-ECX (#req-free) ;
1516 : free-edx VREG-EDX (#req-free) ;
1517 : free-edi VREG-EDI (#req-free) ;
1518 : free-esi VREG-ESI (#req-free) ;
1520 \ swap the vregs in tos+n1 and tos+n2
1521 : tos-swap ( n1 n2 -- )
1522 2DUP #reg-req < SWAP #reg-req < AND
1523 INVERT IF ." Too few registers requested." ABORT THEN
1524 2DUP \ n1 n2 n1 n2
1525 cache@ SWAP cache@ \ n1 n2 r2 r1
1526 ROT \ n1 r2 r1 n2
1527 cache! SWAP cache! ;
1529 \ drop the tos+0
1530 : (reg-free) ( -- )
1531 #tos-cache DUP 0= IF ." No register in cache." ABORT THEN
1532 1- DUP TO #tos-cache
1534 BEGIN \ end curr
1535 DUP 1+ cache@ \ end curr vreg
1536 OVER cache! \ end curr
1537 1+ 2DUP <=
1538 UNTIL 2DROP
1539 4 (add-ebp) ;
1541 \ free n times the tos+0
1542 : reg-free ( n -- )
1543 0 ?DO
1544 (reg-free)
1545 LOOP ;
1547 \ put the register in free-cache+n on top of stack
1548 : free>tos ( n -- )
1549 DUP #free-req >= IF ." Too few free registers requested." ABORT THEN
1550 free@ \ vreg
1551 ( make space for register )
1552 #tos-cache BEGIN \ vreg i
1553 1- DUP 0< INVERT
1554 WHILE \ vreg i-1
1555 DUP cache@ OVER 1+ cache!
1556 REPEAT DROP
1557 0 cache!
1558 -4 (add-ebp)
1559 (#tc++) ;
1561 \ reset register allocator ( at start of compiler word )
1562 : regalloc-reset ( -- )
1563 0 TO #reg-req 0 TO #free-req reset-labels ;
1565 \ initialize register allocator ( at start of word compilation)
1566 : regalloc-init ( -- )
1567 1 TO #tos-cache
1568 VREG-EAX 0 cache!
1569 0 TO offs-ebp ;
1571 \ create a consistent state before ret/call
1572 : regalloc-flush ( -- )
1573 #tos-cache 0= IF
1574 req-eax
1575 ELSE
1576 \ make sure only reg is loaded
1577 BEGIN
1578 #tos-cache 1 <>
1579 WHILE
1580 (flushreg) DROP
1581 REPEAT
1582 \ make sure it is eax
1583 0 cache@ VREG-EAX <> IF
1584 0 cache@ VREG-EAX tc-xchg
1585 THEN
1586 THEN
1587 offs-ebp IF
1588 offs-ebp ## ebp add,
1589 THEN
1590 regalloc-init ;
1592 \ flush all registers to stack and correct ebp
1593 : regalloc-flushjmp ( -- )
1594 BEGIN
1595 #tos-cache 0<>
1596 WHILE
1597 (flushreg) DROP
1598 REPEAT
1599 offs-ebp IF
1600 offs-ebp ## ebp add,
1601 THEN 0 TO offs-ebp ;
1603 \ flush all except 2 registers to stack and correct ebp
1604 : regalloc-flush-do ( -- )
1605 BEGIN
1606 #tos-cache 2 >
1607 WHILE
1608 (flushreg) DROP
1609 REPEAT
1610 offs-ebp -8 <> IF
1611 offs-ebp 8 + ## ebp add,
1612 THEN
1613 -8 TO offs-ebp
1616 \ access to meta-register
1617 : (tosn) ( n -- )
1618 DUP #tos-cache >= IF ." Request more registers." ABORT THEN
1619 cache@ (vreg>reg) SZ-32 (#reg) ;
1621 : tos0 0 (tosn) ;
1622 : tos1 1 (tosn) ;
1623 : tos2 2 (tosn) ;
1624 : tos3 3 (tosn) ;
1625 : tos4 4 (tosn) ;
1626 : tos5 5 (tosn) ;
1628 : ([tosn]) ( offs n -- )
1629 DUP #tos-cache >= IF ." Request more registers." ABORT THEN
1630 cache@ (vreg>reg) (#[reg]) ;
1632 : [tos0] 0 ([tosn]) ;
1633 : [tos1] 1 ([tosn]) ;
1634 : [tos2] 2 ([tosn]) ;
1635 : [tos3] 3 ([tosn]) ;
1636 : [tos4] 4 ([tosn]) ;
1637 : [tos5] 5 ([tosn]) ;
1639 : (freen) ( n -- )
1640 DUP #free-req >= IF ." Request more free registers." ABORT THEN
1641 free@ (vreg>reg) SZ-32 (#reg) ;
1643 : free0 0 (freen) ;
1644 : free1 1 (freen) ;
1645 : free2 2 (freen) ;
1646 : free3 3 (freen) ;
1647 : free4 4 (freen) ;
1648 : free5 5 (freen) ;
1650 : ([freen]) ( offs n -- )
1651 DUP #free-req >= IF ." Request more free registers." ABORT THEN
1652 free@ (vreg>reg) (#[reg]) ;
1654 : [free0] 0 ([freen]) ;
1655 : [free1] 1 ([freen]) ;
1656 : [free2] 2 ([freen]) ;
1657 : [free3] 3 ([freen]) ;
1658 : [free4] 4 ([freen]) ;
1659 : [free5] 5 ([freen]) ;
1661 : (vreg>reg_l) ( vreg -- )
1662 DUP (is-a-d) INVERT IF ." Can't get lower part of edi or esi." ABORT THEN
1663 CASE
1664 VREG-EAX OF reg-al ENDOF
1665 VREG-EDX OF reg-dl ENDOF
1666 VREG-ECX OF reg-cl ENDOF
1667 VREG-EBX OF reg-bl ENDOF
1668 ENDCASE ;
1670 : (vreg>reg_h) ( vreg -- )
1671 DUP (is-a-d) INVERT IF ." Can't get higher part of edi or esi." ABORT THEN
1672 CASE
1673 VREG-EAX OF reg-ah ENDOF
1674 VREG-EDX OF reg-dh ENDOF
1675 VREG-ECX OF reg-ch ENDOF
1676 VREG-EBX OF reg-bh ENDOF
1677 ENDCASE ;
1679 \ n is the nr of the 32 bit virtual register
1680 : (free_l) ( n -- )
1681 DUP #free-req >= IF ." Request more free registers." ABORT THEN
1682 free@ (vreg>reg_l) ;
1684 : (free_h) ( n -- )
1685 DUP #free-req >= IF ." Request more free registers." ABORT THEN
1686 free@ (vreg>reg_h) ;
1688 : free0l 0 (free_l) ;
1689 : free1l 1 (free_l) ;
1690 : free2l 2 (free_l) ;
1691 : free3l 3 (free_l) ;
1692 : free4l 4 (free_l) ;
1693 : free5l 5 (free_l) ;
1694 : free0h 0 (free_h) ;
1695 : free1h 1 (free_h) ;
1696 : free2h 2 (free_h) ;
1697 : free3h 3 (free_h) ;
1698 : free4h 4 (free_h) ;
1699 : free5h 5 (free_h) ;
1701 : (tos_l) ( n -- )
1702 DUP #tos-cache >= IF ." Request more registers." ABORT THEN
1703 cache@ (vreg>reg_l) ;
1705 : (tos_h) ( n -- )
1706 DUP #tos-cache >= IF ." Request more registers." ABORT THEN
1707 cache@ (vreg>reg_h) ;
1709 : tos0l 0 (tos_l) ;
1710 : tos1l 1 (tos_l) ;
1711 : tos2l 2 (tos_l) ;
1712 : tos3l 3 (tos_l) ;
1713 : tos4l 4 (tos_l) ;
1714 : tos5l 5 (tos_l) ;
1715 : tos0h 0 (tos_h) ;
1716 : tos1h 1 (tos_h) ;
1717 : tos2h 2 (tos_h) ;
1718 : tos3h 3 (tos_h) ;
1719 : tos4h 4 (tos_h) ;
1720 : tos5h 5 (tos_h) ;
1722 \ Scale index base addressing. All words are ( offs -- )
1723 : [tos0+tos1] SC-1 0 cache@ (vreg>reg) 1 cache@ (vreg>reg) (#[sib]) ;
1724 : [4*tos0+tos1] SC-4 0 cache@ (vreg>reg) 1 cache@ (vreg>reg) (#[sib]) ;
1726 \ ==============================================================================
1727 \ =============================== compiler support =============================
1728 \ ==============================================================================
1730 \ Produce a near jump and put the address of the cell with the distance on the
1731 \ stack. Due to the fact that these jumps are relative no relocation is
1732 \ nessesary.
1733 : fwd-jmp ( xt -- addr )
1734 0 ## EXECUTE asm-here 4 - ;
1736 \ Resolve the jump to this address.
1737 : resolve-jmp ( fwd-addr -- )
1738 asm-here \ addr here
1739 over 4 + - \ addr rel
1740 SWAP asm-! ;
1742 \ Save the allocator state in the returned dyn-array.
1743 : allocator-state ( addr -- )
1744 #tos-cache OVER C! CHAR+ \ addr
1745 #tos-cache 0 ?DO \ addr
1746 I cache@ \ addr reg
1747 OVER C! CHAR+
1748 LOOP
1749 offs-ebp SWAP C!
1752 \ Set the bit with the number vreg for each register vreg in the state.
1753 : (state-mask) ( state #regs -- mask )
1754 0 -ROT 0 ?DO \ mask state
1755 DUP I CHARS + C@ \ mask state vreg
1756 1 SWAP LSHIFT \ mask state vrmask
1757 ROT OR SWAP \ mask state
1758 LOOP DROP ;
1760 \ Find the lowest bit set in x and return it's number.
1761 : lowest-bit ( x -- nr )
1762 #USEREGS 0 DO \ x
1763 1 I LSHIFT OVER AND \ x reg?
1764 IF \ x
1765 DROP I UNLOOP EXIT
1766 THEN
1767 LOOP DROP REG-NONE ;
1769 \ Try to load a register that is in state but not in cache. There must be at
1770 \ least one of them since (alloc-load) is called only with fewer regs in cache
1771 \ than in state.
1772 : ((alloc-load)) ( state #regs -- )
1773 ( find regs in cache )
1774 (cached-mask) \ state #regs cache-mask
1775 ( find regs in state )
1776 -ROT (state-mask) \ cache-mask state-mask
1777 ( leave all flags that are in state-mask AND NOT in cache-mask )
1778 SWAP INVERT AND \ load-mask
1779 DUP 0= IF (internal-error) THEN \ load-mask
1780 lowest-bit \ vreg
1781 (#load) ;
1783 \ Load as many registers as nessesary. Try to use those that needed.
1784 : (alloc-load) ( state #regs -- state #regs )
1785 BEGIN
1786 #tos-cache \ state #regs #tc
1787 OVER \ state #regs #tc #regs
1789 WHILE
1790 2DUP ((alloc-load))
1791 REPEAT
1794 \ Flush some register till the same number as in state in reached.
1795 : (alloc-flush) ( #regs -- #regs )
1796 BEGIN
1797 #tos-cache OVER \ #regs #tc #regs
1799 WHILE
1800 (flushreg) DROP
1801 REPEAT ;
1803 \ Exchange vreg and cache(ind) if nessesary.
1804 : (alloc-adjust) ( vreg ind -- )
1805 cache@ \ vreg creg
1806 2DUP = IF
1807 2DROP
1808 ELSE
1809 tc-xchg
1810 THEN ;
1812 \ Perform sign extension from a byte to a cell.
1813 : (sign-extend) ( c -- n )
1814 DUP 128 AND \ c sign-bit
1815 0<> 255 INVERT AND OR
1818 \ Retrieve the save offset and correct ebp if nessesary. If save-flags? is
1819 \ true, the CPU flags are saved before and restored afterwards ( add could
1820 \ change them).
1821 : (fix-offs) ( save-flags? state ind -- )
1822 CHARS + C@ (sign-extend) \ save? dest-offs
1823 offs-ebp OVER = IF
1824 2DROP
1825 ELSE \ save? dest-offs
1826 OVER IF pushf, THEN \ save? dest-offs
1827 offs-ebp OVER -
1828 ## ebp add,
1829 TO offs-ebp \ save?
1830 IF popf, THEN
1831 THEN ;
1833 \ Rebuild the allocator to the given state. The index of the user data in
1834 \ state is returned.
1835 \ Algo:
1836 \ 1. too few regs cached? -> load registers
1837 \ 2. too many regs cached? -> flush registers
1838 \ 3. exchange regs
1839 : allocator-rebuild ( save-flags? state -- )
1840 DUP C@ SWAP CHAR+ SWAP \ save? state #regs
1841 (alloc-load)
1842 (alloc-flush) \ save? state #regs
1843 0 SWAP \ save? state ind #regs
1844 0 ?DO \ save? state ind
1845 2DUP CHARS + C@ \ save? state ind reg(ind)
1846 OVER (alloc-adjust) \ save? state ind
1848 LOOP \ save? state ind
1849 (fix-offs) ;
1851 \ Store the state in the allocator without generating code.
1852 : allocator-store ( state -- )
1853 DUP C@ \ state #regs
1854 DUP TO #tos-cache
1855 1 CHARS SWAP \ state ind #regs
1856 0 ?DO \ state ind
1857 2DUP CHARS + C@ \ state ind vreg
1858 I cache! \ state ind
1859 CHAR+
1860 LOOP \ state ind
1861 + C@ (sign-extend) TO offs-ebp ;