assets/developer-notes/stephanie-gawroriski/2016/03/20.mkd

   1 # 2016/03/20
   2
   3 ## 00:12
   4
   5 Refactor is going well, things are much nicer now.
   6
   7 ## 00:20
   8
   9 First day of spring also.
  10
  11 ## 15:21
  12
  13 This refactored code is much simpler now and likely much faster too.
  14
  15 ## 15:57
  16
  17 I believe I have a short or too long of a read somewhere.
  18
  19 ## 16:04
  20
  21 Based on the dump and the values I am seeing, I am short two bytes.
  22
  23 ## 16:07
  24
  25 And looking at the hex dump of the class, those missing two bytes are for
  26 the ignored attribute, since the length is an integer and not an unsigned
  27 short.
  28
  29 ## 16:15
  30
  31 Method handles will require slightly more complex work to make them
  32 cachable because identifiers can contain `(` and `)`. They are also not in
  33 fixed positions. So I suppose what I will do then is have a fixed offset and
  34 length similar to the binary name, then substring as required.
  35
  36 ## 18:49
  37
  38 `sloccount` gives me `java: 13646 (100.00%)`. Also, the parsing of the block
  39 of code will be inline so no byte arrays have to be filled by the attribute
  40 and then parsed that way. Thus, the class parsing code will remain light and
  41 not require too heavy allocations beyond the normal allocations. During my
  42 pass, I can also use what is used by the method and ignore stuff that is not
  43 used. Since the constant pool is shared with everything, some stuff might not
  44 be needed at all for specific methods.
  45
  46 ## 19:01
  47
  48 The good thing is that the subroutine byte codes are completely unsupported,
  49 which means that parsing classes is simpler and I less likely have to worry
  50 about recursive states where a byte code can have a few thousand different
  51 states.
  52
  53 ## 19:05
  54
  55 Looks like `StackMap` is a bit less compact version of the `StackMapTable`.
  56 That is, the information appears to be similar so far, except that it is
  57 simpler to parse. Basically think of if everything in the `StackMapTable`
  58 were entire frames. I see the very familiar `verification_type_info`. And
  59 reading the specification, the type information is exactly the same. So when
  60 writing the checker, I can combine and check for both states accordingly. This
  61 means that `StackMap` and `StackMapTable` will be folded into entire full
  62 states which are then checked on input instructions.
  63
  64 ## 21:34
  65
  66 Thinking about it, the `StackMapTable` might not even be needed. I could
  67 technically skip it if I force the requirements as if `StackMapTable` were
  68 there. The way Java byte code runs on moderm VMs is that each address in the
  69 byte code operations must have only a single valid state. The Java compiler
  70 enforces this. However, I would have to check if older J2ME classes also share
  71 this property. Ignoring the stack map table would simplify things greatly
  72 because I would not need to parse it at all. So there would quite literally
  73 only be two attributes. However, if a `StackMapTable` is illegal then the
  74 normal VM would fail. However, the table does help in cases where exceptions
  75 are handled to know if local variables and such are filled or not. As for
  76 the byte code, I am going to transform it to a kind of register based format
  77 which will be used by the interpreter and in the future the native recompiler.
  78 Ignoring the table would be faster and would produce code faster. It is kind
  79 of redundent, although skipping it would violate the specification.
  80
  81 ## 22:07
  82
  83 So the thing to do would be to generate a variant of the Java byte code which
  84 uses virtual registers instead of a stack. With registers, it simplifies the
  85 operation of execution and translation to generally register based machines.
  86 For speed and simplicity I would likely not choose SSA form (register coloring
  87 is fun). There would be basic operations, where then certain operations are
  88 instead just compacted (`push 2 r2`, `push 2 r3`, followed by `add`) into a
  89 single form (`add r2 r3 r4`).
  90
  91 ## 22:12
  92
  93 So I will need a kind of linear program editor which can act on a bunch of
  94 buffers to store the register based code. I am not going to use a large number
  95 of classes to represent single operations. Essentially it would be like a
  96 `ListIterator` where there is a current instruction pointer which points to
  97 register instruction code. I do however have to have a compact operation form
  98 however to not waste space and such. RISC would be likely the best choice. I
  99 should also probably have a fixed instruction size for simplicity where they
 100 all share the same form. This would be something similar to for example MMIX.
 101 Alternatively, I could just use MMIX operations to represent my programs and
 102 then use that as a IL. MMIX is a 64-bit instruction set however. MMIX is
 103 however big endian. It also implements IEEE 754 which should match Java.
 104
 105 ## 22:19
 106
 107 However MMIX is really for a real CPU, while I just need an intermediate
 108 language which is speedy and compact. It needs to be compact because there
 109 may be a JIT which is running on systems with only for example 64KiB of memory
 110 available. With SquirrelJME in memory there might only be 32KiB. So the code
 111 would have to be compiled to native code very compactly.
 112
 113 ## 22:22
 114
 115 Time to do some thinking.
 116
 117     111111100 00000000
 118     654321098 76543210
 119     ========= ========
 120     ooooodddd ddssssss
 121
 122 If I have 5-bits for opcodes then I can have 32 instructions, which is quite
 123 small. Then with 6-bit operands I have 64 registers. Using a source and
 124 destination would be compact. Thus add would be `d = s + d` for example. I
 125 should however have an "infinite" number of registers to choose from however.
 126
 127 ## 22:42
 128
 129 I suppose I can design something novel then as an intermediate code. I can use
 130 a prefix so to speak as a kind of huffman kind of detail. However, if I were
 131 to use 3-byte operations, things might get simpler.
 132
 133 ## 23:07
 134
 135 Here is an idea. It would register based, but a CISC (for compactness).
 136 Basically all operations would start at a single byte. 6-bits are used for the
 137 operand (add, subtract, etc.) while 2 bits are used for flags which specify
 138 registers to use. There are two register slots, A and B. When a flag is set
 139 in a register then another byte is expected which is used to set the new A or
 140 B register. The next operation which lacks the bits will use the register set
 141 by the previous operation. Operations such as new, anewarray, multianewarray,
 142 and a few others will just be method calls (since that is sane) or traps. I
 143 would limit the number of active registers to around 256 though. However if
 144 methods are that complex that they use so many registers then they would have
 145 to fail due to the exceeded limitations. This would mean that methods which
 146 are gigantic and extremely complex (probably horrible code) would not work.
 147 However, this environment is made for embedded systems so it would be insane
 148 to have code such as that.
 149
 150 ## 23:13
 151
 152 One problem though are branches going back and forward. The implied register
 153 may vary between calls, so I suppose the thing to do would be to instead have
 154 a linear fallback to find the next implicative register. However, seeking
 155 backwards will be complicated because it might not be known if an instruction
 156 is a register specification or not. So I suppose to fix that, I would then
 157 limit the instruction ID to 5-bits and registers to 127. If the uppermost bit
 158 is set then it specifies an instruction, otherwise it is a register
 159 implication. This could also be used for consistency checking to make sure
 160 instructions are valid. The next thing to consider is which 32 instructions
 161 are used to represent operations. I suppose what I should do is write a
 162 documentation for the intermediate language format. That would keep the
 163 details together and allow for any future work around it.
 164
 165 ## 23:26
 166
 167 I should perhaps make a poll as to how many registers are truly needed. I also
 168 need to consider floating point too.
 169
 170