assets/developer-notes/stephanie-gawroriski/2016/07/02.mkd

   1 # 2016/07/02
   2
   3 ## 11:53
   4
   5 This weekend is a holiday weekend, with Monday being the 4th of July.
   6
   7 ## 12:03
   8
   9 I should rework the documentation a bit and have a completely standalone port
  10 section of sorts. Then I can have user and developer bits in their specific
  11 sections.
  12
  13 ## 14:28
  14
  15 One thing I need when it comes to the class file decoder is remembering and
  16 storing the class flags for potential usage later. I will need to document
  17 the blob format in a way where it allows blobs to be output without needing
  18 future details. As such, this means that the table of contents in a blob will
  19 be last.
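
A minimal sketch of why a trailing table of contents allows streaming output: each entry is written as soon as it is available, and only its name, offset, and class flags are remembered until the end. The class name `TrailingTocBlobWriter`, the field layout, and the big-endian byte order are all assumptions made for illustration and are not the actual blob format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: entry data is streamed first, the table of contents last.
final class TrailingTocBlobWriter {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final List<String> names = new ArrayList<>();
    private final List<Integer> offsets = new ArrayList<>();
    private final List<Integer> flags = new ArrayList<>();

    // Writes the entry data at once; only name, offset, and flags are kept.
    void writeEntry(String name, int classFlags, byte[] data) {
        names.add(name);
        offsets.add(bytes.size());
        flags.add(classFlags);
        bytes.write(data, 0, data.length);
    }

    // Appends the table of contents, then its offset as the last four bytes.
    byte[] finish() {
        int tocOffset = bytes.size();
        writeInt(names.size());
        for (int i = 0; i < names.size(); i++) {
            byte[] utf = names.get(i).getBytes(StandardCharsets.UTF_8);
            writeInt(utf.length);
            bytes.write(utf, 0, utf.length);
            writeInt(offsets.get(i));
            writeInt(flags.get(i));
        }
        writeInt(tocOffset);
        return bytes.toByteArray();
    }

    // Big-endian 32-bit write (the byte order is an assumption).
    private void writeInt(int v) {
        bytes.write(v >>> 24);
        bytes.write(v >>> 16);
        bytes.write(v >>> 8);
        bytes.write(v);
    }
}
```

A reader would start from the last four bytes to locate the table of contents, so nothing about later entries has to be known while earlier ones are written.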
  20
  21 ## 14:36
  22
  23 I should have a class which can be given a byte buffer or some other class
  24 which is used to read from say an `int[]` or `byte[]` array to access the
  25 details within a blob.
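
One possible shape for such a class is sketched below: a small interface with one implementation per backing store. The names `BlobAccessor`, `ByteArrayBlobAccessor`, and `IntArrayBlobAccessor` are hypothetical, and packing four bytes per `int` with the most significant byte first is an assumption.

```java
// Hypothetical sketch: a common read interface over different backing stores.
interface BlobAccessor {
    // Reads the unsigned byte at the given byte offset.
    int readUnsignedByte(int offset);

    // Reads a big-endian 32-bit value starting at the given byte offset.
    default int readInt(int offset) {
        return (readUnsignedByte(offset) << 24)
            | (readUnsignedByte(offset + 1) << 16)
            | (readUnsignedByte(offset + 2) << 8)
            | readUnsignedByte(offset + 3);
    }
}

// Backed by a byte[] array.
final class ByteArrayBlobAccessor implements BlobAccessor {
    private final byte[] data;

    ByteArrayBlobAccessor(byte[] data) {
        this.data = data;
    }

    @Override
    public int readUnsignedByte(int offset) {
        return data[offset] & 0xFF;
    }
}

// Backed by an int[] array, four bytes per element, most significant first.
final class IntArrayBlobAccessor implements BlobAccessor {
    private final int[] data;

    IntArrayBlobAccessor(int[] data) {
        this.data = data;
    }

    @Override
    public int readUnsignedByte(int offset) {
        int word = data[offset >>> 2];
        int shift = 24 - ((offset & 3) << 3);
        return (word >>> shift) & 0xFF;
    }
}
```

Code that walks the blob would only see `BlobAccessor`, so a `ByteBuffer` backed variant could be added later without touching it.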
  26
  27 ## 15:36
  28
  29 As an alternative to a table of contents kind of thing, I can have a linked list
  30 of sorts through the executable. However, backlinks would not operate at all.
  31 I would say that for simplicity, the blob can be directly memory mapped and
  32 have its structure accessed directly. Also it may be reasonable to have a case
  33 where there are two binaries, one which contains the raw data and another
  34 which contains the table of contents. If the table of contents remains apart
  35 from the executable code they can be linked into the binary using different
  36 means. Alternatively, instead of an `OutputStream` passed into the SSJIT,
  37 there is instead a result of a compilation. So this "smarter" class would have
  38 stuff such as "create new field" and other such things. Then an implementation
  39 of the given writer can be used. Then this way the `SSJIT` is not locked to
  40 a single output format but one which could be wrapped using multiple means.
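
As a rough sketch of the "smarter" class idea: the JIT would call structured operations and a writer implementation decides how they are stored. The interface name `CompilationOutput`, its two methods, and the `ListingOutput` writer are hypothetical and only illustrate the separation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the JIT describes what it produced, not how it is stored.
interface CompilationOutput {
    void createField(String name, String descriptor, int flags);

    void createMethod(String name, String descriptor, byte[] machineCode);
}

// One possible writer that only records a listing; a blob writer or some
// other format would implement the same interface.
final class ListingOutput implements CompilationOutput {
    final List<String> lines = new ArrayList<>();

    @Override
    public void createField(String name, String descriptor, int flags) {
        lines.add("field " + name + " " + descriptor + " flags=" + flags);
    }

    @Override
    public void createMethod(String name, String descriptor, byte[] machineCode) {
        lines.add("method " + name + descriptor + " (" + machineCode.length + " bytes)");
    }
}
```

Swapping the output format then means supplying a different `CompilationOutput`, with no change to the code that calls it.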
  41
  42 ## 15:46
  43
  44 Then if the output format is not that nice, it can completely be replaced with
  45 a better one without changing any code. So I could literally have an output
  46 which writes ELF binaries.
  47
  48 ## 15:51
  49
  50 Essentially what could happen for example when it comes to Linux, is that
  51 classes could be compiled to ELFs, then when they need to be opened they can
  52 be linked in as such. Although, I am not sure if Linux supports dynamic linking
  53 of libraries that exist only in memory. Looking into it, it does not. So when
  54 it comes to generation, I will need to use blobs and such.
  55
  56 ## 15:58
  57
  58 When it comes to the output, I should support multiple classes being output
  59 into a single `SSJITOutput`. This way when working with the initial SquirrelJME
  60 binary which the user would use, all classes exist and are pre-merged into it.
  61 The native code would have to be generated in a way where all code acts
  62 together as a single unit. So in this case, entire JARs will be merged into a
  63 single fragment. However, I will need to devise a means where there can be
  64 multiple namespaces within the output (for multi-JAR support).
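
A small sketch of what namespacing could look like, assuming each JAR becomes one namespace and entries are looked up by namespace plus name. The class `NamespacedBlobContents` and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: entries from several JARs kept apart by namespace.
final class NamespacedBlobContents {
    private final Map<String, Map<String, byte[]>> namespaces = new LinkedHashMap<>();

    // Stores an entry under the given namespace (for example, a JAR name).
    void put(String namespace, String entryName, byte[] data) {
        namespaces.computeIfAbsent(namespace, k -> new LinkedHashMap<>())
            .put(entryName, data);
    }

    // Returns the entry data, or null if the namespace or entry is unknown.
    byte[] get(String namespace, String entryName) {
        Map<String, byte[]> entries = namespaces.get(namespace);
        return (entries == null) ? null : entries.get(entryName);
    }
}
```

With this, two JARs that both contain an entry named `Main` do not collide, because each lookup names the JAR it wants.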
  65
  66 ## 16:01
  67
  68 For example, for the ELF format all symbols and such will be generated and
  69 placed in an output ELF file. Although doing this within an ELF may be complex
  70 because there may be some things which are unknown (such as how large a given
  71 section is).
  72
  73 ## 16:06
  74
  75 So I suppose for simplicity, keep with the blobs instead of a container format.
  76 The container would be the ELF which loads and initializes the classes
  77 stored within the blob. But still allow for multiple JARs and resources to be
  78 namespaced in a single blob.
  79
  80 ## 17:03
  81
  82 Thinking about it, when it comes to generating the bootstrap code, it will be
  83 very difficult to have a split apart JIT when it comes to building the initial
  84 binary which would generate the machine code. Another consideration is the
  85 number of objects that will need to be initialized for every class. So I
  86 suppose I need a separate native code generator for which there is a single
  87 instance (with functions as I currently have it), which is then attached to
  88 and associated with a state tracker. This way there is a single instance; if
  89 multiple functions need to interact with the same state, they can use the
  90 passed state tracker instead.
  91
  92 ## 17:07
  93
  94 So `SSJIT` is given a `NCGManager`. That `NCGManager` will then create any
  95 associated output with a given state and singular set of instances for
  96 functions.
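
The single function instance plus passed state idea might look roughly like this. Only the name `NCGManager` comes from the notes; `NCGManagerSketch`, `NCGState`, `NCGFunctions`, and the fixed four byte instruction size are placeholders chosen for illustration.

```java
// Hypothetical: mutable state for one output, passed explicitly to functions.
final class NCGState {
    int bytesEmitted;
}

// Hypothetical: a stateless set of generator functions shared by all outputs.
final class NCGFunctions {
    // Pretends to emit one fixed-size instruction (4 bytes is a placeholder).
    void emitNop(NCGState state) {
        state.bytesEmitted += 4;
    }
}

// Hypothetical stand-in for the manager: one function set, many states.
final class NCGManagerSketch {
    private final NCGFunctions functions = new NCGFunctions();

    // Each output gets its own state tracker.
    NCGState newState() {
        return new NCGState();
    }

    // Every caller receives the same function instance.
    NCGFunctions functions() {
        return functions;
    }
}
```

Because the functions hold no fields of their own, creating another output only allocates another state object.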
  97
  98 ## 17:11
  99
 100 Also going to place anything related to the JIT in
 101 `net.multiphasicapps.squirreljme.jit`, then branch from that for example. The
 102 code generators will be a bit higher level. I would suppose below the
 103 code generator there would be the assembler.
 104
 105 ## 17:14
 106
 107 Then this way, when it comes to generating native code I can have similar means
 108 of generating. I will have variants, but instead of operating system
 109 modifications of functions, I instead will have modifications of the code
 110 generator. The assembler, code generator, and JIT should at best be of a single
 111 state which will enable only a single assembler or other instance to exist at
 112 a time. This would reduce the allocation count. I will suppose that for the
 113 assembler, it will be given an output stream. The only consideration is that
 114 there may be the potential for sub-variants of variants (such as big endian
 115 and such).
 116
 117 ## 17:19
 118
 119 So what I will need is a means where I can easily specify all of the flags
 120 which may change how an assembler operates, such as which features are
 121 available and such. I suppose what I can do is have a standard setup of sorts.
 122 There would be the variant, but that would define a standard set of flags
 123 which are supported by a system. So `powerpc32+g4` will turn into
 124 `powerpc32+g4,altivec,big`. A variant will by default give a set of flags
 125 which are each set to on or off. So the **big** flag for big endian would
 126 be set by default, and I suppose putting a `~` before it so it is **~big** would
 127 disable it, even if it were on by default with a given variant. The `generic`
 128 variant would just choose a set of flags to use. So I suppose for simplicity,
 129 a given CPU could have their variants mapped to enumeration entries
 130 potentially as a kind of additive set of extra variants and such.
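
The flag expansion described here can be sketched as follows. The only default row taken from the notes is `powerpc32+g4` giving `altivec` and `big`; the class name `VariantFlags`, the comma separated override syntax, and the `softfloat` flag used in the usage note are assumptions.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: expands a variant into its default flags, then
// applies explicit overrides where a leading '~' switches a flag off.
final class VariantFlags {
    private static final Map<String, String[]> DEFAULTS = new HashMap<>();

    static {
        // Only this row comes from the notes; other variants would be added.
        DEFAULTS.put("powerpc32+g4", new String[]{"altivec", "big"});
    }

    static String expand(String spec) {
        String[] parts = spec.split(",");
        String variant = parts[0];

        // Start from the defaults of the variant, if it has any.
        Set<String> flags = new LinkedHashSet<>();
        String[] defaults = DEFAULTS.get(variant);
        if (defaults != null) {
            for (String d : defaults) {
                flags.add(d);
            }
        }

        // Apply the overrides in the order they were given.
        for (int i = 1; i < parts.length; i++) {
            String f = parts[i].trim();
            if (f.startsWith("~")) {
                flags.remove(f.substring(1));
            } else if (!f.isEmpty()) {
                flags.add(f);
            }
        }

        StringBuilder sb = new StringBuilder(variant);
        for (String f : flags) {
            sb.append(',').append(f);
        }
        return sb.toString();
    }
}
```

Under these assumptions `expand("powerpc32+g4")` gives `powerpc32+g4,altivec,big`, while `expand("powerpc32+g4,~big,softfloat")` gives `powerpc32+g4,altivec,softfloat`.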
 131
 132 ## 17:28
 133
 134 I would suppose instead of this, that I should have an assembler configuration
 135 which is used to initialize the assembler.
 136
 137 ## 17:48
 138
 139 Actually, having it where the assembler has add operations with types and such
 140 will be a bit complex. What I could have instead is a kind of pipeline of
 141 sorts similar to Java 8's streams. So basically there will be register
 142 selection, essentially something such as `source(RegisterType.INT, "r1")` which
 143 will act as the source register. Then there will be the same for the
 144 destination. Then following that, there will be an execution which performs the
 145 operation (such as an `add`). The native part of the assembler can check if the
 146 given operation is valid between two given registers.
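
A rough sketch of the pipeline style, assuming a fluent class where register selection comes first and the operation last. `source(RegisterType.INT, "r1")` is taken from the notes; `AssemblerSketch`, `destination`, `RegisterType.FLOAT`, the textual listing, and the rule that an `add` needs matching register types are assumptions.

```java
// INT appears in the notes; FLOAT is an assumed second register type.
enum RegisterType { INT, FLOAT }

// Hypothetical sketch: select registers, then execute an operation on them.
final class AssemblerSketch {
    private RegisterType sourceType;
    private RegisterType destinationType;
    private String source;
    private String destination;
    private final StringBuilder listing = new StringBuilder();

    AssemblerSketch source(RegisterType type, String name) {
        this.sourceType = type;
        this.source = name;
        return this;
    }

    AssemblerSketch destination(RegisterType type, String name) {
        this.destinationType = type;
        this.destination = name;
        return this;
    }

    // Stands in for the native part: checks validity, then emits text.
    AssemblerSketch add() {
        if (source == null || destination == null) {
            throw new IllegalStateException("source and destination not selected");
        }
        if (sourceType != destinationType) {
            throw new UnsupportedOperationException(
                "add between " + sourceType + " and " + destinationType);
        }
        listing.append("add ").append(destination).append(", ")
            .append(source).append('\n');
        return this;
    }

    String listing() {
        return listing.toString();
    }
}
```

A call chain then reads like `new AssemblerSketch().source(RegisterType.INT, "r1").destination(RegisterType.INT, "r2").add()`, and a real implementation would write machine code where this one appends text.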
 147
 148 ## 17:55
 149
 150 I will also need some kind of native assembly provider functions, ones that
 151 could be given a single instance, where they are given the `Assembler` where
 152 options can be read from (such as the selected source/dest registers), the
 153 configuration, and the output stream. However the constant handling of this
 154 would be a bit ugly in a way. So I suppose the assembler becomes abstract.
 155 However a problem with this is for each architecture I could only ever have a
 156 single assembler. The assembler would never be patchable and third parties
 157 would never be able to add support for an unsupported CPU.
 158
 159 ## 17:59
 160
 161 So I suppose something similar as I have previously talked about. Basically I
 162 will have an `ASMGenlet` which would be an interface. Multiple genlets could be
 163 created and have a given priority. The first argument to the genlet would be
 164 the assembler itself. However, that would create much complexity. There would
 165 be multiple instances of genlets created for every method. This definitely
 166 would not be fast. Then genlets in every case would have to check which CPU
 167 was selected for code generation. So when it comes to code generation, it will
 168 definitely not be small, nor will it be fast, nor will it be simple. I suppose
 169 I am thinking of far too complex of a solution. So instead, just have an
 170 `Assembler` with the source and destination as I have thought up. If done
 171 correctly, I can even have it where a single assembler could be created and
 172 shared across all JIT instances (there would need to be a state reset). Since
 173 some operations might not be supported by a given CPU, I would need to have
 174 failure cases that throw an unsupported assembly operation.
 175
 176 ## 18:08
 177
 178 I can then layer the code generator on top which performs virtualized
 179 operations in the event the assembler does not support it (so if hardware float
 180 is not supported, then it will be handled using software instead by replacing
 181 the code with method calls potentially). However, the layering of the JIT goes
 182 deep through all levels. Having three levels will keep things complicated. So
 183 instead of three things, the JIT should be merged into one. When it comes to
 184 support for native architectures I can have a generator class and a set of
 185 registers which are supported by a given CPU. The operating system variant
 186 could change how registers are used and such. Although it is still complex.
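
The fallback to software can be sketched as a code generator that tries the assembler first and substitutes a method call when the operation is unsupported. The names `FloatAssembler` and `CodeGeneratorSketch`, the use of `UnsupportedOperationException`, and the emitted text (including the `softfloat_add` routine name) are hypothetical.

```java
// Hypothetical: the assembler either emits a hardware float add or throws
// UnsupportedOperationException when the CPU has no floating point unit.
interface FloatAssembler {
    void floatAdd(StringBuilder out);
}

// Hypothetical sketch: layers a software fallback on top of the assembler.
final class CodeGeneratorSketch {
    private final FloatAssembler assembler;

    CodeGeneratorSketch(FloatAssembler assembler) {
        this.assembler = assembler;
    }

    void generateFloatAdd(StringBuilder out) {
        try {
            assembler.floatAdd(out);
        } catch (UnsupportedOperationException e) {
            // No hardware support: call a software routine instead.
            out.append("call softfloat_add\n");
        }
    }
}
```

This is the three level layering in miniature, which also shows where the extra complexity the entry worries about comes from.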
 187
 188 ## 18:14
 189
 190 One thing I do not want to have is a few thousand different implementations of
 191 the native JIT for a given CPU. I can have an architecture specific one for
 192 a given operating system however. But generally I only want a single one for
 193 each architecture, so the PowerPC one would target only PowerPC and there would
 194 be no other implementation of a PowerPC assembler.
 195
 196 ## 18:16
 197
 198 Ultimately I could support only a single CPU variant and potentially if it has
 199 floating point or not. I could ignore vectorization (that is a whole other
 200 complex subject anyway). So generated code for `PowerPC` would quite literally
 201 target the _G1 (603)_ processor for the most part (the original). However some
 202 CPU architectures have better instructions in later generations or have
 203 removed older ones.
 204
 205 ## 18:19
 206
 207 So right now I am going in circles. I want a small and fast JIT, but one that
 208 is simple to write and does not consist of overly complex code. So basically
 209 something similar to `SSJIT` except there is one instance of it and it is
 210 given an input class to transform (instead of one instance per class). It
 211 would not be multi-threaded capable. I would also say that support for a given
 212 CPU and its variant can be used. Support for a given architecture would extend
 213 the actual JIT class to provide the native functionality. To keep internal
 214 details hidden, package private will be used in many places. This way I can
 215 have the class decoder be external and call into the JIT.
 216
 217 ## 18:39
 218
 219 Actually a single instance `JIT` would be very ugly.
 220
 221 ## 18:55
 222
 223 I can have a small JIT modifier which, if found, will be passed a JIT.
 224 It can then see which instance the JIT is and possibly perform register banning
 225 or other minor tweaks.
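
A sketch of how such a modifier could ban a register, assuming the JIT exposes the set of registers it may allocate. All names here (`JITSketch`, `PowerPCJITSketch`, `JITModifier`, `BanR13Modifier`) and the choice of `r13` as the banned register are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical base JIT exposing the registers it is allowed to allocate.
class JITSketch {
    final Set<String> allocatable = new LinkedHashSet<>();
}

// Hypothetical architecture specific JIT with a few example registers.
final class PowerPCJITSketch extends JITSketch {
    PowerPCJITSketch() {
        allocatable.add("r3");
        allocatable.add("r4");
        allocatable.add("r13");
    }
}

// Hypothetical modifier hook: it is handed the JIT and may tweak it.
interface JITModifier {
    void modify(JITSketch jit);
}

// Bans r13 for a made-up operating system that reserves it.
final class BanR13Modifier implements JITModifier {
    @Override
    public void modify(JITSketch jit) {
        // Only touch JIT instances this modifier knows about.
        if (jit instanceof PowerPCJITSketch) {
            jit.allocatable.remove("r13");
        }
    }
}
```

The architecture JIT stays a single implementation, and the operating system difference lives entirely in the small modifier.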
 226