***DISCLAIMER***: _These notes are from the defunct k8 project which_
_precedes SquirrelJME. The notes for SquirrelJME start on 2016/02/26!_
_The k8 project was effectively a Java SE 8 operating system and as such_
_all of the notes are in the context of that scope. That project is no_
_longer my goal as SquirrelJME is the spiritual successor to it._
Thinking about it, I will do this. First the classes will be translated to a
nicer form that is easier to work with (class metrics and such). After that
they can either be dumped directly to KBF or to other things for special
kernel usage. For the special usage, magical stuff will be activated and made
so that it works. I figure for my kernel design there will be a core
hypervisor and the kernel will just be another hypervisor. The core
hypervisor will do whatever it wants, so when a sub-process such as the
kernel requests some memory it will give it freely. However, the core
hypervisor will be quite limited in that it will lack support for filesystems
and such, but it will permit whatever runs below it access to system
resources. So the core hypervisor will have fewer system calls in the IPC,
very basic stuff. The kernel will handle threading and can control the CPU
via the core hypervisor. The core hypervisor will need process and thread
management in it however, so it knows where to direct IPC calls made between
userspace processes and kernels. So when a kernel spawns a new subprocess it
will own that process. A subkernel running on the core will not have the
capability to kill another kernel on the same level, but could kill any
subprocess or thread that it owns (the kernel is a process). The core will
have simple process management and can time-slice kernels, where each kernel
will have to schedule itself or its other tasks to run. So multiple kernels
can schedule their tasks at once. However, a kernel running below another
kernel (in a virtual machine) would be incapable of forcing the scheduling of
tasks, so it has less control (it could kill them and change their properties
but it could not force a subprocess to be run ahead of a parent or sibling
process). However, the kernel can just allocate slices based on what is
runnable. So say two threads need to be run and another is sleeping: the
kernel can just request that a slice be given to each runnable thread.
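The ownership rule described above (a kernel may kill anything it transitively
owns, but never a sibling kernel) can be sketched as a small ownership tree.
All names here are hypothetical illustrations, not actual k8 APIs:

```java
import java.util.ArrayList;
import java.util.List;

class Process {
    final Process owner;          // null for the core hypervisor
    final List<Process> children = new ArrayList<>();
    boolean alive = true;

    Process(Process owner) {
        this.owner = owner;
        if (owner != null)
            owner.children.add(this);
    }

    /** A process may kill another only if it transitively owns it. */
    boolean mayKill(Process target) {
        for (Process p = target.owner; p != null; p = p.owner)
            if (p == this)
                return true;
        return false;
    }

    void kill(Process target) {
        if (!mayKill(target))
            throw new SecurityException("not an owned process");
        target.alive = false;
    }
}
```

With a core hypervisor owning two kernels, each kernel can kill its own
subprocesses but `mayKill` refuses a sibling kernel, which is exactly the
containment the note asks for.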
But back to the magic: the magic stuff allows stranger and more dangerous
transformations of Java code, such as enabling direct register access. Only
the core hypervisor will be compiled with this in mind; the kernel will not
be, and will instead rely on the core hypervisor to manage mappings and such.
So when the kernel wants to map some memory for a process (perhaps for a
memory-mapped file) it can just request that the hypervisor allocate (but not
fill) a virtual page for the specified process. Then when a page fault occurs
on it, things happen. This means that creation of direct ByteBuffers will be
handled by the core hypervisor (if it allows making one) and the objects will
be gifted or made visible to the process that requested them. If the process
dies or garbage collects the direct byte buffer then it is removed. Normal
non-direct ByteBuffers are always backed by an array anyway, so when those
are created it is just a new byte array. Direct byte buffers are a window
(managed by the core hypervisor, which creates this object) to real memory.
Direct byte buffers are required to be initialized to zero; however, a kernel
could request a direct buffer that does not clear its contents. If any
clearing is specified it will not be done until the buffer is actually
accessed (that is, it page faults), and a direct byte buffer might be split
on page boundaries, of which some might be allocated and others might not be.
The direct byte buffer would then need an owner which manages its contents
and paging status and knows which thread/process triggered it. So if it is
mapped to a file, then the process(es) accessing it will block until the
kernel signals ready. Data from a file will be read somewhere into that
memory region, then the page will be activated at that freshly allocated
region for the subprocess and it will be resumed.
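The lazy page-fault clearing above can be modeled in miniature: pages start
"unmapped" and are allocated and zeroed only on first touch. This is a toy
sketch standing in for the hypervisor's fault handler, not real paging code:

```java
class LazyDirectBuffer {
    static final int PAGE_SIZE = 4096;
    private final byte[][] pages;     // null entry = not yet faulted in

    LazyDirectBuffer(int length) {
        this.pages = new byte[(length + PAGE_SIZE - 1) / PAGE_SIZE][];
    }

    /** Simulated page fault: allocate and zero the page on first access. */
    private byte[] fault(int pageIndex) {
        byte[] page = pages[pageIndex];
        if (page == null)
            pages[pageIndex] = page = new byte[PAGE_SIZE]; // zeroed here
        return page;
    }

    byte get(int index) {
        return fault(index / PAGE_SIZE)[index % PAGE_SIZE];
    }

    void put(int index, byte value) {
        fault(index / PAGE_SIZE)[index % PAGE_SIZE] = value;
    }

    /** True if the given page has been touched (faulted in). */
    boolean isResident(int pageIndex) {
        return pages[pageIndex] != null;
    }
}
```

Note how a buffer split across page boundaries naturally ends up with some
pages allocated and others not, matching the note's description.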
What I need to do is figure out the format of KBF files. I did make a giant
map before, but that would be insufficient to handle all the new stuff in the
class file and everything in it (I did not know about a bunch of the stuff I
know now, then). I will need it to contain annotations, but the main thing
would be methods and the ABI; both of those need to work. My main goal is
that a class only needs to be loaded in memory once across every process;
this means that if one process loads the class SocketChannel it gets loaded
once in memory. All processes on the same kernel will share the same boot
classpath and cannot override it. Then later on, when another process decides
to load SocketChannel, since it is already in memory the process just needs
to initialize the class static stuff and linkage information. So this means
that every class will need linkage zones that are used by the method code:
basically a per-class import and export table (the table being static) that
is basically a function pointer elsewhere in memory to that specific class
code. However, the one complex thing to handle would be interfaces.
Interfaces could be in any order and have varying descriptors associated with
them. It is not possible for the caller to statically bind where something
points to in an interface. However, it could be done statically per class:
when a class is initialized it may have implemented interfaces, and in this
case the class has an interface table with the expected exports of that type
for each interface. So let's say that there are three classes A, B, C, and an
interface I. B and C both implement I, and A wants to call something in the
interface I. Due to the varying order of methods, A cannot know where the
method lies in B and C. B and C will both have a table that is common for the
interface, which maps where the imports and exports go for the interface
type. So A would obtain the interface table pointer for the specific B and C
objects for the interface, then do a normal jump with the function pointer.
Now the main issue is determining the actual interface specifier for a random
class. I could do a linear search but that would be insanely slow, as that
would have to be done for all interfaces. On a sub note, subclasses that
extend parent classes will need to just stack on all of their non-static
fields, so that something similar to struct magic like in C will work, where
you can have fake object orientation with multiple structs that each have the
same struct as their first member. You could cast both structs to that same
member struct type. Interfaces only have public static final fields so that
is a non-issue. Anyway, I will need to think about this. The struct stuff
mentioned before is
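The A/B/C dispatch above can be sketched with explicit per-class,
per-interface export tables: B and C each publish a slot-indexed table for
interface I, and a caller A resolves through the table instead of knowing
either class's method order. Everything here is illustrative, not a real KBF
layout:

```java
import java.util.HashMap;
import java.util.Map;

class InterfaceTables {
    /** One entry in an interface export table: a jump target. */
    interface Slot {
        Object call(Object self);
    }

    // classId -> (interfaceId -> slot-indexed export table)
    static final Map<String, Map<String, Slot[]>> TABLES = new HashMap<>();

    /** A class registers its export table for one interface it implements. */
    static void export(String classId, String ifaceId, Slot... slots) {
        TABLES.computeIfAbsent(classId, k -> new HashMap<>())
              .put(ifaceId, slots);
    }

    /** What caller A does: fetch the table for I on the target's class,
     *  then do a plain jump through the slot's function pointer. */
    static Object invoke(String classId, String ifaceId, int slot,
            Object self) {
        return TABLES.get(classId).get(ifaceId)[slot].call(self);
    }
}
```

A only ever agrees with B and C on the slot numbering of interface I; the
tables hide where each implementation actually lives.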
Thought about a least common denominator table for interfaces, meaning that
each class that is an interface gets a static index which points into an
interface map for each class, where the interface table is located so it can
execute methods and such. However, that would not work because if multiple
interfaces get loaded they may get placed in the same slot. The only safe
alternative would be to have each interface have its own index, where a class
would then have all of the interfaces mapped in. So say 2,000 interfaces are
loaded on a 32-bit system: that would add an overhead of 8,000 bytes to every
loaded class so it can store the entire gigantic table for every possible
interface. It would be fast, as classes could just index into it by the
interface index offset, dereference the table, then jump to the specified
pointer in the table. It can work but it is not memory efficient.

I do know one thing I can do regardless: I can always optimize the known
stuff, such as the wrapped integer and float stuff and the Math and
StrictMath classes. That is, instead of actually calling Math.pow() on a
number, the compiler will just compute the value at compile time, since these
are very well known classes and it would be horribly unoptimized to just call
the method anyway. However, the method itself in the class will still be
implemented (Math being as fast as possible while StrictMath is as accurate
as possible). The same can go for the primitive wrapper types too, so stuff
like Long.divideUnsigned() is never really called but is optimized by the
compiler to have the expected effect.

Another thing that could be optimized is native type boxing. There would be a
difference between stuff like `Object o = 2` compared to `foo("%d%n", 2)`.
The compiler will have to detect if a boxed type is even wanted as an object,
although calling another method with a boxed type will require it to be
created. So how would it be possible to make it so that it never does get
created but is passed by value? If the method being invoked never makes the
passed integer visible anywhere, it does not have to create an object for it.
I suppose those objects can retain a very basic form: still have a class
identity but have a flat class type, so that they are basically just the
class identifier and the value they contain. Thus they would have no large
overhead in memory but will still require that they are allocated.

So there will need to be a way to determine stack locality. If an object is
located on the stack in one method and it becomes exposed to outside objects,
the sub-method should be able to move the object from the stack and make it
visible, while filling the former stack area with a special indicator that it
got removed from the stack and placed at a specific region in memory. The
stack would have to be able to be grown similar to alloca in C, and new could
either just add to the stack or create an object in the heap. So new would
need some hints: if the object is just going to be set to a field then it can
be allocated on the heap; if it is never exposed then it could be placed on
the stack. So each allocation hint will have two states, ONSTACK and INHEAP.
The calling class on new will not have any idea if the object violates
something, so if it hints with ONSTACK and it turns out the object exposes
itself everywhere, then it will allocate on the heap instead as if INHEAP
were specified. So code will have to handle the event that an object needs to
be moved from the stack to the heap. Then if circumstances permit, it may be
possible to move the object from the heap into the stack, but that would mean
that it has no references elsewhere, so that would be limited. So object
handling would have to handle cases where it is on the stack and in the heap.
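The ONSTACK/INHEAP promotion above can be modeled as a handle that, when it
escapes its frame, is moved to the heap and leaves a forwarding reference in
the old stack slot. This is a toy model with hypothetical names, not an
actual allocator:

```java
class Alloc {
    enum Where { ONSTACK, INHEAP }

    static class Handle {
        Where where;
        Handle forwardedTo;   // the special indicator left in the stack slot

        Handle(Where hint) {
            this.where = hint;
        }

        /** Called when the object becomes visible outside its frame:
         *  ONSTACK objects are moved to the heap, INHEAP ones stay put. */
        Handle escape() {
            if (where == Where.ONSTACK) {
                Handle moved = new Handle(Where.INHEAP);
                forwardedTo = moved;  // stack slot now points at the heap copy
                return moved;
            }
            return this;
        }
    }
}
```

The hint is exactly that: a hint. Code that follows a handle must be prepared
to chase `forwardedTo`, which is the "handle both cases" cost the note ends
on.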
Having interfaces hash to a slot would be good, but the hash would need to be
really good to not collide. However, the main thing that would be an issue is
if a class decides to implement 65,535 interfaces; that would be very bad
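The hashed-slot idea can be sketched with a collision check, since two
interfaces landing in the same slot must be detected rather than silently
sharing it. A minimal sketch using linear probing, purely illustrative:

```java
class IfaceSlots {
    final String[] slots;

    IfaceSlots(int size) {
        slots = new String[size];
    }

    /** Returns the slot index for an interface name, probing linearly past
     *  collisions; re-placing the same name yields the same slot. */
    int place(String ifaceName) {
        int i = Math.floorMod(ifaceName.hashCode(), slots.length);
        while (slots[i] != null && !slots[i].equals(ifaceName))
            i = (i + 1) % slots.length;   // collision: probe next slot
        slots[i] = ifaceName;
        return i;
    }
}
```

Probing resolves collisions but gives up the constant-offset indexing that
made the per-interface table attractive, which is the tension in the note.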
Another thing the compiler can optimize is the atomic classes which provide
atomic operations. This would be so that two different CPUs, which might see
different memory, can correctly read and write values such that they are
truly atomic per CPU.
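The classes in question are the `java.util.concurrent.atomic` family; whether
or not the compiler later intrinsifies them, their observable contract is the
cross-CPU read-modify-write the note wants. A quick demonstration that
concurrent increments are never lost:

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicDemo {
    /** Runs the given number of threads, each incrementing a shared
     *  AtomicInteger; returns the final count. */
    static int increments(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++)
                    counter.incrementAndGet();   // atomic across CPUs
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return counter.get();
    }
}
```

With a plain `int` and `counter++` the result would be nondeterministic; the
atomic class guarantees the full total.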
Need to go into describing stack operations on the byte code ops.
After much typing, only 18 more opcodes to describe. Then once that is done I
can start implementing a byte code reader so that methods are read and
described. From there I can make super generic optimization and translation
to reduce the number of rewrites needed for every architecture.