assets/developer-notes/stephanie-gawroriski/2014/09/22.mkd

   1 # 2014/09/22
   2
   3 ***DISCLAIMER***: _These notes are from the defunct k8 project which_
   4 _precedes SquirrelJME. The notes for SquirrelJME start on 2016/02/26!_
   5 _The k8 project was effectively a Java SE 8 operating system and as such_
   6 _all of the notes are in the context of that scope. That project is no_
   7 _longer my goal as SquirrelJME is the spiritual successor to it._
   8
   9 ## 01:47
  10
  11 Need to determine when a token ends. I have the validity list, which is known
  12 between old and new states, so there have to be boundary characters which are
  13 used to separate tokens. I can also make every operator a token matching a
  14 specific string. Although a run of connected operators such as + might look
  15 like it should be split into separate tokens, the actual token would be
  16 determined based on the validity of the token before any extra characters
  17 are added.
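
The rule above, where the token is whatever was still valid before the extra character was added, can be sketched as a longest-match scan. This is a minimal illustration rather than the k8 tokenizer itself: the class `OperatorScan`, the method `longestOperator`, and the small operator table are all hypothetical names chosen for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class OperatorScan {
    // Hypothetical operator table; a real one would list every operator.
    static final Set<String> OPERATORS = new HashSet<>(Arrays.asList(
        "+", "++", "+=", "-", "--", "-=", "=", "=="));

    /** Returns the longest operator starting at index from, or null if none. */
    static String longestOperator(String src, int from) {
        String best = null;
        for (int end = from + 1; end <= src.length(); end++) {
            String candidate = src.substring(from, end);
            if (OPERATORS.contains(candidate)) {
                best = candidate; // still valid, keep growing
            } else {
                break; // the extra character made the token invalid
            }
        }
        return best;
    }
}
```

With this table, scanning `a+++b` from index 1 yields `++`, and scanning again from index 3 yields `+`. Stopping at the first invalid candidate only works because every prefix of an operator in this table is itself an operator; a table without that property would need the partial-match idea instead.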
  18
  19 ## 02:03
  20
  21 Even though my code does not do much, it appears there are always valid tokens
  22 and I realize this is probably because of the floating point digits, since
  23 quite literally every part of that is optional and it can match nothing.
  24
  25 ## 02:10
  26
  27 However, for hexadecimal floats that is not a concern because the exponent
  28 marker must always be there for that case.
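
As a quick check of the mandatory exponent marker, the standard `Double.parseDouble` accepts hexadecimal floating-point strings only when the binary exponent (`p`) is present. The sketch below uses it as a stand-in for the tokenizer's own rule; the class `HexFloatDemo` and the method `parses` are hypothetical names.

```java
final class HexFloatDemo {
    /** Returns true if Double.parseDouble accepts the given text. */
    static boolean parses(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

Here `parses("0x1.8p1")` is true (the value is 3.0, since 0x1.8 is 1.5 and the exponent doubles it), while `parses("0x1.8")` is false because the `p` exponent is missing. A decimal literal such as `1.5` needs no exponent at all.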
  29
  30 ## 02:13
  31
  32 Actually it is one of my integer regexes that is matching, because for the
  33 StringBuilder I am appending an integer and not a character, so the sequences
  34 are quite literally always valid.
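
The bug described here is easy to reproduce: `StringBuilder.append(int)` appends the decimal digits of the value, whereas `append(char)` appends the character. A minimal sketch, assuming the character arrives as an `int` (as it does from `Reader.read()`); `AppendDemo` and its method names are hypothetical.

```java
final class AppendDemo {
    /** Bug: the int overload appends the decimal value of the code unit. */
    static String appendAsInt(int read) {
        return new StringBuilder().append(read).toString();
    }

    /** Fix: cast to char so the character itself is appended. */
    static String appendAsChar(int read) {
        return new StringBuilder().append((char) read).toString();
    }
}
```

For the character `+`, `appendAsInt('+')` produces `"43"` while `appendAsChar('+')` produces `"+"`. Since the buggy buffer only ever contains digits, an integer regex would match it no matter what was read.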
  35
  36 ## 02:28
  37
  38 However, some token forms do not follow the normal non-line boundaries, such
  39 as strings or character sequences. So if at least one token is valid and a
  40 boundary is met, then any other tokens where boundaries are not ignored are
  41 tossed away as invalid.
  42
  43 ## 05:42
  44
  45 Actually for strings that is not required because the token would still be
  46 valid even if there is a separation character in the middle of it. The
  47 separation characters are then only used when there are no more valid
  48 tokens. If a token ends just before a separation sequence then it is
  49 stopped and it is considered valid.
  50
  51 ## 06:37
  52
  53 I need to remember to include line and column information in the tokenizer
  54 code.
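
One simple way to carry line and column information is a small counter that is advanced once per consumed character. This is only a sketch under assumptions: positions are 1-based, only `\n` is treated as a line terminator (real Java source also allows `\r` and `\r\n`), and the class name `Position` is hypothetical.

```java
final class Position {
    int line = 1;   // 1-based line of the next character
    int column = 1; // 1-based column of the next character

    /** Advances the position past one consumed character. */
    void advance(char c) {
        if (c == '\n') {
            line++;
            column = 1;
        } else {
            column++;
        }
    }
}
```

A token would then record the position as it was before its first character was consumed.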
  55
  56 ## 06:45
  57
  58 Need to handle ending multi-line comments.
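
Finding the end of a multi-line comment amounts to searching for the closing two-character sequence after the opening one. A minimal sketch, assuming the whole source is available as a `String`; `CommentScan` and `endOfBlockComment` are hypothetical names.

```java
final class CommentScan {
    /**
     * Given the index just past the opening slash-star, returns the index
     * just past the closing terminator, or -1 if the comment never ends.
     */
    static int endOfBlockComment(String src, int bodyStart) {
        int close = src.indexOf("*/", bodyStart);
        return close < 0 ? -1 : close + 2;
    }
}
```

Starting the search at `bodyStart` rather than at the opening sequence matters: in the input `/*/` the star belongs to the opener, so the comment is still unterminated.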
  59
  60 ## 07:12
  61
  62 Multi-line comments are all implemented and I added a bunch of operators, but
  63 now it seems my hexadecimal regex is not correct. I also need to remember to
  64 do stuff that I cannot remember due to being tired. Yes, when I get a token I
  65 need to apply to the type information any extra annotations attached to a
  66 token because that would be very useful. Annotations are a way to peg extra
  67 data without needing to mess up the enumeration or interfaces or have some
  68 kind of ugly lookup table. The regex actually needs a potential increase in
  69 complexity because by the time that "0x" is read it does not comprise a valid
  70 hexadecimal number, so I will need a partial regex which is valid enough. A
  71 partial regex match would then become virtually valid, but not fully valid. A
  72 partial match would never be used for the type of a token but can contain
  73 enough information to stay attached to the regex. This means that it will
  74 share a similar regex but where every part is optional so that it remains in
  75 the valid list.
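
The strict/partial pairing can be sketched with two patterns of the same shape, where the partial one makes each later part optional. The patterns below are illustrative guesses for hexadecimal integers, not the regexes actually used in k8; `HexMatch`, `FULL`, and `PARTIAL` are hypothetical names.

```java
import java.util.regex.Pattern;

final class HexMatch {
    // Strict form: what a finished hexadecimal integer must look like.
    static final Pattern FULL = Pattern.compile("0[xX][0-9a-fA-F]+");

    // Relaxed form: same shape, but every part is optional, so each
    // prefix typed on the way to a full match keeps matching.
    static final Pattern PARTIAL =
        Pattern.compile("(?:0(?:[xX][0-9a-fA-F]*)?)?");

    /** True if the text is a complete hexadecimal integer. */
    static boolean isFull(String s) {
        return FULL.matcher(s).matches();
    }

    /** True if the text could still grow into a hexadecimal integer. */
    static boolean isPartial(String s) {
        return PARTIAL.matcher(s).matches();
    }
}
```

With these, `"0x"` is partial but not full, `"0x1F"` is both, and `"0xg"` is neither, so only the last one would be dropped from the valid list. Only a full match would decide the token type.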
  76
  77 ## 08:06
  78
  79 May need to recreate my floating point regexes because they might be wrong.
  80
  81 ## 19:44
  82
  83 Added all the annotations to the token, made separation tokens possible, and
  84 now working on string literals. And as I predicted my floating point regexes
  85 are not correct because the tokenizer stops on ".fo" in "String.format".
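
The kind of failure described here can be reproduced with an over-permissive pattern in which every component is optional. The pattern below is a hypothetical reconstruction for illustration only, not the actual regex from the tokenizer; `LooseFloatDemo`, `LOOSE`, and `matchedPrefix` are made-up names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LooseFloatDemo {
    // Hypothetical over-permissive pattern: digits, dot, exponent and
    // suffix are all optional, so very little is required to match.
    static final Pattern LOOSE = Pattern.compile(
        "[0-9]*\\.?[0-9]*(?:[eE][+-]?[0-9]*)?[fFdD]?");

    /** Returns the prefix of the input that the loose pattern accepts. */
    static String matchedPrefix(String input) {
        Matcher m = LOOSE.matcher(input);
        return m.lookingAt() ? m.group() : "";
    }
}
```

Against `.format` this pattern accepts `.f`, reading the `f` as a float suffix, and against `.err` it accepts `.e`, reading the `e` as an exponent marker with no digits. Neither contains a single digit, which is the underlying problem.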
  86
  87 ## 20:22
  88
  89 Cannot seem to solve floating point literals with regex without making a
  90 gigantic mess, so I am going to stick to parsing them with a method.
  91
  92 ## 21:01
  93
  94 Floating points getting stuck on "e" in ".err" is due to the exponent.
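
A hand-written scanner can avoid both the `.f` and the `.e` traps by insisting on at least one digit in the mantissa and at least one digit after an exponent marker. The following is a simplified sketch for decimal literals only (no hexadecimal floats, no underscores), not the method from the project; `FloatScan` and `literalLength` are hypothetical names.

```java
final class FloatScan {
    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Length of the decimal float literal at the start of s, or 0. */
    static int literalLength(String s) {
        int i = 0, n = s.length();
        int digits = 0;
        while (i < n && isDigit(s.charAt(i))) { i++; digits++; }
        boolean dot = false;
        if (i < n && s.charAt(i) == '.') {
            dot = true;
            i++;
            while (i < n && isDigit(s.charAt(i))) { i++; digits++; }
        }
        if (digits == 0) {
            return 0; // ".format" and ".err" are rejected here
        }
        boolean exp = false;
        if (i < n && (s.charAt(i) == 'e' || s.charAt(i) == 'E')) {
            int j = i + 1;
            if (j < n && (s.charAt(j) == '+' || s.charAt(j) == '-')) j++;
            int expStart = j;
            while (j < n && isDigit(s.charAt(j))) j++;
            if (j > expStart) { // only an exponent if digits follow
                exp = true;
                i = j;
            }
        }
        boolean suffix = false;
        if (i < n && "fFdD".indexOf(s.charAt(i)) >= 0) {
            suffix = true;
            i++;
        }
        // A bare digit run such as "42" is an integer, not a float.
        return (dot || exp || suffix) ? i : 0;
    }
}
```

With this sketch `.format` and `.err` both yield 0, while `1.5e3f` yields 6 and `.5` yields 2.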
  95
  96 ## 21:21
  97
  98 Now that tokens are pretty much all parsed (although I need to rewrite the hex
  99 floating point), I can begin work on the stage two generator so I must outline
 100 the specified classes which are compiled.
 101
 102 ## 23:22
 103
 104 Will need to do actual handling of tokens and parsing them all.
 105