***DISCLAIMER***: _These notes are from the defunct k8 project which_
_precedes SquirrelJME. The notes for SquirrelJME start on 2016/02/26!_
_The k8 project was effectively a Java SE 8 operating system and as such_
_all of the notes are in the context of that scope. That project is no_
_longer my goal as SquirrelJME is the spiritual successor to it._

I need to determine when a token ends. I have the validity list, which is known
between the old and new states, so there have to be boundary characters which
are used to separate some forms of tokens. I can also make every operator a
token of a specific string. Although it may look like multiple operators such
as + which are connected to each other might be deemed as separate, the actual
token would be determined based on the validity of the token before any extra

Even though my code does not do much, it appears there are always valid tokens,
and I realize this is probably because of the floating point digits: quite
literally every part of that form is optional, so it can include nothing.

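
A minimal sketch of that effect, assuming a pattern shaped like the hypothetical `ALL_OPTIONAL` below (the actual regex is not recorded in these notes). Because every piece may be absent, the empty string is a full match. The `NEEDS_DIGIT` variant is one possible way to demand at least one digit; note that it still accepts a plain run of digits.

```java
import java.util.regex.Pattern;

final class OptionalFloatDemo {
    // Hypothetical pattern where every part is optional.
    static final Pattern ALL_OPTIONAL = Pattern.compile(
        "[0-9]*\\.?[0-9]*([eE][+-]?[0-9]+)?[fFdD]?");

    // Hypothetical variant requiring a digit on one side of the dot.
    static final Pattern NEEDS_DIGIT = Pattern.compile(
        "([0-9]+\\.?[0-9]*|\\.[0-9]+)([eE][+-]?[0-9]+)?[fFdD]?");

    // True if the whole string matches the given pattern.
    static boolean looksLikeFloat(Pattern form, String text) {
        return form.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeFloat(ALL_OPTIONAL, ""));      // true
        System.out.println(looksLikeFloat(NEEDS_DIGIT, ""));       // false
        System.out.println(looksLikeFloat(NEEDS_DIGIT, "1.5e3f")); // true
    }
}
```
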
However, for hexafloats it is not needed because the exponent marker must
always be there for that case.

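
To illustrate the mandatory exponent, here is a rough sketch of a hexadecimal floating point pattern. `HEX_FLOAT` and `isHexFloat` are names made up for this example. The `Double.parseDouble` call is only there to show that the standard library applies the same rule to hexadecimal strings.

```java
import java.util.regex.Pattern;

final class HexFloatDemo {
    // Hypothetical pattern: the binary exponent ("p" or "P" followed by
    // digits) is not optional, so the empty string can never match.
    static final Pattern HEX_FLOAT = Pattern.compile(
        "0[xX]([0-9a-fA-F]+\\.?[0-9a-fA-F]*|\\.[0-9a-fA-F]+)"
        + "[pP][+-]?[0-9]+[fFdD]?");

    // True if the whole string is a hexadecimal floating point literal.
    static boolean isHexFloat(String text) {
        return HEX_FLOAT.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isHexFloat("0x1.8p1"));          // true
        System.out.println(isHexFloat("0x1.8"));            // false
        System.out.println(Double.parseDouble("0x1.8p1"));  // 3.0
    }
}
```
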
Actually one of my integer regexes is working because for the StringBuilder I
am appending an integer and not a character, so the sequences are quite literally

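
A small sketch of this pitfall, assuming the characters arrive as `int` values (as they would from `Reader.read()`; that is an assumption, since the reading code is not shown here). `StringBuilder.append(int)` appends the decimal digits of the number, so an integer regex will presumably match whatever was read.

```java
final class AppendDemo {
    // Appends each character as an int, which selects append(int) and
    // stores the decimal digits of the character code.
    static String appendAsInts(String input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            int c = input.charAt(i); // widened to int, like Reader.read()
            sb.append(c);
        }
        return sb.toString();
    }

    // Casts back to char first, which selects append(char).
    static String appendAsChars(String input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            int c = input.charAt(i);
            sb.append((char)c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendAsInts("ab"));  // 9798
        System.out.println(appendAsChars("ab")); // ab
    }
}
```
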
However, some token forms do not follow normal non-line boundaries, such as
strings or character sequences. So if at least one token is valid and a
boundary is met, then any other tokens where boundaries are not ignored are
tossed away as invalid.

Actually, for strings that is not required because the token would still be
valid even if there is a separation character in the middle of it. The
separation characters are then only used in this case when there are no more
valid tokens. If a token ends just before a separation sequence, then it is
stopped and considered valid.

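
The idea can be sketched as a longest-valid-match loop. This is a simplified stand-in rather than the k8 tokenizer itself: the token forms in `FORMS` are hypothetical, string escapes are ignored, and it uses `Matcher.lookingAt()` instead of growing a candidate one character at a time. Separation characters are only skipped between tokens, so a space inside a string literal does not end the token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LongestMatchDemo {
    // Hypothetical token forms; every operator is its own fixed string.
    static final List<Pattern> FORMS = new ArrayList<>();
    static {
        FORMS.add(Pattern.compile("\"[^\"\\n]*\""));          // string
        FORMS.add(Pattern.compile("[A-Za-z_][A-Za-z_0-9]*")); // identifier
        FORMS.add(Pattern.compile("[0-9]+"));                 // integer
        for (String op : new String[] {"+", "++", "+=", "=", ";"})
            FORMS.add(Pattern.compile(Pattern.quote(op)));
    }

    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        int at = 0;
        while (at < src.length()) {
            // Separation characters only matter between tokens.
            if (Character.isWhitespace(src.charAt(at))) {
                at++;
                continue;
            }
            // Keep the longest form that is valid at this position.
            int best = 0;
            for (Pattern form : FORMS) {
                Matcher m = form.matcher(src).region(at, src.length());
                if (m.lookingAt() && m.end() - at > best)
                    best = m.end() - at;
            }
            if (best == 0)
                throw new IllegalArgumentException("No token at " + at);
            out.add(src.substring(at, at + best));
            at += best;
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [a, ++, +, b, =, "x y", ;]
        System.out.println(tokenize("a+++b = \"x y\";"));
    }
}
```

Here `+++` splits into `++` and `+` because the longer operator is still valid when the second `+` is read, which is the behavior described for connected operators.
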
I need to remember to include line and column information in the tokenizer

Need to handle ending multi-line comments.

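
One possible way to find the end of a multi-line comment, as a sketch only; `endOfComment` is a hypothetical helper and not part of the tokenizer described here. The search starts after the opening marker so that `/*/` is not mistaken for a complete comment.

```java
final class CommentDemo {
    // Returns the index just past the closing marker of the comment that
    // opens at start, or -1 if the comment is never closed.
    static int endOfComment(String src, int start) {
        if (!src.startsWith("/*", start))
            throw new IllegalArgumentException("Not a comment opening");
        // Skip the two opening characters before searching.
        int close = src.indexOf("*/", start + 2);
        return (close < 0 ? -1 : close + 2);
    }

    public static void main(String[] args) {
        System.out.println(endOfComment("/* a\n b */x", 0)); // 10
        System.out.println(endOfComment("/*/ x", 0));        // -1
    }
}
```
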
Multi-line comments are all implemented and I added a bunch of operators, but
now it seems my hexadecimal regex is not correct. I also need to remember to do
stuff that I cannot remember due to being tired. Yes, when I get a token I need
to apply to the type information any extra annotations attached to a token
because that would be very useful. Annotations are a way to peg extra data
without needing to mess up the enumeration or interfaces or have some kind of
ugly lookup table. The regex handling actually needs a potential increase in
complexity because by the time that "0x" is read it does not comprise a valid
hexadecimal number, so I will need a partial regex which is valid enough. So a
partial regex match would then become virtually valid, but not fully valid. A
partial match would never be used for the type of a token but can contain
enough information to stay attached to the regex. This means that it will
share a similar regex but where every part is optional so that it remains in

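
As an alternative to maintaining a second, all-optional regex, `java.util.regex.Matcher.hitEnd()` can report the "virtually valid" state directly. This is a swapped-in technique and not what these notes settled on; `HEX_INT`, `State`, and `classify` are hypothetical names for the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PartialMatchDemo {
    enum State { FULL, PARTIAL, INVALID }

    // Hypothetical hexadecimal integer form.
    static final Pattern HEX_INT =
        Pattern.compile("0[xX][0-9a-fA-F]+[lL]?");

    // Classifies the characters read so far against a token form.
    static State classify(Pattern form, CharSequence soFar) {
        Matcher m = form.matcher(soFar);
        if (m.matches())
            return State.FULL;
        // hitEnd() is true when the matcher ran out of input, meaning
        // more characters could still turn this into a match.
        return (m.hitEnd() ? State.PARTIAL : State.INVALID);
    }

    public static void main(String[] args) {
        System.out.println(classify(HEX_INT, "0x"));   // PARTIAL
        System.out.println(classify(HEX_INT, "0x1F")); // FULL
        System.out.println(classify(HEX_INT, "0xZ"));  // INVALID
    }
}
```

A `PARTIAL` result keeps the form alive while reading continues, but it would never be used as the type of a finished token.
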
May need to recreate my floating point regexes because they might be wrong.

Added all the annotations to the token, made separation tokens possible, and
now working on string literals. And as I predicted, my floating point regexes
are not correct because the tokenizer stops on ".fo" in "String.format".

Cannot seem to solve floating point literals with regex without making a
gigantic mess, so I am going to stick to method parsing of it.

Floating points getting stuck on "e" in ".err" is due to the exponent.

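
A rough sketch of what method-based parsing of decimal floating point literals could look like. It is not the k8 implementation; `floatLength` and `digits` are hypothetical helpers, and underscores in digits and hexadecimal floats are left out. At least one digit is required and the exponent is only consumed when digits follow it, so ".fo" and ".err" are not treated as the start of a literal.

```java
final class FloatScanDemo {
    // Returns the length of the decimal floating point literal at the
    // start of s, or 0 if there is none.
    static int floatLength(String s) {
        int n = s.length();
        int whole = digits(s, 0);
        int i = whole;
        boolean dot = false;
        int frac = 0;
        if (i < n && s.charAt(i) == '.') {
            dot = true;
            frac = digits(s, i + 1);
            i += 1 + frac;
        }
        // A digit is needed on at least one side of the dot.
        if (whole + frac == 0)
            return 0;
        boolean exp = false;
        if (i < n && (s.charAt(i) == 'e' || s.charAt(i) == 'E')) {
            int j = i + 1;
            if (j < n && (s.charAt(j) == '+' || s.charAt(j) == '-'))
                j++;
            int expDigits = digits(s, j);
            // Only consume the exponent when digits follow the marker.
            if (expDigits > 0) {
                exp = true;
                i = j + expDigits;
            }
        }
        boolean suffix = false;
        if (i < n && "fFdD".indexOf(s.charAt(i)) >= 0) {
            suffix = true;
            i++;
        }
        // A bare run of digits is an integer, not a floating point literal.
        return (dot || exp || suffix ? i : 0);
    }

    // Counts consecutive decimal digits starting at index from.
    static int digits(String s, int from) {
        int i = from;
        while (i < s.length() && s.charAt(i) >= '0' && s.charAt(i) <= '9')
            i++;
        return i - from;
    }

    public static void main(String[] args) {
        System.out.println(floatLength(".err"));    // 0
        System.out.println(floatLength(".format")); // 0
        System.out.println(floatLength("1.5e3f"));  // 6
    }
}
```
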
Now that tokens are pretty much all parsed (although I need to rewrite the hex
floating point), I can begin work on the stage two generator, so I must outline
the specified classes which are compiled.

Will need to do actual handling of tokens and parsing them all.