flat assembler g
User Manual


This document describes the syntax of the flat assembler g language, with basic
examples. It was written with an assumption that it would be read sequentially
and at any moment it uses only the concepts and constructions that have been
introduced earlier. However, it should be possible to jump right to the
section that interests the reader, and then go back to earlier parts only when
it is needed in order to better understand the later ones.


Table of contents

0. Executing the assembler
1. Fundamental syntax rules
2. Symbol identifiers
3. Basic symbol definitions
4. Expression values
5. Symbol classes
6. Generating data
7. Conditional assembly
8. Macroinstructions
9. Labeled macroinstructions
10. Symbolic variables and recognition context
11. Repeating blocks of instructions
12. Matching parameters
13. Output areas
14. Source and output control
15. CALM instructions


0. Executing the assembler

To start assembly from the command line it is necessary to provide at least one
parameter, the name of a source file, and optionally a second one - the
name of the destination file. If the assembly is successful, the generated
output is written into the destination and a short summary is displayed,
otherwise information about the errors is shown. The maximum number of presented
errors can be controlled with an additional "-e" switch (by default no more than
one error is presented). The "-p" switch controls the maximum number of passes
the assembler is going to attempt. This limit is by default set to 100.
The "-r" switch sets the limit of the recursion stack, that is the
maximum allowed depth of entering macroinstructions and including additional
source files. The "-v" switch can enable showing all the lines from this stack
when reporting an error (by default the assembler tries to select only the
lines that are likely the most informative, but this simple heuristic may not
always be correct). If the "-v" switch is used with the value 2, it additionally
causes all the messages displayed by commands from the source text to be shown
in real time (in every consecutive pass). The "-i" switch inserts any given
command at the beginning of the processed source.
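  As an illustration, the switches described above might be combined as in the
following command lines. This is only a sketch: it assumes that the executable
is available under the name "fasmg", the file names are hypothetical, and the
quoting needed for the argument of "-i" depends on the shell in use:

        fasmg source.asm
        fasmg source.asm output.bin
        fasmg source.asm output.bin -e 10 -p 200 -v 1
        fasmg source.asm output.bin -i "debug = 1"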


1. Fundamental syntax rules

Every command in the assembly language occupies a single line of text.
If a line contains the semicolon character, everything from that character up
to the end of the line is treated as a comment and ignored by the assembler.
The main part of a line (i.e. excluding the comment) may end with the backslash
character and in such a case the next line from the source text is going to be
appended to this one. This allows splitting any command across multiple lines,
when needed. From now on we will refer to a source line as an entity obtained
by stripping comments and joining the lines of text connected with backslash
characters.
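  A minimal sketch of both rules, using the "=" command that is introduced in
the next section (the name "total" is arbitrary):

        total = 1 + \   ; this comment is stripped before the lines are joined
                2 + \
                3

These three lines of text should form a single source line, equivalent to
"total = 1 + 2 + 3".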
  The text of a source line is divided into syntactical units called tokens.
There are a number of special characters that become separate tokens all by
themselves. Any of the characters listed below is such a syntactical unit:

    +-/*=<>()[]{}:?!,.|&~#`\

Any contiguous (i.e. not broken by whitespace) sequence of characters other than
the above ones becomes a single token, which can be a name or a number.
The exception to this rule is when a sequence starts with the single or the double
quote character. This defines a quoted string and it may contain any of the
special characters, whitespace and even semicolons, as it ends only when the
same character that was used to start it is encountered. The quotes that are
used to enclose the string do not become a part of the string themselves.
If a string needs to contain the same character that is used to
enclose it, the character needs to be doubled inside the string - only one copy
of the character will become a part of the string, and the sequence will
continue.
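  For illustration, the doubling rule might be used as follows (the symbol
names are arbitrary, and the "=" command is introduced in the next section):

        s1 = 'it''s'            ; the string contains: it's
        s2 = "say ""hi"""       ; the string contains: say "hi"
        s3 = "it's; fine"       ; the other quote and the semicolon need no doubling

In every case the enclosing quotes are not a part of the resulting string.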
  Numbers are distinguished from names by the fact that they either
begin with a decimal digit, or with the "$" character followed by any hexadecimal
digit. This means that a token can be considered numeric even when it is not a
valid number. To be a correct one it must be one of the following: a decimal
number (optionally with the letter "d" attached at the end), a binary number
followed by the letter "b", an octal number followed by the letter "o" or "q", or a
hexadecimal number either prepended with "$" or "0x", or followed by the character
"h". Because the first digit of a hexadecimal number can be a letter, it may be
necessary to prepend it with the digit zero in order to make it recognizable as a
number. For example, "0Ah" is a valid number, while "Ah" is just a name.
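  As a sketch of these notations, each of the following definitions (with
arbitrary symbol names, and using the "=" command introduced in the next
section) gives its symbol the same value of ten:

        n1 = 10
        n2 = 10d
        n3 = 1010b
        n4 = 12o
        n5 = 12q
        n6 = $A
        n7 = 0xA
        n8 = 0Ah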


2. Symbol identifiers

Any name can become a defined symbol by having some meaning (a value) assigned to it.
One of the simplest methods of creating a symbol with a given value is to use
the "=" command:

        a = 1

The ":" command defines a label, that is a symbol with a value equal to the
current address in the generated output. At the beginning of the source text this
address is always zero, so when the following two commands are the first ones
in the source file, they define symbols that have identical values:

        first:
        second = 0

Labels defined with the ":" command are special constructs in assembly language,
since they allow any other command (including another label definition) to
follow on the same line. This is the only kind of command that allows this.
  What comes before the ":" or "=" character in such a definition is a symbol
identifier. It can be a simple name, like in the above samples, but it may
also contain some additional modifiers, described below.
  When a name in a symbol definition has the "?" character appended to it (with
no whitespace between them), the symbol is case-insensitive (otherwise it would
be defined as case-sensitive). This means that the value of such a
symbol may be referred to (as in an expression to the right of the "=" character)
by the name being any variant of the original name that differs only in the case
of letters. Only the cases of the 26 letters of the English alphabet are
allowed to differ, though.
  It is possible to define a case-sensitive symbol that clashes with a
case-insensitive one. Then the case-sensitive symbol takes precedence and the more
general one is used only when the corresponding case-sensitive symbol is not
defined. This can be remedied by using the "?" modifier, since it always means
that the name followed by it refers to the case-insensitive symbol.

        tester? = 0
        tester = 1
        TESTER = 2
        x = tester       ; x = 1
        y = Tester       ; y = 0
        z = TESTER       ; z = 2
        t = tester?      ; t = 0

  Every symbol has its own namespace of descendants, called the child namespace.
When two names are connected with a dot (with no whitespace in between), such an
identifier refers to an entity named by the second one in the namespace of
descendants of the symbol specified by the first one. This operation can be
repeated many times within a single identifier, making it possible to refer to
descendants of descendants in a chain of any length.

        space:
        space.x = 1
        space.y = 2
        space.color:
        space.color.r = 0
        space.color.g = 0
        space.color.b = 0

Any of the names in such a chain may optionally be followed by the "?" character
to mark that it refers to a case-insensitive symbol. If "?" is inserted in
the middle of a name (effectively splitting it into separate tokens) such an
identifier is considered a syntactical error.
  When an identifier starts with a dot (in other words: when the name of the parent
symbol is empty), it refers to the symbol in the namespace of the most recent
regular label defined before the current line. This allows rewriting the above
sample like this:

        space:
        .x = 1
        .y = 2
        .color:
        .color.r = 0
        .color.g = 0
        .color.b = 0

After the "space" label is defined, it becomes the most recently defined normal
label, so the following ".x" refers to the "space.x" symbol and then the ".color"
refers to "space.color".
  The "namespace" command followed by a symbol identifier changes the base
namespace for a section of source text. It must be paired with the
"end namespace" command later in the source to mark the end of such a block.
This can be used to again rewrite the above sample in a different way:

        space:
        namespace space
                x = 1
                y = 2
                color:
                .r = 0
                .g = 0
                .b = 0
        end namespace

When a name is not preceded by a dot, and as such does not explicitly
specify in which namespace the symbol resides, the assembler looks for a defined
symbol in the current namespace, and if none is found, in the consecutive namespaces
of parent symbols, starting from the namespace containing the parent symbol of the
current namespace. If no defined symbol with such a name is found, it is assumed
that the name refers to the symbol in the current namespace (and unless there is
a "?" character after such a name, it is assumed that the symbol is case-sensitive).
A definition that does not specify the namespace where the new symbol should be
created always makes a new symbol in the current base namespace.

        global = 0
        regional = 1
        namespace regional
                regional = 2            ; regional.regional = 2
                x = global              ; regional.x = 0
                regional.x = regional   ; regional.regional.x = 2
                global.x = global       ; global.x = 0
        end namespace

The comments in the above sample show equivalent definitions with respect
to the original base namespace. Note that when a name is used to specify the
namespace, the assembler looks for a defined symbol with such a name to look up in
its namespace, but when it is a name of a symbol to be defined, it is always
created within the current base namespace.
  When the final dot of an identifier is not followed by any name, it refers
to the parent symbol of the namespace that would be searched for a symbol if
there was a name after this dot. Adding such a dot at the end of an identifier may
appear redundant, but it can be used to alter the way the definition of a symbol
works, because it forces the assembler to look for an already existing symbol that
it can alter instead of simply creating a new one in the current namespace.
For instance, if in the fourth line of the previous example "regional." was put
in place of "regional", it would rewrite the value of the original "regional"
symbol instead of making a new symbol in the child namespace. Similarly,
a definition formed this way may assign a new value to a symbol regardless of
whether it was previously defined as case-insensitive or not.
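  The first of these cases can be sketched by altering the previous example
(only the relevant lines are kept, and the assigned value is arbitrary):

        global = 0
        regional = 1
        namespace regional
                regional. = 2   ; assigns 2 to the original "regional" symbol
        end namespace

Without the final dot this line would instead define "regional.regional".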
  If an identifier is just a single dot, by the above rules it refers to the most
recent label that did not start with a dot. This can be applied to rewrite
the earlier example in yet another way:

        space:
        namespace .
                x = 1
                y = 2
                color:
                namespace .
                        r = 0
                        g = 0
                        b = 0
                end namespace
        end namespace

It also demonstrates how namespace sections can be nested one within another.
  The "#" may be inserted anywhere inside an identifier without changing its
meaning. When "#" is the only character separating two name tokens, it causes
them to be interpreted as a single name formed by concatenating the tokens.

        variable = 1
        varia#ble = var#iable + 2       ; variable = 3

This can also be applied to numbers.
  Inside a block defined with "namespace" there is initially no label that would
be considered the base for identifiers starting with a dot (however the label that
served this purpose earlier is brought back to use after "end namespace").
A similar thing also happens at the beginning of the source text, before any
label has been defined. This is connected to a couple of additional rules
concerning the use of dots in identifiers.
  When an identifier starts with a dot, but there is no label that would be
a parent for it, the identifier refers to the descendant of a special symbol
that resides in the current namespace but has no name. If an identifier starts
with a sequence of two or more dots, the identifier refers to the descendant of
a similar unnamed symbol, but it is a distinct one for any given number of dots.
While the namespace accessed with a single starting dot changes every time a new
regular label is defined, the special namespace accessed with two or more dots
at the beginning of an identifier remains the same:

        first:
                .child = 1
                ..other = 0
        second:
                .child = 2
                ..another = ..other

In this example the meaning of the ".child" identifier changes from place to
place, but the "..other" identifier means the same everywhere.
  When two names inside an identifier are connected with a sequence of two or
more dots, the identifier refers to the descendant of such a special unnamed
symbol in the namespace specified by the identifier before that sequence of
dots. The unnamed child namespace is chosen depending on the number of dots and
in this case the number of required dots is increased by one. The following
example demonstrates the two methods of identifying such a symbol:

        namespace base
                ..other = 1
        end namespace

        result = base.#..other

The "#" character has been inserted into the last identifier for better
readability, but the plain sequence of three dots would do the same.
  The unnamed symbol that hosts a special namespace can itself be accessed
when an identifier ends with a sequence of two or more dots - thanks to the
rule that an identifier which ends in a dot refers to the parent symbol of
the namespace that would be accessed if there was a name after this dot. So
in the context of the previous example the "base..." (or "base.#..") would
refer to the unnamed parent of the namespace where the "other" symbol resides,
and it would be the same symbol as identified by a simple ".." inside the
namespace of the "base" symbol.
  Any identifier can be prepended with a "?" character and such a modifier has
an effect when it is used in a context where the identifier could mean something
different than a label or variable to be defined. This modifier then
suppresses any other interpretation. For example, an identifier starting with "?"
is not going to be treated as an instruction, even if it is the first symbol
on the line. This can be used to define a variable that shares a name with
an existing command:

        ?namespace = 0

If such a modified identifier is used in a place where it is evaluated and not
defined, it still refers to the same symbol it would refer to in a definition.
Therefore, unless the identifier also uses a dot, it always refers to a symbol
in the current namespace.
  A number can be used in the role of a name inside an identifier, but not when
it is placed at the beginning, because then it is considered a literal value.
This restriction may also be bypassed by prepending an identifier with "?".


3. Basic symbol definitions

When a symbol is defined as a label, it must be the only definition of
this symbol in the entire source. A value that is assigned to the symbol this way
can be accessed from every place in the source, even before the label is actually
defined. When a symbol is used before it is defined (this is often called
forward-referencing) the assembler tries to correctly predict the value of
the symbol by doing multiple passes over the source text. Only when all
predictions prove to be correct does the assembler generate the final output.
  This kind of symbol, which can only be defined once and thus has a universal
value that can always be forward-referenced, is called a constant. All labels
are constants.
  When a symbol is defined with a "=" command, it may have multiple definitions
of this kind. Such a symbol is called a variable and when it is used, the value
from its latest definition is accessed. A symbol defined with such a command may
also be forward-referenced, but only when it is defined exactly once in the entire
source and as such has a single unambiguous value.

        a = 1           ; a = 1
        a = a + 1       ; a = 2
        a = b + 1       ; a = 3
        b = 2

  A special case of forward-referencing is self-referencing, when the value
of a symbol is used in its own definition. The assembly of such a construct is
successful only when the assembler is able to find a value that is stable under
such evaluation, effectively solving an equation. But due to the simplicity
of the resolving algorithm based on predictions, a solution may not be found even
when it exists.

        x = (x-1)*(x+2)/2-2*(x+1)       ; x = 6 or x = -1

  The ":=" defines a constant value. It may be used instead of "=" to
ensure that the given symbol is defined exactly once and that it can be
forward-referenced.
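  A short sketch of such a definition, with hypothetical symbol names:

        half = size / 2         ; forward reference, half = 8
        size := 16

A second definition of "size" placed anywhere in the source, such as
"size := 32", would be signaled as an error.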
  The "=:" defines a variable symbol like "=", but it differs in how
it treats the previous value (when one exists). While "=" discards the
previous value, "=:" preserves it so it can later be brought back with the
"restore" command:

        a = 1
        a =: 2          ; preserves a = 1
        a = 3           ; discards a = 2 and replaces it with a = 3
        restore a       ; brings back a = 1

A "restore" may be followed by multiple symbol identifiers separated with
commas, and it discards the latest definition of every one of them. It is not
considered an error to use "restore" with a symbol that has no active
definition (either because it was never defined or because all of its
definitions were already discarded earlier). If a symbol is treated with the
"restore" command, it becomes a variable and can never be forward-referenced.
For this reason "restore" cannot be applied to constants.
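  For example (a sketch with arbitrary names and values):

        a = 1
        b = 2
        a =: 10
        b =: 20
        restore a, b    ; brings back a = 1 and b = 2
        restore c       ; accepted, even though "c" was never defined

After these commands "a" and "b" are variables and can no longer be
forward-referenced.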
  The "label" keyword followed by a symbol identifier is an alternative way
of defining a label. In this basic form it is equivalent to a definition made
with ":", but it occupies an entire line. However, with this command it is
possible to provide more settings for the defined label. The identifier may
be optionally followed by the ":" token and then an additional value to be
associated with this label (usually denoting the size of the labeled entity).
The assembler has a number of built-in constants defining various sizes for
this purpose, but this value can also be provided as a plain number.

        label character:byte
        label char:1

The ":" character may be omitted in favor of plain whitespace, but it is
recommended for clarity. After an identifier and an optional size, the "at"
keyword may follow and then a value that should be assigned to the label instead
of the current address.

        label wchar:word at char

  The built-in size constants are equivalent to the following set of
definitions:

        byte? = 1       ; 8 bits
        word? = 2       ; 16 bits
        dword? = 4      ; 32 bits
        fword? = 6      ; 48 bits
        pword? = 6      ; 48 bits
        qword? = 8      ; 64 bits
        tbyte? = 10     ; 80 bits
        tword? = 10     ; 80 bits
        dqword? = 16    ; 128 bits
        xword? = 16     ; 128 bits
        qqword? = 32    ; 256 bits
        yword? = 32     ; 256 bits
        dqqword? = 64   ; 512 bits
        zword? = 64     ; 512 bits

  The "element" keyword followed by a symbol identifier defines a special
constant that has no fixed value and can be used as a variable in linear
polynomials. The identifier may be optionally followed by the ":" token and
then a value to be associated with this symbol, called the metadata of the
element.

        element A
        element B:1

  The metadata assigned to a symbol can be extracted with a special operator,
defined in the next section.


4. Expression values

In every construction described so far where a value of some kind was
provided, like after the "=" command or after the "at" keyword, it could be
a literal value (a number or a quoted string) or a symbol identifier.
A value can also be specified through an expression containing built-in
operators.
  The "+", "-" and "*" perform standard arithmetic operations on integers
("+" and "-" can also be used in a unary form - with only one argument).
"/" and "mod" perform division with remainder, giving a quotient or a remainder
respectively. Of these arithmetic operators "mod" has the highest precedence
(it is calculated first), "*" and "/" come next, while "+" and "-" are evaluated
last (even in their unary variants). Operators with the same precedence are
evaluated from left to right. Parentheses can be used to enclose sub-expressions
when a different order of operations is required.
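  These rules can be sketched with a few definitions (the symbol names are
arbitrary and the comments show the values that the rules imply):

        a = 2 + 3 * 4           ; a = 14
        b = (2 + 3) * 4         ; b = 20
        c = 17 / 5              ; c = 3
        d = 17 mod 5            ; d = 2
        e = 6 * 17 mod 5        ; e = 12, evaluated as 6 * (17 mod 5)
        f = 20 / 2 / 5          ; f = 2, evaluated as (20 / 2) / 5

Parentheses are needed only in the second line; the other results follow from
the precedence rules alone.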
  The "xor", "and" and "or" perform bitwise operations on numbers. "xor" is
addition of bits (exclusive or), "and" is multiplication of bits, and "or" is
inclusive or (logical disjunction). These operators have higher precedence
than any arithmetic operators.
  The "shl" and "shr" perform bit-shifting of the first argument by the amount
of bits specified by the second one. "shl" shifts bits left (towards the higher
powers of two), while "shr" shifts bits right (towards zero), dropping bits that
fall into the fractional range. These operators have higher precedence than other
binary bitwise operations.
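  A few sample definitions, given as a sketch with arbitrary symbol names:

        a = 1100b xor 1010b     ; a = 0110b
        b = 1100b and 1010b     ; b = 1000b
        c = 1100b or 1010b      ; c = 1110b
        d = 1 shl 4             ; d = 16
        e = 255 shr 4           ; e = 15
        f = 1 + 2 shl 3         ; f = 17, evaluated as 1 + (2 shl 3)
        g = 6 and 3 + 1         ; g = 3, evaluated as (6 and 3) + 1

The last two lines show that the bitwise and shifting operators are evaluated
before the arithmetic ones.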
  The "not", "bsf" and "bsr" are unary operators with even higher precedence.
"not" inverts all the bits of a number, while "bsf" and "bsr" search for the
lowest or highest set bit respectively, and give the index of that bit as a
result.
  All the operations on numbers are performed as if they were done on the
infinite 2-adic representations of those numbers. For example the "bsr" with a
negative number as an argument gives no valid result, since such a number has an
infinite chain of set bits extending towards infinity and as such contains no
highest set bit (this is signaled as an error).
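  The following sketch (with arbitrary symbol names) shows these operators on
both a positive and a negative number:

        a = bsf 40      ; a = 3, since 40 = 101000b
        b = bsr 40      ; b = 5
        c = not 0       ; c = -1, a number with all the bits set
        d = bsf (-8)    ; d = 3, the lowest set bit of ...11111000b

An attempt to evaluate "bsr (-8)" would be signaled as an error, because a
negative number has no highest set bit.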
  The "bswap" operator creates a string of bytes containing the
representation of a number in a reverse byte order (big endian). The second
argument to this operator should be the length in bytes of the required string.
This operator has the same precedence as the "shl" and "shr" operators.
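  For illustration (a sketch with arbitrary symbol names):

        a = 12345678h bswap 4   ; a string of the bytes 12h, 34h, 56h, 78h
        b = 1 bswap 2           ; a string of the bytes 0, 1

The comments list the bytes in the order in which they are placed in the
resulting string.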
  When a string value is used as an argument to any of the operations on
numbers, it is treated as a sequence of bits and automatically converted into
a positive number (extended with zero bits towards infinity). The
consecutive characters of a string correspond to the higher and higher bits of a
number.
  To convert a number back to a string, the "string" unary operator may be
used. This operator has the lowest possible precedence, so when it precedes
an expression, all of it is evaluated prior to the conversion. When conversion
in the opposite direction is needed, a simple unary "+" is enough to make a string
become a number.
  The length of a string may be obtained with the "lengthof" unary operator. This
operator can only be applied to a string and it is one of the operators with the
highest precedence.
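  The conversions in both directions can be sketched as follows (the symbol
names are arbitrary):

        a = +'AB'               ; a = 4241h, "A" occupies the lowest bits
        b = 'A' + 1             ; b = 42h
        c = string 'A' + 1      ; c = 'B', the addition is evaluated first
        d = string 4241h        ; d = 'AB'
        e = lengthof 'AB'       ; e = 2

The values in the comments assume that the source text uses an
ASCII-compatible encoding.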
  When a symbol defined with the "element" command is used in an expression, the
result may be a linear polynomial in a variable represented by the symbol.
Only simple arithmetic operations are allowed on the terms of a polynomial,
and it must stay linear - so, for example, it is only allowed to multiply a
polynomial by a number, but not by another polynomial.
  There are a few operators with high precedence that extract information
about the terms of a linear polynomial. The polynomial should come as the first
argument, and the index of the term as the second one. The "element" operator
extracts the variable of a polynomial term (with the coefficient of one), the
"scale" operator extracts the coefficient (a number by which the variable is
multiplied) and the "metadata" operator gives back the metadata associated with
the variable.
  When the second argument is an index higher than the index of the last term
of the polynomial, all three operators return zero. When the second argument
is zero, "element" and "scale" give information about the constant term -
"element" returns numeric 1 and "scale" returns the value of the constant term.

        element A
        linpoly = A + A + 3
        vterm = linpoly scale 1 * linpoly element 1     ; vterm = 2 * A
        cterm = linpoly scale 0 * linpoly element 0     ; cterm = 3 * 1

  The "metadata" operator with an index of zero returns the size that is associated
with the first argument. This value is definite only when the first argument is
a symbol that has a size associated with it (or an arithmetic expression
that contains such a symbol), otherwise it is zero. There exists an additional
unary operator "sizeof", which gives the same value as "metadata 0".

        label table : 256
        length = sizeof table   ; length = 256

  The "elementof", "scaleof" and "metadataof" are variants of "element", "scale"
and "metadata" operators with the opposite order of arguments. Therefore when
"sizeof" is used in an expression it is equivalent to writing "0 metadataof" in
its place. These operators have even higher precedence than their counterparts
and are right-associative.
  The order of the terms of the linear polynomial depends on the way in which the
value was constructed. Every arithmetic operation preserves the order of the terms
in the first argument, and the terms that were not present in the first argument
are attached at the end in the same order in which they occurred in the second
argument. This order only matters when extracting terms with appropriate operators.
  The "elementsof" is another unary operator of the highest precedence; it
counts the number of variable terms of a linear polynomial.
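  A sketch that combines these operators on a polynomial with two variables
(all the names and values are arbitrary):

        element A
        element B:4
        p = 3 + 2*A - B
        count = elementsof p    ; count = 2
        s1 = p scale 1          ; s1 = 2
        e2 = p element 2        ; e2 = B
        s2 = 2 scaleof p        ; s2 = -1
        m2 = p metadata 2       ; m2 = 4
        s3 = p scale 3          ; s3 = 0, there is no third variable term

The indexes follow the order in which the variables first appeared when the
value was constructed, so "A" is the first term and "B" is the second one.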
  An expression may also contain a literal value that defines a floating-point
number. Such a number must be in decimal notation; it may contain the "." character
as a decimal mark and may be followed by the "e" character and then a decimal
value of the exponent (optionally preceded by "+" or "-" to mark the sign of
the exponent). When "." or "e" is present, it must be followed by at least
one digit. The "f" character can be appended at the end of such a literal value.
If a number contains neither "." nor "e", the final "f" is the only way to
ensure that it is treated as floating-point and not as a simple decimal
integer.
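  Some sample literals, as a sketch with arbitrary symbol names:

        f1 = 1.5
        f2 = 1e3        ; 1000.0
        f3 = 2.5e-3     ; 0.0025
        f4 = 10f        ; floating-point 10.0
        i1 = 10         ; a plain decimal integer

Only the last definition creates an integer; all the others create
floating-point values.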
  The floating-point numbers are handled by the assembler in binary form.
Their range and precision are at least as high as they are in the longest
floating-point format that the assembler is able to produce in the output.
  Basic arithmetic operations are allowed to have a floating-point
number as any of the arguments, but then none of the arguments may contain
non-scalar (linear polynomial) terms. The result of such an operation is
always a floating-point number.
  The unary "float" operator may be used to convert an integer value to
floating-point. This operator has the highest precedence.
  The "trunc" is another unary operator with the highest precedence and it can be
applied to floating-point numbers. It extracts the integer part of a number
(it is a truncation toward zero) and the result is always a plain integer, not
a floating-point number. If the argument was already a plain integer, this
operation leaves it unchanged.
  The "bsr" operator can be applied to floating-point numbers and it returns
the exponent of such a number, which is the exponent of the largest power of
two that is not larger than the given number. The sign of the floating-point value
does not affect the result of this operation.
  It is also allowed to use a floating-point number as the first argument
to the "shl" and "shr" operators. The number is then multiplied or divided by the
power of two specified by the second argument.
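  The floating-point operators might be used as in the following sketch (the
symbol names are arbitrary and the comments show the values implied by the
rules above):

        a = float 3             ; a = 3.0
        b = float 7 / 2         ; b = 3.5, evaluated as (float 7) / 2
        c = trunc 3.75          ; c = 3
        d = trunc (-3.75)       ; d = -3, truncation toward zero
        e = bsr 10.0            ; e = 3, since 8 <= 10 < 16
        f = 1.5 shl 2           ; f = 6.0
        g = 6.0 shr 2           ; g = 1.5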
 538 \r
 539 \r
 540 5. Symbol classes\r
 541 \r
 542 There are three distinct classes of symbols, determining the position in\r
 543 source line at which the symbol may be recognized. A symbol belonging to the\r
 544 instruction class is recognized only when it is the first identifier of the\r
 545 command, while a symbol from the expression class is recognized only when used\r
 546 to provide a value of arguments to some command.\r
 547   All the types of definitions that were described in the earlier sections\r
 548 create the expression class symbols. The "label" and "restore" are examples\r
 549 of built-in symbols belonging to the instruction class.\r
 550   In any namespace it is allowed for symbols of different classes to share the\r
 551 same name, for example it is possible to define the instruction named "shl",\r
 552 while there is also an operator with the same name - but an operator belongs\r
 553 to the expression class.\r
 554   It is even possible for a single line to contain the same identifier \r
 555 meaning different things depending on its position:\r
 556 \r
 557         ?restore = 1\r
 558         restore restore ; remove the value of the expression-class symbol\r
 559 \r
 560   The third class of symbols are the labeled instructions. A symbol belonging\r
 561 to this class may be recognized only when the first identifier of the command\r
 562 is not an instruction - in such case the first identifier becomes a label to\r
 563 the instruction defined by the second one. If we treat "=" as a special kind\r
 564 of identifier, it may serve as an example of a labeled instruction.\r
 565   The assembler contains built-in symbols of all classes. Their names are\r
 566 always case-insensitive and they may be redefined, but it is not possible to\r
 567 remove them. When all the values of such symbol are removed with a command\r
 568 like "restore", the built-in value persists.\r
 569   The rules concerning namespaces apply equally to the symbols of all classes,\r
 570 for example a symbol of instruction class belonging to the child namespace of\r
 571 the latest label can be executed by preceding its name with a dot. It should be\r
 572 noted, however, that when a namespace is specified through its parent symbol,\r
 573 it is always a symbol belonging to the expression class. It is not possible to\r
 574 refer to a child namespace of an instruction, only to the namespace belonging\r
 575 to the expression class symbol with the same name.\r
 576 \r
 577         xor?.mask? := 10101010b\r
 578         a = XOR.MASK    ; symbol in the namespace of built-in case-insensitive "XOR"\r
 579 \r
 580         label?.test? := 0\r
 581         a = LABEL.TEST  ; undefined unless "label?" is defined\r
 582 \r
 583 Here the namespace containing "test" belongs to an expression-class symbol,\r
 584 not to the existing instruction "label". When there is no expression-class symbol \r
 585 that would fit the "LABEL" specifier, the namespace chosen is the one that would\r
 586 belong to the case-sensitive symbol of such name. The "test" is therefore not found,\r
 587 because it has been defined in another namespace - the one of case-insensitive "label".\r
 588 \r
 589 \r
 590 6. Generating data\r
 591 \r
 592 The "db" instruction allows to generate bytes of data and put them into the\r
 593 output. It should be followed by one or more values, separated with commas.\r
 594 When the value is numeric, it defines a single byte. When the value is a\r
 595 string, it puts the string of bytes into output.\r
 596 \r
 597         db 'Hello',13,10        ; generate 7 bytes\r
 598 \r
 599 The "dup" keyword may be used to generate the same value multiple times. The\r
 600 "dup" should be preceded by numeric expression defining the number of\r
 601 repetitions, and the value to be repeated should follow. A sequence of values\r
 602 may also be duplicated this way, in such case "dup" should be followed by the\r
 603 entire sequence enclosed in parentheses (with values separated with commas).\r
 604 \r
 605         db 4 dup 90h            ; generate 4 bytes\r
 606         db 2 dup ('abc',10)     ; generate 8 bytes\r
 607 \r
 608   When a special identifier consisting of a lone "?" character is used as a\r
 609 value in the arguments to "db", it reserves a single byte. This advances the\r
 610 address in the output where the next data are going to be put, but the reserved\r
 611 bytes are not generated themselves unless they are followed by some other data.\r
 612 Therefore if the bytes are reserved at the end of output, they do not increase\r
 613 the size of generated file. This kind of data is called uninitialized, while\r
 614 all the regular data are said to be initialized.\r
 615   The "rb" instruction reserves a number of bytes specified by its argument.\r
 616 \r
 617         db ?                    ; reserve 1 byte\r
 618         rb 7                    ; reserve 7 bytes\r
 619 \r
 620   Every built-in instruction that generates data (traditionally called a data\r
 621 directive) is paired with a labeled instruction of the same name. Such command\r
 622 in addition to generating data defines a label at address of generated data,\r
 623 with associated size equal to the size of data unit used by this instruction.\r
 624 In case of "db" and "rb" this size is 1.\r
 625 \r
 626         some db sizeof some     ; generate a byte with value 1\r
 627 \r
 628   The "dw", "dd", "dp", "dq", "dt", "ddq", "dqq" and "ddqq" are instructions\r
 629 analogous to "db" with different sizes of data unit. The order of bytes\r
 630 within a single generated unit is always little-endian. When a string of bytes\r
 631 is provided as the value to any of these instructions, the generated data\r
 632 is extended with zero bytes to a length that is a multiple of the data unit.\r
 633 The "rw", "rd", "rp", "rq", "rt", "rdq", "rqq" and "rdqq" are the instructions\r
 634 that reserve a specified number of data units. The unit sizes associated with\r
 635 all these instructions are listed in table 1.\r
 636   The "dw", "dd", "dq", "dt" and "ddq" instructions allow floating-point\r
 637 numbers as data units. Any such number is then converted into floating-point \r
 638 format appropriate for a given size.\r
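  A short sketch of these rules follows, with the expected bytes given in the
comments. The floating-point line assumes that a 32-bit unit is encoded in the
IEEE 754 single precision format, which the text above does not state
explicitly:

        dw 1234h                ; bytes 34h,12h (little-endian)
        dw 'abc'                ; bytes 'a','b','c',0 (padded to two units)
        dd 1.0                  ; bytes 0,0,80h,3Fh

In the second line three bytes do not fill two 16-bit units, so one zero byte
is appended.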
 639   The "emit" (with a synonym "dbx") is a data directive that uses the size\r
 640 of unit specified by its first argument to generate data defined by\r
 641 the remaining ones. The size may be separated from the next argument with\r
 642 a colon instead of a comma, for better readability. When the unit size\r
 643 is such that it has a dedicated data directive, the definition made with "emit"\r
 644 has the same effect as if these values were passed to the instruction tailored\r
 645 for this size.\r
 646 \r
 647         emit 2: 0,1000,2000      ; generate three 16-bit values\r
 648 \r
 649   The "file" instruction reads the data from an external file and writes it\r
 650 into output. The argument must be a string containing the path to the file, it\r
 651 may optionally be followed by ":" and the numeric value specifying an offset\r
 652 within the file, next it may be followed by comma and the numeric value\r
 653 specifying how many bytes to copy.\r
 654 \r
 655         file 'data.bin'                 ; insert entire file\r
 656         excerpt file 'data.bin':10h,4   ; insert selected four bytes\r
 657 \r
 658 \r
 659    Table 1   Data directives\r
 660   /------------------------------\\r
 661   | Size    | Generate | Reserve |\r
 662   | (bytes) | data     | data    |\r
 663   |=========|==========|=========|\r
 664   | 1       | db       | rb      |\r
 665   |         | file     |         |\r
 666   |---------|----------|---------|\r
 667   | 2       | dw       | rw      |\r
 668   |---------|----------|---------|\r
 669   | 4       | dd       | rd      |\r
 670   |---------|----------|---------|\r
 671   | 6       | dp       | rp      |\r
 672   |---------|----------|---------|\r
 673   | 8       | dq       | rq      |\r
 674   |---------|----------|---------|\r
 675   | 10      | dt       | rt      |\r
 676   |---------|----------|---------|\r
 677   | 16      | ddq      | rdq     |\r
 678   |---------|----------|---------|\r
 679   | 32      | dqq      | rqq     |\r
 680   |---------|----------|---------|\r
 681   | 64      | ddqq     | rdqq    |\r
 682   |---------|----------|---------|\r
 683   | *       | emit     |         |\r
 684   \------------------------------/\r
 685 \r
 686 \r
 687 7. Conditional assembly\r
 688 \r
 689 The "if" instruction causes a block of source text to be assembled only\r
 690 under certain condition, specified by a logical expression that is an argument\r
 691 to this instruction. The "else if" command in the following lines\r
 692 ends the previous conditionally assembled block and opens a new one, assembled\r
 693 only when the previous conditions were not met and the new condition (an\r
 694 argument to "else if") is true. The "else" command ends the previous\r
 695 conditionally assembled block and begins a block that is assembled only when\r
 696 none of the previous conditions was true. The "end if" command should be used\r
 697 to end the entire construction. There may be any number of "else if" commands\r
 698 inside (including none) and no more than one "else".\r
 699   A logical expression is a distinct syntactical entity from the basic\r
 700 expressions that were described earlier. A logical expression consists of\r
 701 logical values connected with logical operators. The logical operators are:\r
 702 unary "~" for negation, "&" for conjunction and "|" for alternative.\r
 703 The negation is evaluated first, while "&" and "|" are simply evaluated\r
 704 from left to right, with no precedence over each other.\r
 705   A logical value in its simplest form may be a basic expression, it then\r
 706 corresponds to true condition if and only if its value is not constant zero.\r
 707 Another way to create a logical value is to compare the values of two basic\r
 708 expressions with one of the following operators: "=" (equal), "<" (less than),\r
 709 ">" (greater than), "<=" (less or equal), ">=" (greater or equal),\r
 710 "<>" (not equal).\r
 711 \r
 712         count = 2\r
 713         if count > 1\r
 714                 db '0'\r
 715                 db count-1 dup ',0'\r
 716         else if count = 1\r
 717                 db '0'\r
 718         end if\r
 719 \r
 720   When linear polynomials are compared this way, the logical value is\r
 721 valid only when they are comparable, which is when they differ in constant\r
 722 term only. Otherwise a condition like equality is neither universally true\r
 723 nor universally false, since it depends on the values substituted for variables,\r
 724 and the assembler signals this as an error.\r
 725   The "relativeto" operator creates a logical value that is true only when\r
 726 the difference of compared values does not contain any variable terms. Therefore\r
 727 it can be used to check whether two linear polynomials are comparable - the\r
 728 "relativeto" condition is true only when both compared polynomials have the same\r
 729 variable terms.\r
 730   Because logical expressions are lazily evaluated, it is possible to create\r
 731 a single condition that will not cause an error when the polynomials are not\r
 732 comparable, but will compare them if they are:\r
 733 \r
 734         if a relativeto b & a > b\r
 735                 db a - b\r
 736         end if\r
 737 \r
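The same left-to-right evaluation means that "&" and "|" have no precedence
over each other. In the following sketch (the symbols "a", "b" and "c" are
chosen only for illustration) the result of "a | b" is combined with "c", so
the condition as a whole is false:

        a = 1
        b = 0
        c = 0
        if a | b & c
                db 1            ; skipped, the whole condition is false
        else
                db 0            ; assembled
        end if

Reordering the condition into "b & c | a" would make it true, since the result
of "b & c" is then combined with "a".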
 738   The "eqtype" operator can also be used to compare two basic expressions,\r
 739 it makes a logical value which is true when the values of the expressions are\r
 740 of the same type - either both are algebraic, both are strings or both are\r
 741 floating-point numbers. An algebraic type covers the linear polynomials and\r
 742 it includes the integer values.\r
 743   The "eq" operator compares two basic expressions and creates a logical value\r
 744 which is true only when their values are of the same type and equal. This operator\r
 745 can be used to check whether a value is a certain string, a certain floating-point\r
 746 number or a certain linear polynomial. It can compare values that are not\r
 747 comparable with "=" operator.\r
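  The difference between these two operators can be sketched as follows. The
values are arbitrary and the comments state what the descriptions above imply:

        if 1 eqtype 2+3 & 'abc' eqtype 'x' & ~ 1 eqtype 1.0
                db 1            ; assembled, only the last pair differs in type
        end if

        if 'abc' eq 'abc' & ~ 1 eq 1.0
                db 2            ; assembled, an integer never equals a float here
        end if

The second condition shows that "eq" does not treat an integer and
a floating-point number as equal, because the types are checked first.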
 748   The "defined" operator creates a logical value combined with a basic expression\r
 749 that follows it. This condition is true when the expression does not contain\r
 750 symbols that have no accessible definition. The expression is only tested for the\r
 751 availability of its components, it does not need to have a computable value.\r
 752 This can be used to check whether a symbol of expression class has been defined,\r
 753 but since the symbol can be accessible through forward-referencing, this condition\r
 754 may be true even when the symbol is defined later in source. If this is undesirable,\r
 755 the "definite" operator should be used instead, as it checks whether all symbols\r
 756 within a basic expression that follows have been defined earlier.\r
 757   The basic expression that follows "defined" is also allowed to be empty and\r
 758 the condition is then trivially satisfied. This does not apply to "definite".\r
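  A minimal sketch of the difference between the two operators, assuming
a symbol named "later" (a name chosen only for this illustration) that is
defined once, below the conditions that test it:

        if defined later
                db 1            ; assembled, "later" is reachable by forward reference
        end if
        if definite later
                db 2            ; skipped, "later" has no definition above this line
        end if
        later := 5

If the definition of "later" was removed, both conditions would be false.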
 759   The "used" operator forms a logical value if it is followed by a single\r
 760 identifier. This condition is true when the value of specified symbol has\r
 761 been used anywhere in the source.\r
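  A typical application is to generate data only when something refers to it.
The following sketch uses a hypothetical label "table":

        if used table
                table db 1,2,3  ; generated only because "table" is referenced below
        end if

        dw table                ; this reference makes the condition true

Without the "dw" line nothing would refer to "table" and the block would be
skipped.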
 762   The "assert" is an instruction that signals an error when a condition\r
 763 specified by its argument is not met.\r
 764 \r
 765         assert a < 65536\r
 766 \r
 767 \r
 768 8. Macroinstructions\r
 769 \r
 770 The "macro" command allows to define a new instruction, in form of a\r
 771 macroinstruction. The block of source text between the "macro" and\r
 772 "end macro" command becomes the text of macroinstruction and this sequence\r
 773 of lines is assembled in place of the original command that starts with\r
 774 identifier of instruction defined this way.\r
 775 \r
 776         macro null\r
 777                 db 0\r
 778         end macro\r
 779 \r
 780         null            ; "db 0" is assembled here\r
 781 \r
 782   The macroinstruction is allowed to have arguments only when the\r
 783 definition contains them. After "macro" and the identifier of the defined\r
 784 symbol there may optionally come a list of simple names separated with commas;\r
 785 these names define the parameters of the macroinstruction. When this instruction\r
 786 is then used, it may be followed by at most the same number of arguments\r
 787 separated with commas, and their values are assigned to the consecutive\r
 788 parameters. Before any line of text inside the macroinstruction is interpreted,\r
 789 the name tokens that correspond to any of the parameters are replaced with their\r
 790 assigned values.\r
 791 \r
 792         macro lower name,value\r
 793                 name = value and 0FFh\r
 794         end macro\r
 795 \r
 796         lower a,123h    ; a = 23h\r
 797 \r
 798 The value of a parameter can be any text, not necessarily a correct expression.\r
 799 If a line calling the macroinstruction contains fewer arguments than the\r
 800 number of defined parameters, the excess parameters receive the empty values.\r
 801   When a name of a parameter is defined, it may be followed by "?" character\r
 802 to denote that it is case-insensitive, analogously to a name in a symbol\r
 803 identifier. There must be no whitespace between the name and "?".\r
 804 A definition of a parameter may also be followed by "*" to denote that it\r
 805 requires a value that is not empty, or alternatively by ":" character\r
 806 followed by a default value, which is assigned to the parameter instead of\r
 807 an empty one when no other value is provided.\r
 808 \r
 809         macro prepare name*,value:0\r
 810                 name = value\r
 811         end macro\r
 812 \r
 813         prepare x       ; x = 0\r
 814         prepare y,1     ; y = 1\r
 815 \r
 816   If an argument to macroinstruction needs to contain a comma character, the\r
 817 entire argument must be enclosed between the "<" and ">" characters (they do\r
 818 not become a part of the value). If another "<" character is encountered inside\r
 819 such value, it must be balanced with corresponding ">" character inside the\r
 820 same value.\r
 821 \r
 822         macro data name,value\r
 823                 name:\r
 824                 .data db value\r
 825                 .end:\r
 826         end macro\r
 827 \r
 828         data example, <'abc',10>\r
 829 \r
 830   The last defined parameter may be followed by "&" character to denote that\r
 831 this parameter should be assigned a value containing the entire remaining\r
 832 part of line, even if it normally would define multiple arguments. Therefore\r
 833 when macroinstruction has just one parameter followed by "&", the value of\r
 834 this parameter is the entire text of arguments following the instruction.\r
 835 \r
 836         macro id first,rest&\r
 837                 dw first\r
 838                 db rest\r
 839         end macro\r
 840 \r
 841         id 2, 7,1,8\r
 842 \r
 843   When a name of a parameter is to be replaced with its value and it is\r
 844 preceded by "`" character (without any whitespace in between), the text of\r
 845 the value is embedded into a quoted string and this string replaces\r
 846 both the "`" character and the name of parameter.\r
 847 \r
 848         macro text line&\r
 849                 db `line\r
 850         end macro\r
 851 \r
 852         text x+1        ; db 'x+1'\r
 853 \r
 854   The "local" is a command that may only be used inside a macroinstruction.\r
 855 It should be followed by one or more names separated with commas, and it\r
 856 declares that the names from this list should in the context of current\r
 857 macroinstruction be interpreted as belonging to a special namespace\r
 858 associated with this macroinstruction instead of current base namespace. This\r
 859 allows to create unique symbols every time the macroinstruction is called.\r
 860 Such declaration defines additional parameters with the specified names and\r
 861 therefore only affects the uses of those names that follow within the same\r
 862 macroinstruction. Declaring the same name as local multiple times within\r
 863 the same macroinstruction gives no additional effect.\r
 864 \r
 865         macro measured name,string\r
 866                 local top\r
 867                 name db string\r
 868                 top: name.length = top - name\r
 869         end macro\r
 870 \r
 871         measured hello, 'Hello!'        ; hello.length = 6\r
 872 \r
 873 A parameter created with "local" becomes replaced with a text that contains\r
 874 the same name as the name of parameter, but has added context information\r
 875 that causes it to be identified as belonging to the unique local namespace\r
 876 associated with the instance of macroinstruction. This kind of context \r
 877 information is going to be discussed further in the section about\r
 878 symbolic variables.\r
 879   A symbol that is local to a macroinstruction is never considered the most\r
 880 recent label that is base for symbols starting with dot. Moreover, its\r
 881 descendant namespace is disconnected from the main tree of symbols, so if\r
 882 "namespace" command was used with a local symbol as the argument, symbols\r
 883 from the main tree would no longer be visible (including all the named \r
 884 instructions of the assembler, even commands like "end namespace").\r
 885   Just like an expression symbol may be redefined and refer to its previous\r
 886 value in the definition of the new one, the macroinstructions can also be\r
 887 redefined, and use the previous value of this instruction symbol in its\r
 888 text:\r
 889 \r
 890         macro zero\r
 891                 db 0\r
 892         end macro\r
 893 \r
 894         macro zero name\r
 895                 label name:byte\r
 896                 zero\r
 897         end macro\r
 898 \r
 899         zero x\r
 900 \r
 901 And just like other symbols, a macroinstruction may be forward-referenced when\r
 902 it is defined exactly once in the entire source.\r
 903   The "purge" command discards the definition of a symbol just like "restore",\r
 904 but it does so for the symbol of instruction class. It behaves in the same\r
 905 way as "restore" in all the other aspects. A macroinstruction can remove its\r
 906 own definition with "purge".\r
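  The following sketch, reusing the "zero" macroinstruction from the earlier
sample purely as an illustration, shows "purge" bringing back the previous
definition:

        macro zero
                db 0
        end macro

        macro zero
                db 0,0
        end macro

        zero                    ; db 0,0
        purge zero
        zero                    ; db 0

A second "purge zero" would remove the remaining definition as well, and "zero"
would no longer be recognized as an instruction.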
 907   It is possible for a macroinstruction to use its own value in a recursive way,\r
 908 but to avoid inadvertent infinite recursion this feature is only available when\r
 909 the macroinstruction is marked as such by following its identifier with ":"\r
 910 character.\r
 911 \r
 912         macro factorial: n\r
 913                 if n\r
 914                         factorial n-1\r
 915                         result = result * (n)\r
 916                 else\r
 917                         result = 1\r
 918                 end if\r
 919         end macro\r
 920 \r
 921 In addition to allowing recursion, such macroinstruction behaves like a constant.\r
 922 It cannot be redefined and "purge" cannot be applied to it.\r
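  Another small sketch of a recursive macroinstruction, with the hypothetical
name "countdown", generates a descending sequence of bytes:

        macro countdown: n
                db n
                if n
                        countdown n-1
                end if
        end macro

        countdown 3             ; generates the bytes 3,2,1,0

Because parameters are replaced as text, the nested calls receive "3-1",
"3-1-1" and "3-1-1-1" as their arguments.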
 923   A macroinstruction may in turn define another macroinstruction or a number\r
 924 of them. The blocks designated by "macro" and "end macro" must be properly\r
 925 nested one within the other for such definition to be accepted by the\r
 926 assembler.\r
 927 \r
 928         macro enum enclosing\r
 929                 counter = 0\r
 930                 macro item name\r
 931                         name := counter\r
 932                         counter = counter + 1\r
 933                 end macro\r
 934                 macro enclosing\r
 935                         purge item,enclosing\r
 936                 end macro\r
 937         end macro\r
 938 \r
 939         enum x\r
 940                 item a\r
 941                 item b\r
 942                 item c\r
 943         x\r
 944 \r
 945   When it is required that macroinstruction generates unpaired "macro" or\r
 946 "end macro" command, it can be done with special "esc" instruction. Its\r
 947 argument becomes a part of macroinstruction, but is not being taken into\r
 948 account when counting the nested "macro" and "end macro" pairs.\r
 949 \r
 950         macro xmacro name\r
 951                 esc macro name x&\r
 952         end macro\r
 953 \r
 954         xmacro text\r
 955                 db `x\r
 956         end macro\r
 957 \r
 958 If "esc" is placed inside a nested definition, it is not processed until\r
 959 the innermost macroinstruction becomes defined. This allows a definition\r
 960 containing "esc" to be placed inside another macroinstruction without having\r
 961 to repeat "esc" for every nesting level.\r
 962   When an identifier of a macroinstruction in its definition is followed by "!"\r
 963 character, it defines an unconditional macroinstruction. This is a special\r
 964 kind of instruction class symbol, which is evaluated even in places where the\r
 965 assembly is suspended - like inside a conditional block whose condition is\r
 966 false, or inside a definition of another macroinstruction. This allows to\r
 967 define instructions that can be used where otherwise a directly stated\r
 968 "end if" or "end macro" would be required, as in the following example:\r
 969 \r
 970         macro proc name\r
 971                 name:\r
 972                 if used name\r
 973         end macro\r
 974 \r
 975         macro endp!\r
 976                 end if\r
 977                 .end:\r
 978         end macro\r
 979 \r
 980         proc tester\r
 981                 db ?\r
 982         endp\r
 983 \r
 984 If the macroinstruction "endp" in the above sample was not defined as an\r
 985 unconditional one and the block started with "if" was being skipped, the\r
 986 macroinstruction would not get evaluated, and this would lead to an error\r
 987 because "end if" would be missing.\r
 988   It should be noted that "end" command executes an instruction identified\r
 989 by its argument in the child namespace of case-insensitive "end" symbol. \r
 990 Therefore command like "end if" could be alternatively invoked with\r
 991 an "end.if" identifier, and it is possible to override any such instruction\r
 992 by redefining a symbol in the "end?" namespace. Moreover, any instruction\r
 993 defined within the "end?" namespace can then be called with the "end" command.\r
 994 This slightly modified variant of the above sample puts these facts to use:\r
 995 \r
 996         macro proc name\r
 997                 name:\r
 998                 if used name\r
 999         end macro\r
1000 \r
1001         macro end?.proc!\r
1002                 end if\r
1003                 .end:\r
1004         end macro\r
1005 \r
1006         proc tester\r
1007                 db ?\r
1008         end proc\r
1009 \r
1010 A similar rule applies to the "else" command and the instructions in the\r
1011 "else?" namespace.\r
1012   When an identifier consisting of a lone "?" character is used as an\r
1013 instruction symbol in the definition of macroinstruction, it defines a special\r
1014 instruction that is then called every time a line to be assembled does not\r
1015 contain an unconditional instruction, and the complete text of line becomes\r
1016 the arguments to this macroinstruction. This special symbol can also be defined\r
1017 as an unconditional instruction, and then it is called for every following line\r
1018 with no exception. This allows to completely override the assembly process on\r
1019 portions of the text. The following sample defines a macroinstruction which\r
1020 allows to define a block of comments by skipping all the lines of text until it\r
1021 encounters a line with content equal to the argument given to "comment".\r
1022 \r
1023         macro comment? ender\r
1024                 macro ?! line&\r
1025                         if `line = `ender\r
1026                                 purge ?\r
1027                         end if\r
1028                 end macro\r
1029         end macro\r
1030 \r
1031         comment ~\r
1032                  Any text may follow here.\r
1033         ~\r
1034 \r
1035   The "mvmacro" is an instruction that takes two arguments, both identifying\r
1036 instruction-class symbols. The definition of a macroinstruction specified\r
1037 by the second argument is moved to the symbol identified by the first one.\r
1038 For the second symbol the effect of this command is the same as of "purge".\r
1039 This allows to effectively rename a macroinstruction, or temporarily disable it\r
1040 only to bring it back later. The symbols affected by this operation become\r
1041 variables and cannot be forward-referenced.\r
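  A minimal sketch of renaming, with the hypothetical names "hello" and
"greet":

        macro hello
                db 'hello'
        end macro

        mvmacro greet, hello    ; "greet" takes over the definition of "hello"
        greet                   ; db 'hello'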
1042 \r
1043 \r
1044 9. Labeled macroinstructions\r
1045 \r
1046 The "struc" command allows to define a labeled instruction, in form of a\r
1047 macroinstruction. Except for the fact that such definition must be closed\r
1048 with "end struc" instead of "end macro", these macroinstructions are defined\r
1049 in the same way as with "macro" command. A labeled instruction is evaluated\r
1050 when the first identifier of a command is not an instruction and the second\r
1051 identifier is of the labeled instruction class:\r
1052 \r
1053         struc some\r
1054                 db 1\r
1055         end struc\r
1056 \r
1057         get some        ; "db 1" is assembled here\r
1058 \r
1059   Inside a labeled macroinstruction identifiers starting with dot no longer\r
1060 refer to the namespace of a previously defined regular label. Instead they\r
1061 refer to the namespace of label with which the instruction was labeled.\r
1062 \r
1063         struc POINT\r
1064                 label . : qword\r
1065                 .x dd ?\r
1066                 .y dd ?\r
1067         end struc\r
1068 \r
1069         my POINT        ; defines my.x and my.y\r
1070 \r
1071 Note that the parent symbol, which can be referred to by the "." identifier, is not\r
1072 defined unless an appropriate definition is generated by the macroinstruction.\r
1073 Furthermore, this symbol is not considered the most recent label in\r
1074 the surrounding namespace unless it gets defined as an actual label in\r
1075 the macroinstruction it labeled.\r
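  Since labeled macroinstructions accept parameters in the same way as the
ones defined with "macro", the two features can be combined. The following is
only a sketch, with "pair", "first" and "second" being names invented for this
illustration:

        struc pair first,second
                label . : word
                .lo db first
                .hi db second
        end struc

        p pair 1,2              ; defines p, p.lo and p.hi
        db sizeof p             ; db 2

The label definition inside the macroinstruction is what makes "p" itself
a defined symbol; without it only "p.lo" and "p.hi" would exist.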
1076   For an easier use of this feature, other syntaxes may be defined with\r
1077 macroinstructions, like in this sample:\r
1078 \r
1079         macro struct? definition&\r
1080                 esc struc definition\r
1081                         label . : .%top - .\r
1082                         namespace .\r
1083         end macro\r
1084 \r
1085         macro ends?!\r
1086                                 %top:\r
1087                         end namespace\r
1088                 esc end struc\r
1089         end macro\r
1090 \r
1091         struct POINT vx:?,vy:?\r
1092                 x dd vx\r
1093                 y dd vy\r
1094         ends\r
1095 \r
1096         my POINT 10,20\r
1097 \r
1098   The "restruc" command is analogous to "purge", but it operates on symbols\r
1099 from the class of labeled instructions. Similarly, the "mvstruc" command is \r
1100 the same as "mvmacro" but for labeled instructions.\r
1101   As with "macro", it is possible to use an identifier consisting of a lone "?"\r
1102 character with "struc". It defines a special labeled macroinstruction that is\r
1103 called every time the first symbol of a line is not recognized as an instruction.\r
1104 Everything that follows that first identifier becomes the arguments to labeled\r
1105 macroinstruction. The following sample uses this feature to catch any orphaned\r
1106 labels (the ones that are not followed by any character) and treat them as regular\r
1107 ones instead of causing an error. It achieves it by making ":" the default value\r
1108 for "def" parameter:\r
1109 \r
1110         struc ? def::&\r
1111                 . def\r
1112         end struc\r
1113 \r
1114         orphan\r
1115         regular:\r
1116         assert orphan = regular\r
1117 \r
1118 Similarly to "macro" this special variant does not override unconditional labeled\r
1119 instructions unless it is unconditional itself.\r
1120   While "." provides an efficient method of accessing the label symbol, \r
1121 sometimes it may be needed to process the actual text of the label.\r
1122 A special parameter can be defined for this purpose and its name should be\r
1123 inserted enclosed in parentheses before the name of labeled macroinstruction:\r
1124 \r
1125         struc (name) SYMBOL\r
1126                 . db `name,0\r
1127         end struc\r
1128 \r
1129         test SYMBOL\r
1130 \r
1131 \r
1132 10. Symbolic variables and recognition context\r
1133 \r
1134 The "equ" is a built-in labeled instruction that defines symbol of expression\r
1135 class with a symbolic value. Such value can contain any text (even an empty\r
1136 one) and when it is used in an expression it is equivalent to inserting\r
1137 the text of its value in place of its identifier, with an effect similar to\r
1138 evaluation of a parameter of macroinstruction.\r
1139   This can lead to different results than when a standard variable defined\r
1140 with "=" is used, as the following example demonstrates:\r
1141 \r
1142         numeric = 2 + 2\r
1143         symbolic equ 2 + 2\r
1144         x = numeric*3           ; x = 4*3\r
1145         y = symbolic*3          ; y = 2 + 2*3\r
1146 \r
1147 While "x" is assigned the value of 12, the value of "y" is 8. This shows that\r
1148 the use of such symbols can lead to unintended interactions and therefore\r
1149 definitions of this type should be avoided unless really necessary.\r
1150   The "equ" allows redefinitions, and it preserves the previous value of\r
1151 symbol analogously to the "=:" command, so the earlier value can be brought\r
1152 back with "restore" instruction. To replace the symbolic value (analogously\r
1153 to how "=" overwrites the regular value) the "reequ" command should be used\r
1154 instead of "equ".\r
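  The difference between the two commands can be sketched with a symbol "v"
(a name chosen only for illustration):

        v equ 1
        v equ 2                 ; the value 1 is preserved underneath
        v reequ 3               ; replaces 2 instead of stacking on top of it
        db v                    ; db 3
        restore v
        db v                    ; db 1

Had the third line used "equ", the "restore" would bring back the value 2
instead.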
1155   A symbolic value, in addition to retaining the exact text it was defined\r
1156 with, preserves the context in which the symbols contained in this text are\r
1157 to be interpreted. Therefore it can effectively become a reliable link to\r
1158 value of some other symbol, lasting even when it is used in a different\r
1159 context (this includes change of the base namespace or a symbol referred by\r
1160 a starting dot):\r
1161 \r
1162         first:\r
1163                 .x = 1\r
1164                 link equ .x\r
1165                 .x = 2\r
1166         second:\r
1167                 .x = 3\r
1168                 db link         ; db 2\r
1169 \r
1170   It should be noted that the same process is applied to the arguments of any\r
1171 macroinstruction when they become preprocessed parameters. If during\r
1172 the execution of a macroinstruction the context changes, the identifiers\r
1173 within the text of parameters still refer to the same symbols as in the line\r
1174 that called the instruction:\r
1175 \r
1176         x = 1\r
1177         namespace x\r
1178                 x = 2\r
1179         end namespace\r
1180         macro prodx value\r
1181                 namespace x\r
1182                         db value*x\r
1183                 end namespace\r
1184         end macro\r
1185         prodx x         ; db 1*2\r
1186 \r
1187 Furthermore, parameters defined with "local" command use the same mechanism\r
1188 to alter the context in which given name is interpreted, without altering\r
1189 the text of the name. However, such modified context is not relevant\r
1190 if the value of parameter is inserted in the middle or at the end of\r
1191 a complex identifier, because it is the structure of an identifier that\r
1192 dictates how its later parts are interpreted and only the context for an\r
1193 initial part matters. For example, prepending a name of a parameter with\r
1194 "#" character is going to cause the identifier to use current context instead\r
1195 of context carried by the text of that parameter, because initial context\r
1196 for the identifier is then the context associated with text "#".\r
1197   Unlike the value of a symbolic variable, the body of a macroinstruction\r
1198 by itself carries no context (although it may contain snippets of text that\r
1199 came from replaced parameters and because of that have some context associated\r
1200 with them). Also, if a macroinstruction becomes unrolled at the time when\r
1201 another one is being defined (this can only happen when called macroinstruction\r
1202 is unconditional), no context information is added to the arguments, to aid in\r
1203 preservation of this context-lessness.\r
1204   If the text following "equ" contains identifiers of known symbolic variables, \r
1205 each of them is replaced with its contents and it is such processed text that\r
1206 gets assigned to the newly defined symbol.\r
1207   The "define" is a regular instruction that also creates a symbolic value,\r
1208 but as opposed to "equ" it does not evaluate symbolic variables in the\r
1209 assigned text. It should be followed by an identifier of symbol to be defined\r
1210 and then by the text of the value.\r
1211   The difference between "equ" and "define" is often not noticeable, because\r
1212 when used in final expression the symbolic variables are nestedly evaluated\r
1213 until only the usable constituents of expressions are left. A possible use of\r
1214 "define" is to create a link to another symbolic variable, like the following\r
1215 example demonstrates:\r
1216 \r
1217         a equ 0*\r
1218         x equ -a\r
1219         define y -a\r
1220         a equ 1*\r
1221         db x 2          ; db -0*2\r
1222         db y 2          ; db -1*2\r
1223 \r
1224 The other uses of "define" will arise in the later sections, with the\r
1225 introduction of other instructions that operate on symbolic values.\r
1226   The "define", like "equ", preserves the previous value of symbol. The\r
1227 "redefine" is a variant of this instruction that discards the earlier value,\r
1228 analogously to "reequ".\r
1229   Note that while symbolic variables belong to the expression class of symbols,\r
1230 their state cannot be determined with operators like "defined", "definite", \r
1231 or "used", because a logical expression is evaluated as if every symbolic\r
1232 variable was replaced with the text of corresponding value. Therefore operator\r
1233 followed by an identifier of symbolic variable is going to be applied to\r
1234 the content of this variable, whatever it is. For example if a symbolic variable\r
1235 is made which is a link to a regular symbol, then any operator like "defined"\r
1236 followed by the identifier of said symbolic variable is going to determine\r
1237 the status of a linked symbol, not a linking variable.\r
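  A short sketch of this effect, with the hypothetical names "link" and
"target" chosen only for illustration:

        define link target
        if defined link         ; examines "target", the content of "link"
                display 'target is defined'
        else
                display 'target is not defined'
        end if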
1238 \r
1239 \r
1240 11. Repeating blocks of instructions\r
1241 \r
1242 The "repeat" instruction allows to assemble a block of instructions multiple\r
1243 times, with the number of repetitions specified by the value of its argument.\r
1244 The block of instructions should be ended with "end repeat" command. A synonym \r
1245 "rept" can be used instead of "repeat".\r
1246 \r
1247         a = 2\r
1248         repeat a + 3\r
1249                 a = a + 1\r
1250         end repeat\r
1251         assert a = 7\r
1252 \r
1253   The "while" instruction causes the block of instructions to be assembled\r
1254 repeatedly as long as the condition specified by its argument is true. Its\r
1255 argument should be a logical expression, like an argument for "if" or\r
1256 "assert". The block should be closed with "end while" command.\r
1257 \r
1258         a = 7\r
1259         while a > 4\r
1260                 a = a - 2\r
1261         end while\r
1262         assert a = 3\r
1263 \r
1264   The "%" is a special parameter which is preprocessed inside the repeated\r
1265 block of instructions and is replaced with a decimal number being the number\r
1266 of current repetition (starting with 1). It works in a similar way to a\r
1267 parameter of macroinstruction, so it is replaced with its value before the\r
1268 actual command is processed and so it can be used to create symbol\r
1269 identifiers containing the number as a part of name:\r
1270 \r
1271         repeat 16\r
1272                 f#% = 1 shl %\r
1273         end repeat\r
1274 \r
1275 The above example defines symbols "f1" to "f16" with values being the\r
1276 consecutive powers of two.\r
1277   The "repeat" instruction can have additional arguments, separated with\r
1278 commas, each containing a name of supplementary parameters specific to this\r
1279 block. Each of the names can be followed by ":" character and the expression\r
1280 specifying the base value from which the parameter is going to start counting\r
1281 the repetitions. This allows to easily change the previous sample to define\r
1282 the range of symbols from "f0" to "f15":\r
1283 \r
1284         repeat 16, i:0\r
1285                 f#i = 1 shl i\r
1286         end repeat\r
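
More than one supplementary parameter can be declared in this way. The sketch
below assumes two hypothetical names, "i" and "j", counting from 0 and from 10
respectively:

        repeat 4, i:0, j:10
                db i, j         ; db 0,10 then db 1,11 and so on up to db 3,13
        end repeat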
1287 \r
1288   The "%%" is another special parameter that has a value equal to the total\r
1289 number of repetitions planned. This parameter is undefined inside the "while"\r
1290 block. The following example uses it to create the sequence of bytes with\r
1291 values descending from 255 to 0:\r
1292 \r
1293         repeat 256\r
1294                 db %%-%\r
1295         end repeat\r
1296 \r
1297   The "break" instruction allows to stop the repeating prematurely. When it\r
1298 is encountered, it causes the rest of repeated block to be skipped and no\r
1299 further repetitions to be executed. It can be used to stop the repeating if\r
1300 a certain condition is met:\r
1301 \r
1302         s = x/2\r
1303         repeat 100\r
1304                 if x/s = s\r
1305                         break\r
1306                 end if\r
1307                 s = (s+x/s)/2\r
1308         end repeat\r
1309 \r
1310 The above sample tries to find the square root of the value of symbol "x",\r
1311 which is assumed defined elsewhere. It can easily be rewritten to perform the\r
1312 same task with "while" instead of "repeat":\r
1313 \r
1314         s = x/2\r
1315         while x/s <> s\r
1316                 s = (s+x/s)/2\r
1317                 if % = 100\r
1318                         break\r
1319                 end if\r
1320         end while\r
1321 \r
1322   The "iterate" instruction (with a synonym "irp") repeats the block of \r
1323 instructions while iterating through the list of values separated with commas. \r
1324 The first argument to "iterate" should be a name of parameter, followed by\r
1325 the comma and then a list of values. During each iteration the parameter\r
1326 receives one of the values from the list.\r
1327 \r
1328         iterate value, 1,2,3\r
1329                 db value\r
1330         end iterate\r
1331 \r
1332 Like it is in the case of an argument to macroinstruction, the value of parameter\r
1333 that contains commas needs to be enclosed with "<" and ">" characters. It is\r
1334 also possible to enclose the first argument to "iterate" with "<" and ">", in\r
1335 order to define multiple parameters. The list of values is then divided\r
1336 into sections containing as many values as there are parameters, and each\r
1337 iteration operates on one such section, assigning to each parameter a\r
1338 corresponding value:\r
1339 \r
1340         iterate <name,value>, a,1, b,2, c,3\r
1341                 name = value\r
1342         end iterate\r
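
A value that itself contains commas can be passed as in the following sketch,
where the list is assumed to hold three values (the parameter name "data" is
only illustrative):

        iterate data, <1,2>, 3, <4,5,6>
                db data         ; db 1,2 then db 3 then db 4,5,6
        end iterate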
1343 \r
1344 The name of a parameter can also, like in the case of macroinstructions, be\r
1345 followed by "*" to require that the parameter has a value that is not empty,\r
1346 or ":" and a default value. If an "iterate" statement ends with a comma not\r
1347 followed by anything else, it is not interpreted as an additional empty value;\r
1348 to put a blank value at the end of list an empty enclosing "<>" needs to be used.\r
1349   The "break" instruction plus both the "%" and "%%" parameters can be used\r
1350 inside the "iterate" block with the same effects as in case of "repeat".\r
1351   The "indx" is an instruction that can only be used inside an iterated\r
1352 block and it changes the values of all the iterated parameters to the ones\r
1353 corresponding to iteration with number specified by the argument to "indx" (but\r
1354 when the next iteration is started, the values of parameters are again assigned\r
1355 the normal way). This allows to process the iterated values in a different\r
1356 order. In the following example the values are processed from the last to the\r
1357 first:\r
1358 \r
1359         iterate value, 1,2,3\r
1360                 indx 1+%%-%\r
1361                 db value\r
1362         end iterate\r
1363 \r
1364 With "indx" it is even possible to move the view of iterated values many times\r
1365 during the single repetition. In the following example the entire processing\r
1366 is done during the first repetition of iterated block and then the "break"\r
1367 instruction is used to prevent further iterations:\r
1368 \r
1369         iterate str, 'alpha','beta','gamma'\r
1370                 repeat %%\r
1371                         dw offset#%\r
1372                 end repeat\r
1373                 repeat %%\r
1374                         indx %\r
1375                         offset#% db str\r
1376                 end repeat\r
1377                 break\r
1378         end iterate\r
1379 \r
1380   The parameters defined by "iterate" do not attach the context to iterated\r
1381 values, but neither do they remove the original context if such is already\r
1382 attached to the text of arguments. So if the values given to "iterate" were\r
1383 themselves created from another parameter that preserved the original context\r
1384 for the symbol identifiers (like the parameter of macroinstruction), then this\r
1385 context is preserved, but otherwise "iterate" defines just a plain text\r
1386 substitution.\r
1387   The parameters defined by instructions like "iterate" or "repeat" are\r
1388 processed everywhere in the text of associated block, but with some limitations\r
1389 if the block is defined partly by the text of macroinstruction and partly in \r
1390 other places. In that case the parameters are only accessible in the parts of \r
1391 the block that are defined in the same place as the initial command.\r
1392   Every time a parameter is defined, its name can have the "?" character\r
1393 attached to it to indicate that this parameter is case-insensitive. However\r
1394 when parameters are recognized inside the preprocessed line, it does not matter\r
1395 whether they are followed by "?" there. The only modifier that is recognized\r
1396 by preprocessor when it replaces the parameter with its value is the "`"\r
1397 character.\r
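  For instance, in the following sketch the parameter is declared as "value?"
and is therefore recognized also when written in upper case:

        iterate value?, 1,2,3
                db VALUE        ; db 1 then db 2 then db 3
        end iterate

Without the "?" in the declaration, "VALUE" would be treated as an ordinary
symbol identifier.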
1398   The repeating instructions together with "if" belong to a group called\r
1399 control directives. They are the instructions that control the flow of\r
1400 assembly. Each of them defines its own block of subordinate instructions,\r
1401 closed with corresponding "end" command, and if these blocks are nested within\r
1402 each other, it always must be a proper nesting - the inner block must always\r
1403 be closed before the outer one. All control directives are therefore the\r
1404 unconditional instructions - they are recognized even when they are inside\r
1405 an otherwise skipped block.\r
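  A minimal sketch of this rule, with a block that is never assembled but
whose structure is still tracked:

        if 0
                repeat 2
                        db %    ; skipped
                end repeat      ; recognized, closes the "repeat" block
        end if                  ; closes the "if" block

Because "repeat" and "end repeat" are recognized inside the skipped block,
the "end if" is correctly paired with the initial "if".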
1406   The "postpone" is another control directive, which causes a block of\r
1407 instructions to be assembled later, when all of the following source text\r
1408 has already been processed.\r
1409 \r
1410         dw final_count\r
1411         postpone\r
1412                 final_count = counter\r
1413         end postpone\r
1414         counter = 0\r
1415 \r
1416 The above sample postpones the definition of "final_count" symbol until the\r
1417 entire source has been processed, so that it can access the final value of\r
1418 "counter" variable.\r
1419   The assembly of the source text that follows "postpone" includes the assembly\r
1420 of any additional blocks declared with "postpone", therefore if there are\r
1421 multiple such blocks, they are assembled in the reverse order. The one that\r
1422 was declared last is assembled first when the end of the source text is reached.\r
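  The order can be observed with a sketch like the following one (the
messages are only illustrative):

        postpone
                display 'declared first, assembled second',13,10
        end postpone
        postpone
                display 'declared second, assembled first',13,10
        end postpone

Both blocks are assembled only after the end of the source text has been
reached.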
1423   When the "postpone" directive is provided with an argument consisting of\r
1424 a single "?" character, it tells the assembler that the block contains\r
1425 operations which should not affect any of the values defined in the main\r
1426 source and thus the assembler may refrain from evaluating them until all\r
1427 other values have been successfully resolved. Such blocks are processed\r
1428 even later than the ones declared by "postpone" with no arguments. They\r
1429 may be used to perform some finalizing tasks, like the computation of a\r
1430 checksum of the assembled code.\r
1431   The "irpv" is another repeating instruction and an iterator. It has just two\r
1432 arguments, first being a name of parameter and second an identifier of \r
1433 a variable. It iterates through all the stacked values of symbolic\r
1434 variable, starting from the oldest one (this applies only to the values\r
1435 defined earlier in the source).\r
1436 \r
1437         var equ 1\r
1438         var equ 2\r
1439         var equ 3\r
1440         var reequ 4\r
1441         irpv param, var\r
1442                 db param\r
1443         end irpv\r
1444 \r
1445 In the above example there are three iterations, with values 1, 2, and 4.\r
1446   "irpv" can effectively convert a value of symbolic variable into a parameter,\r
1447 and this can be useful all by itself, because the symbolic variable is only\r
1448 evaluated in the expressions inside the arguments of instructions (labeled or\r
1449 not), while the parameters are preprocessed in the entire line before any\r
1450 processing of command is started. This allows, for example, to redefine a\r
1451 regular value that is linked by symbolic variable:\r
1452 \r
1453         x = 1\r
1454         var equ x\r
1455         irpv symbol, var\r
1456                 indx %%\r
1457                 symbol = 2\r
1458                 break\r
1459         end irpv\r
1460         assert x = 2\r
1461 \r
1462 The combination of "indx" and "break" was added to the above sample to limit\r
1463 the iteration to the latest value of symbolic variable. In the next section\r
1464 a better solution to the same problem will be presented.\r
1465   When a variable passed to "irpv" has a value that is not symbolic, the\r
1466 parameter is given a text that produces the same value upon computation. When\r
1467 the value is a positive number, the parameter is replaced with its decimal\r
1468 representation (similarly to how the "%" parameter is processed), otherwise\r
1469 the parameter is replaced with an identifier of a proxy symbol holding the\r
1470 value from stack.\r
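  As a sketch of the numeric case, assuming a variable "n" with two values
stacked by "=:":

        n =: 1
        n =: 2
        irpv value, n
                db value        ; db 1 then db 2
        end irpv

Both values are positive numbers, so each time the parameter is replaced with
plain decimal text.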
1471   The "outscope" directive is available while any macroinstruction is processed,\r
1472 and it modifies the command that follows in the same line. If the command causes\r
1473 any parameters to be defined, they are created not in the context of currently\r
1474 processed macroinstruction but in the context of the source text that called it.\r
1475 \r
1476         macro irpv?! statement&\r
1477                 display 'IRPV wrapper'\r
1478                 esc outscope irpv statement\r
1479         end macro\r
1480 \r
1481 This allows not only to safely wrap some control directives in macroinstructions,\r
1482 but also to create additional customized language constructions that define\r
1483 parameters for a block of text. Because "outscope" needs to be present in the\r
1484 text of a specific macroinstruction that requires it, it is recommended to use\r
1485 it in conjunction with "esc" as in the example above; this ensures that it is\r
1486 handled the same way even when the entire definition is put inside another\r
1487 macroinstruction.\r
1488 \r
1489 \r
1490 12. Matching parameters\r
1491 \r
1492 The "match" is a control directive which causes its block of instructions to\r
1493 be assembled only when the text specified by its second argument matches the\r
1494 pattern given by the first one. A text is separated from a pattern with a comma\r
1495 character, and it includes everything that follows this separator up to the end\r
1496 of line.\r
1497   Every special character (except for the "," and "=", which have a specific\r
1498 meaning in the pattern) is matched literally - it must be paired with identical\r
1499 token in the text. In the following example the content of the first block\r
1500 is assembled, while the content of the second one is not.\r
1501 \r
1502         match +,+\r
1503                 assert 1        ; positive match\r
1504         end match\r
1505 \r
1506         match +,-\r
1507                 assert 0        ; negative match\r
1508         end match\r
1509 \r
1510   The quoted strings are also matched literally, but name tokens in the pattern\r
1511 are treated differently. Every name acts as a wildcard and can match any\r
1512 sequence of tokens which is not empty. If the match is successful, the\r
1513 parameters with such names are created, and each is assigned a value equal\r
1514 to the text the wildcard was matched with.\r
1515 \r
1516         match a[b], 100h[3]\r
1517                 dw a+b          ; dw 100h+3\r
1518         end match\r
1519 \r
1520   A parameter name in pattern can have an extra "?" character attached to it\r
1521 to indicate that it is a case-insensitive name.\r
1522   The "=" character causes the token that follows it to be matched literally.\r
1523 It allows to perform matching of name tokens, and also of special characters\r
1524 that would otherwise have a different meaning, like "," or "=", or "?" following\r
1525 a name.\r
1526 \r
1527         match =a==a, a=8\r
1528                 db a            ; db 8\r
1529         end match\r
1530 \r
1531   If "=" is followed by name token with "?" character attached to it, this\r
1532 element is matched literally but in a case-insensitive way:\r
1533 \r
1534         match =a?==a, A=8\r
1535                 db a            ; db 8\r
1536         end match\r
1537 \r
1538   When there are many wildcards in the pattern, each consecutive one is matched\r
1539 with as few tokens as possible and the last one takes what is left. If the\r
1540 wildcards follow each other without any literally matched elements between\r
1541 them, the first one is matched with just a single token, and the second one with\r
1542 the remaining text:\r
1543 \r
1544         match car cdr, 1+2+3\r
1545                 db car          ; db 1\r
1546                 db cdr          ; db +2+3\r
1547         end match\r
1548 \r
1549 In the above sample the matched text must contain at least two tokens, because\r
1550 each wildcard needs at least one token to be not empty. In the next example\r
1551 there are additional constraints, but the same general rule applies and the\r
1552 first wildcard consumes as little as possible:\r
1553 \r
1554         match first:rest, 1+2:3+4:5+6\r
1555                 db `first       ; db '1+2'\r
1556                 db 13,10\r
1557                 db `rest        ; db '3+4:5+6'\r
1558         end match\r
1559 \r
1560   While any whitespace next to a wildcard is ignored, the presence or\r
1561 absence of whitespace between literally matched elements is meaningful.\r
1562 If such elements have no whitespace between them, their counterparts must\r
1563 contain no whitespace between them either. But if there is a whitespace\r
1564 between elements in pattern, it places no constraints on the use of\r
1565 whitespace in the corresponding text - it can be present or not.\r
1566 \r
1567         match ++,++\r
1568                 assert 1        ; positive match\r
1569         end match\r
1570 \r
1571         match ++,+ +\r
1572                 assert 0        ; negative match\r
1573         end match\r
1574 \r
1575         match + +,++\r
1576                 assert 1        ; positive match\r
1577         end match\r
1578 \r
1579         match + +,+ +\r
1580                 assert 1        ; positive match\r
1581         end match\r
1582 \r
1583 The presence of whitespace in the text becomes required when the pattern\r
1584 contains the "=" character followed by a whitespace:\r
1585 \r
1586         match += +, ++\r
1587                 assert 0        ; negative match\r
1588         end match\r
1589 \r
1590         match += +, + +\r
1591                 assert 1        ; positive match\r
1592         end match\r
1593 \r
1594   The "match" command is analogous to "if" in that it allows to use the\r
1595 "else" or "else match" to create a selection of blocks from which only one is\r
1596 executed:\r
1597 \r
1598         macro let param\r
1599                 match dest+==src, param\r
1600                         dest = dest + src\r
1601                 else match dest-==src, param\r
1602                         dest = dest - src\r
1603                 else match dest++, param\r
1604                         dest = dest + 1\r
1605                 else match dest--, param\r
1606                         dest = dest - 1\r
1607                 else match dest==src, param\r
1608                         dest = src\r
1609                 else\r
1610                         assert 0\r
1611                 end match\r
1612         end macro\r
1613 \r
1614         let x=3                 ; x = 3\r
1615         let x+=7                ; x = x + 7\r
1616         let x++                 ; x = x + 1\r
1617 \r
1618 It is even possible to mix "if" and "match" conditions in a sequence of\r
1619 "else" blocks. The entire construction must be closed with "end" command\r
1620 corresponding to whichever of the two was used last:\r
1621 \r
1622         macro record text\r
1623                 match any, text\r
1624                         recorded equ `text\r
1625                 else if RECORD_EMPTY\r
1626                         recorded equ ''\r
1627                 end if\r
1628         end macro\r
1629 \r
1630   The "match" is able to recognize symbolic variables and before the matching\r
1631 is started, their identifiers in the text of the second argument are replaced\r
1632 with corresponding values (just like they are replaced in the text that follows\r
1633 the "equ" command):\r
1634 \r
1635         var equ 2+3\r
1636 \r
1637         match a+b, var\r
1638                 db a xor b\r
1639         end match\r
1640 \r
1641 This means that the "match" can be used instead of "irpv" to convert the\r
1642 latest value of a symbolic variable to parameter. The sample from the previous\r
1643 section, where "irpv" was used with "break" to perform just one iteration on\r
1644 the last value, can be rewritten to use "match" instead:\r
1645 \r
1646         x = 1\r
1647         var equ x\r
1648         match symbol, var\r
1649                 symbol = 2\r
1650         end match\r
1651         assert x = 2\r
1652 \r
1653 The difference between them is that "irpv" would execute its block even for\r
1654 an empty value, while in the case of "match" the "else" block would need to be\r
1655 added to handle an empty text.\r
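  A sketch of such handling, assuming a symbolic variable "empty" that was
defined with no text:

        define empty
        match text, empty
                display 'some text',13,10
        else
                display 'no text',13,10
        end match

Here the wildcard cannot match, because it requires at least one token, and
so the "else" block is assembled.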
1656   When the evaluation of symbolic variables in the matched text is undesirable,\r
1657 a symbol created with "define" can be used as a proxy to preserve the text,\r
1658 because the replacement is not recursive:\r
1659 \r
1660         macro drop value\r
1661                 local temporary\r
1662                 define temporary value\r
1663                 match =A, temporary\r
1664                         db A\r
1665                         restore A\r
1666                 else\r
1667                         db value\r
1668                 end match\r
1669         end macro\r
1670 \r
1671         A equ 1\r
1672         A equ 2\r
1673 \r
1674         drop A\r
1675         drop A\r
1676 \r
1677 A concern could arise that "define" may modify the meaning of text by\r
1678 equipping it with a local context. But when the value for "define" comes from\r
1679 a parameter of macroinstruction (as in the above sample), it already carries\r
1680 its original context and "define" does not alter it.\r
1681   The "rawmatch" directive (with a synonym "rmatch") is very similar to "match",\r
1682 but it operates on the raw text of the second argument. Not only does it not\r
1683 evaluate the symbolic variables, but it also strips the text of any additional\r
1684 context it could have carried.\r
1685 \r
1686         struc has instruction\r
1687                 rawmatch text, instruction\r
1688                         namespace .\r
1689                                 text\r
1690                         end namespace\r
1691                 end rawmatch\r
1692         end struc\r
1693 \r
1694         define x\r
1695         x has a = 3\r
1696         assert x.a = 3\r
1697 \r
1698 In the above sample the identifier of "a" would be interpreted in the context\r
1699 effective for the line calling the "has" macroinstruction if it was not\r
1700 converted back into the raw text by "rawmatch".\r
1701 \r
1702 \r
1703 13. Output areas\r
1704 \r
1705 The "org" instruction starts a new area of output. The content of such\r
1706 area is written into the destination file next to the previous data, but the\r
1707 addresses in the new area are based on the value specified in the argument to\r
1708 "org". The area is closed automatically when the next one is started or when\r
1709 the source ends.\r
1710 \r
1711         org 100h\r
1712         start:                  ; start = 100h\r
1713 \r
1714   The "$" is a built-in symbol of expression class which is always equal to\r
1715 the value of current address. Therefore definition of a constant with the value\r
1716 specified by "$" symbol is equivalent to defining a label at the same point:\r
1717 \r
1718         org 100h\r
1719         start = $               ; start = 100h\r
1720 \r
1721 The "$$" symbol is always equal to the base of current addressing space, so\r
1722 in the area started with "org" it has the same value as the base address from\r
1723 the argument of "org". The difference between "$" and "$$" is thus the current\r
1724 position relative to the start of the area:\r
1725 \r
1726         org 2000h\r
1727         db 'Hello!'\r
1728         size = $ - $$           ; size = 6\r
1729 \r
1730 The "$@" symbol evaluates to the base address of current block of uninitialized\r
1731 data. When there was no such data defined just before the current position, \r
1732 this value is equal to "$", otherwise it is equal to "$" minus the length of\r
1733 said data inside the current addressing space. Note that reserved data\r
1734 no longer counts as such when it is followed by an initialized one.\r
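  These rules can be checked with a sketch like the one below (the addresses
in the comments assume that the area starts exactly as shown):

        org 100h
        db 1,2
        rb 6
        assert $ = 108h
        assert $@ = 102h        ; base of the reserved block
        db 3
        assert $@ = $           ; the reserved data is now initialized

After "db 3" no uninitialized data precedes the current position, hence "$@"
is again equal to "$".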
1735   The "section" instruction is similar to "org", but it additionally trims\r
1736 all the reserved data that precedes it analogously to how the uninitialized\r
1737 data is not written into output when it is at the end of file. The "section"\r
1738 can therefore be followed by initialized data definitions without causing\r
1739 the previously reserved data to be initialized with zeros and written into\r
1740 output. In this sample only the first of the three reserved buffers is\r
1741 actually converted into zeroed data and written into output, because it is\r
1742 followed by some initialized data. The second one is trimmed because of the\r
1743 "section", and the third one is cut off since it lies at the end of file:\r
1744 \r
1745         data1 dw 1\r
1746         buffer1 rb 10h          ; zeroed and present in the output\r
1747 \r
1748         org 400h\r
1749         data2 dw 2\r
1750         buffer2 rb 20h          ; not in the output\r
1751 \r
1752         section 1000h\r
1753         data3 dw 3\r
1754         buffer3 rb 30h          ; not in the output\r
1755 \r
1756   The "$%" is a built-in symbol equal to the offset within the output file at\r
1757 which the initialized data would be generated if it was defined at this point.\r
1758 The "$%%" symbol is the current offset within the output file. These two\r
1759 values differ only when they are used after some data has been reserved -\r
1760 the "$%" is then larger than "$%%" by the length of uninitialized data which\r
1761 would be generated into output if it was to be followed by some initialized\r
1762 one.\r
1763 \r
1764         db 'Hello!'\r
1765         rb 4\r
1766         position = $%%          ; position = 6\r
1767         next = $%               ; next = 10\r
1768 \r
1769 The values in the comments of the above sample assume that the source contains\r
1770 no other instructions generating output.\r
1771   The "virtual" creates a special output area which is not written into the main\r
1772 output file. This kind of area must reside between the "virtual" and "end virtual"\r
1773 commands, and after it is closed, the output generator comes back to the area it\r
1774 was previously operating on, with position and address the same as they were just\r
1775 before opening the "virtual" block. This allows also to nest the "virtual" blocks\r
1776 within each other.\r
1777   When "virtual" has no argument, the base address of this area is the same\r
1778 as current address in the outer area. An argument to "virtual" can have a form\r
1779 of "at" keyword followed by an expression defining the base address for the\r
1780 enclosed area:\r
1781 \r
1782         int dw 1234h\r
1783         virtual at int\r
1784                 low db ?\r
1785                 high db ?\r
1786         end virtual\r
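
When no argument is given, the addresses simply continue from the outer area
and are restored afterwards, as in this sketch:

        org 100h
        db 1
        virtual
                assert $ = 101h         ; base taken from the outer area
                rb 20h
        end virtual
        assert $ = 101h                 ; same address as before "virtual"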
1787 \r
1788   Instead of or in addition to such argument, "virtual" can also be followed by\r
1789 an "as" keyword and a string defining an extension of additional file where\r
1790 the initialized content of the area is going to be stored at the end of\r
1791 a successful assembly.\r
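  A minimal sketch of such an area could look as follows (the 'txt' extension
and the stored text are chosen only for illustration):

        virtual as 'txt'
                db 'This text is stored in an additional file.',13,10
        end virtual
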
1792   The "load" instruction defines the value of a variable by loading the string\r
1793 of bytes from the data generated in an output area. It should be followed by\r
1794 an identifier of symbol to define, then optionally the ":" character and a\r
1795 number of bytes to load, then the "from" keyword and an address of the data\r
1796 to load. This address can be specified in two modes. If it is simply a numeric\r
1797 expression, it is an address within the current area. In that case the loaded\r
1798 bytes must have already been generated, so it is only possible to load from the\r
1799 space between "$$" and "$" addresses.\r
1800 \r
1801         virtual at 100h\r
1802                 db 'abc'\r
1803                 load b:byte from 101h   ; b = 'b'\r
1804         end virtual\r
1805 \r
1806 When the number of bytes is not specified, the length of loaded string is\r
1807 determined by the size associated with address.\r
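  For example, assuming a label defined together with a "dd" directive (the
names "number" and "n" are hypothetical), the size associated with the label
makes "load" read four bytes:

        virtual at 0
                number dd 12345678h
                load n from number      ; loads 4 bytes, the size of "number"
                assert n = 12345678h
        end virtual
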
1808   Another variant of "load" needs a special kind of label, which is created\r
1809 with "::" instead of ":". Such a label has a value that cannot be used directly,\r
1810 but it can be used with the "load" instruction to access the data of the area in\r
1811 which this label has been defined. The address for "load" has then to be\r
1812 specified as the area label followed by ":" and then the address within that\r
1813 area:\r
1814 \r
1815         virtual at 0\r
1816                 hex_digits::\r
1817                 db '0123456789ABCDEF'\r
1818         end virtual\r
1819         load a:byte from hex_digits:10  ; a = 'A'\r
1820 \r
1821 This variant of "load" can access the data which is generated later, even\r
1822 within the current area:\r
1823 \r
1824         area::\r
1825         db 'abc'\r
1826         load sub:3 from area:$-2        ; sub = 'bcd'\r
1827         db 'def'\r
1828 \r
1829   The "store" instruction can modify already generated data in the output\r
1830 area. It should be followed by a value (automatically converted to string\r
1831 of bytes), then optionally the ":" character followed by a number of bytes\r
1832 to write (when this setting is not present, the length of string is determined\r
1833 by the size associated with address), then the "at" keyword and the address of\r
1834 data to replace, in one of the same two modes as allowed by "load". However,\r
1835 "store" is not allowed to modify data that has not been generated yet, and\r
1836 any area that has been touched by "store" becomes a variable area, which also\r
1837 forbids "load" from reading data from such an area in advance.\r
1838   The following example uses the combination of "load" and "store" to encrypt\r
1839 the entire contents of the current area with a simple "xor" operation:\r
1840 \r
1841         db "Text"\r
1842         key = 7Bh\r
1843         repeat $-$$\r
1844                 load a : byte from $$+%-1\r
1845                 store a xor key : byte at $$+%-1\r
1846         end repeat\r
1847 \r
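  A simpler use of "store" alone, sketched here with hypothetical label names,
is to fill in a field whose value becomes known only after some later data has
been generated:

        length_field:
                dw 0                    ; placeholder
        payload:
                db 'Some text'
        store $ - payload : word at length_field        ; the word now holds 9
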
1848   If the final data of an area that has been modified by "store" needs to be\r
1849 read earlier in the source, it can be achieved by copying this data into\r
1850 a different area that would not be constrained in such way. This is analogous\r
1851 to defining a constant with a final value of some variable:\r
1852 \r
1853         load char : byte from const:0\r
1854 \r
1855         virtual\r
1856                 var::\r
1857                 db 'abc'\r
1858                 .length = $\r
1859         end virtual\r
1860 \r
1861         store 'A' : byte at var:0\r
1862 \r
1863         virtual\r
1864                 const::\r
1865                 repeat var.length\r
1866                         load a : byte from var:%-1\r
1867                         db a\r
1868                 end repeat\r
1869         end virtual\r
1870 \r
1871   The area label can be forward-referenced by "load", but it can never be\r
1872 forward-referenced by "store", even if it refers to the current output area.\r
1873   The "virtual" instruction can have an existing area label as the only\r
1874 argument. This variant makes it possible to extend a previously defined and closed\r
1875 block with additional data. The area label must refer to a block that was\r
1876 created earlier in the source with "virtual". Any definition of data within\r
1877 an extending block is going to have the same effect as if that definition was\r
1878 present in the original "virtual" block.\r
1879 \r
1880         virtual at 0 as 'log'\r
1881                 Log::\r
1882         end virtual\r
1883 \r
1884         virtual Log\r
1885                 db 'Hello!',13,10\r
1886         end virtual\r
1887 \r
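  This variant combines well with macroinstructions. The following sketch
(the macroinstruction name "logline" is an assumption made for this example)
appends a line to the additional file each time it is used:

        virtual at 0 as 'log'
                Log::
        end virtual

        macro logline text&
                virtual Log
                        db text,13,10
                end virtual
        end macro

        logline 'first entry'
        logline 'second entry'
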
1888   If an area label is used in an expression, it forms a variable term of a\r
1889 linear polynomial. The metadata of such a term is the string "::", which makes it\r
1890 possible to determine that an area label was used to form the value, since metadata\r
1891 of terms made with "element" is always numeric.\r
1892   There is an additional variant of the "load" and "store" directives that can\r
1893 read and modify already generated data in the output file given simply\r
1894 an offset within that output. This variant is recognized when the "at" or\r
1895 "from" keyword is followed by ":" character and then the value of an offset.\r
1896 \r
1897         checksum = 0\r
1898         repeat $%\r
1899                 load a : byte from : %-1\r
1900                 checksum = checksum + a\r
1901         end repeat\r
1902 \r
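  The same form of address can be used with "store". A short sketch, assuming
that nothing has been written into the output before these lines:

        db 'abc'
        store 'A' : byte at : 0         ; the output file now starts with 'Abc'
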
1903   The "restartout" instruction abandons all the output generated up to this\r
1904 point and starts anew with an empty one. An optional argument may specify\r
1905 the base address of newly started output area. When "restartout" has no\r
1906 argument, the current address is preserved by using it as the base for the\r
1907 new area.\r
1908   The "org", "section" and "restartout" instructions cannot be used inside\r
1909 a "virtual" block, they can only separate areas that go into the output file.\r
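  A possible use of "restartout" is sketched below, assuming that these lines
are the only ones generating output; the values checked by "assert" follow from
the rules given above:

        db 'discarded'
        restartout 100h
        db 'kept'                       ; the output contains only these 4 bytes
        assert $$ = 100h & $ = 104h & $%% = 4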
1910 \r
1911 \r
1912 14. Source and output control\r
1913 \r
1914 The "include" instruction reads the source text from another file and\r
1915 processes it before proceeding further in the current source. Its argument\r
1916 should be a string defining the path to a file (the format of the path may\r
1917 depend on the operating system). If there is a "!" between the instruction\r
1918 and the argument, the other file is read and processed unconditionally,\r
1919 even when it is inside a skipped block (the unconditional instructions from\r
1920 the other file may then get recognized).\r
1921 \r
1922         include 'macro.inc'\r
1923 \r
1924 An additional argument may be optionally added (separated from the path\r
1925 by comma), and it is interpreted as a command to be executed after the file\r
1926 has been read and inserted into the source stream, just before processing\r
1927 the first line.\r
1928   The "eval" instruction takes a sequence of bytes defined by its arguments,\r
1929 treats it as a source text and assembles it. The arguments are either strings\r
1930 or the numeric values of single bytes, separated with commas. In the next\r
1931 example "eval" is used to generate definitions of symbols named with\r
1932 consecutive letters of the alphabet:\r
1933 \r
1934         repeat 26\r
1935                 eval 'A'+%-1,'=',`%\r
1936         end repeat\r
1937 \r
1938         assert B = 2\r
1939 \r
1940   The "display" instruction causes a sequence of bytes to be written into\r
1941 standard output, next to the messages generated by the assembler. It should\r
1942 be followed by strings or numeric values of single bytes, separated\r
1943 with commas. The following example uses "repeat 1" to define a parameter\r
1944 with a decimal representation of computed number, and then displays it as\r
1945 a string:\r
1946 \r
1947         macro show description,value\r
1948                 repeat 1, d:value\r
1949                         display description,`d,13,10\r
1950                 end repeat\r
1951         end macro\r
1952 \r
1953         show '2^64=',1 shl 64\r
1954 \r
1955   The "err" instruction signals an error in the assembly process, with\r
1956 a custom message specified by its argument. It allows the same kind of\r
1957 arguments as the "display" directive.\r
1958 \r
1959         if $>10000h\r
1960                 err 'segment too large'\r
1961         end if\r
1962 \r
1963   The "format" directive sets up additional options concerning\r
1964 the main output. Currently the only available choice is "format binary" followed\r
1965 by the "as" keyword and a string defining an extension for the output file.\r
1966 Unless a name of the output file is specified from the command line, it is\r
1967 constructed from the path to the main source file by dropping the extension and\r
1968 attaching a new extension if such is defined.\r
1969 \r
1970         format binary as 'com'\r
1971 \r
1972   The "format" directive, analogously to "end", uses an identifier that follows\r
1973 it to find an instruction in the child namespace of case-insensitive symbol\r
1974 named "format". The only built-in instruction that resides in that namespace\r
1975 is the "binary", but additional ones may be defined in form of macroinstructions.\r
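  As a sketch of such a customization, a hypothetical "flat" format (both the
name and its behavior are assumptions made for this example) could select
an extension and a base address at once:

        macro format?.flat? base*
                format binary as 'bin'
                org base
        end macro

        format flat 7C00h
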
1976   The built-in symbol "__time__" (with legacy synonym "%t") has the constant value\r
1977 of the timestamp marking the point in time when the assembly was started.\r
1978   The "__file__" is a built-in symbol whose value is a string containing\r
1979 the name of currently processed source file. The accompanying "__line__" symbol\r
1980 provides the number of currently processed line in that file. When these symbols\r
1981 are accessed within a macroinstruction, they keep the same value they had for the\r
1982 calling line. If there are several levels of macroinstructions calling each \r
1983 other, these symbols have the same value everywhere, corresponding to the line\r
1984 that called the outermost macroinstruction.\r
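  For instance, a macroinstruction like the one sketched below (the name "here"
is hypothetical) displays the position of the line that called it, using the
same "repeat 1" technique as the "show" macroinstruction above:

        macro here
                repeat 1, line:__line__
                        display __file__,' line ',`line,13,10
                end repeat
        end macro

        here
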
1985   The "__source__" is another built-in symbol, with value being a string containing\r
1986 the name of the main source file.\r
1987   The "retaincomments" directive switches the assembler to treat a semicolon as\r
1988 a regular token and therefore not strip comments from lines before processing.\r
1989 This makes it possible to use semicolons in places like a "match" pattern.\r
1990 \r
1991         retaincomments\r
1992         macro ? line&\r
1993                 match instruction ; comment , line\r
1994                         virtual\r
1995                                 comment\r
1996                         end virtual\r
1997                         instruction\r
1998                 else\r
1999                         line\r
2000                 end match\r
2001         end macro\r
2002 \r
2003         var dd ?  ; bvar db ?\r
2004 \r
2005   The "isolatelines" directive prevents the assembler from subsequently combining\r
2006 lines read from the source text when the line break is preceded by a backslash.\r
2007   The "removecomments" directive brings back the default behavior of semicolons\r
2008 and the "combinelines" directive allows lines from the source text to be combined\r
2009 as usual.\r
2010 \r
2011 \r
2012 15. CALM instructions\r
2013 \r
2014 The "calminstruction" directive defines new instructions in the form of\r
2015 compiled sequences of specialized commands. As opposed to regular macroinstructions,\r
2016 which operate on a straightforward principle of textual substitution, CALM \r
2017 (Compiled Assembly-Like Macro) instructions are able to perform many operations\r
2018 without passing any text through the standard preprocessing and assembly cycle. \r
2019 This allows for a finer control, better error handling and faster execution.\r
2020   All references to symbols in the text defining a CALM instruction are fixed\r
2021 at the time of definition. As a consequence, any symbols local to the CALM instruction\r
2022 are shared among all its executed instances (for example consecutive instances may see\r
2023 the values of local symbols left by the previous ones). To aid in reusing these\r
2024 references, commands in CALM generally operate on variables, routinely rewriting\r
2025 the symbols with new values.\r
2026   A "calminstruction" statement follows the same rules as a "macro" declaration,\r
2027 including options like the "!" modifier to define an unconditional instruction, "*" to mark\r
2028 a required argument, ":" to give it a default value and "&" to indicate that\r
2029 the final argument consumes all the remaining text in line.\r
2030   However, because CALM instruction operates outside of the standard preprocessing\r
2031 and assembly cycle, its arguments do not become preprocessed parameters. Instead\r
2032 they are local symbolic variables, given new values every time the instruction is called.\r
2033   If the name of defined instruction is preceded by another name enclosed in round\r
2034 brackets, the statement defines a labeled instruction and enclosed name is the\r
2035 argument that is going to receive the text of the label.\r
2036   In the definition of CALM instruction, only the statements of its specialized\r
2037 language are identified. The initial symbol of every line must be a simple name without\r
2038 modifiers and it is only recognized as valid instruction if a case-insensitive symbol with\r
2039 such name is found in the namespace of CALM commands (which, for the purpose\r
2040 of customization, is accessible as the namespace anchored at the case-insensitive\r
2041 "calminstruction" symbol). When no such named instruction is found, the initial name may\r
2042 become a label if it is followed by ":", it is then treated as a case-sensitive symbol\r
2043 belonging to a specialized class. Symbols of this class are only recognized when used\r
2044 as arguments to CALM jump commands (described further down).\r
2045   An "end calminstruction" statement needs to be used to close the definition and\r
2046 bring back normal mode of assembly. It is not a regular "end" command, \r
2047 but an identically named instruction in the CALM namespace, which only accepts\r
2048 "calminstruction" as its argument.\r
2049   The "assemble" is a command that takes a single argument, which should be\r
2050 an identifier of a symbolic variable. The text of this variable is passed directly\r
2051 to assembly, without any preprocessing (if the text came from an argument to\r
2052 the instruction, it already went through preprocessing when that line was prepared).\r
2053 \r
2054         calminstruction please? cmd&\r
2055                 assemble cmd\r
2056         end calminstruction\r
2057 \r
2058         please display 'Hi!'\r
2059 \r
2060   The "match" command is in many ways similar to the standard directive with the same\r
2061 name. Its first argument should be a pattern following the same rules as those for\r
2062 "match" directive. The second argument must be an identifier of a symbolic variable,\r
2063 whose text is going to be matched against the pattern. The name tokens in pattern\r
2064 (except for the ones made literal with "=" symbol) are treated as names of variables\r
2065 where the matched portions of text should be put if the match is successful. The same\r
2066 variable that is a source of text can also be used in pattern as a variable\r
2067 to write to. When there is no match, all variables remain unaffected.\r
2068 \r
2069         calminstruction please? cmd&\r
2070                 match (cmd), cmd\r
2071                 assemble cmd\r
2072         end calminstruction\r
2073 \r
2074         please(display 'Hi!')\r
2075 \r
2076   Whether the match was successful can also be tested with a conditional jump "jyes"\r
2077 or "jno" following the "match" command. A "jyes" jump is taken only when the match\r
2078 succeeded.\r
2079 \r
2080         calminstruction please? cmd&\r
2081                 match =do? =not? cmd, cmd\r
2082                 jyes done\r
2083                 assemble cmd\r
2084             done:\r
2085         end calminstruction\r
2086 \r
2087         please do not display 'Bye!'\r
2088 \r
2089 To further control the flow of processing, the "jump" command performs a jump\r
2090 unconditionally, and with "exit" it is possible to terminate processing of\r
2091 a CALM instruction at any moment (this command takes no arguments).\r
2092   While the symbols used for the arguments of the instruction are implicitly local,\r
2093 other identifiers may become fixed references to global symbols if they are seen\r
2094 as accessible at the time of definition (because in CALM instruction all such references\r
2095 are treated as uses, not as definitions). A command like "match" may then write to\r
2096 a global variable.\r
2097 \r
2098         define comment\r
2099 \r
2100         calminstruction please? cmd&\r
2101                 match cmd //comment, cmd\r
2102                 assemble cmd\r
2103         end calminstruction\r
2104 \r
2105         please display 'Hi!' // 3\r
2106         db comment                      ; db 3 \r
2107 \r
2108 To enforce treatment of a symbol as local, a "local" command should be used, followed\r
2109 by one or more names separated with commas.\r
2110 \r
2111         calminstruction please? cmd&\r
2112                 local comment\r
2113                 match cmd //comment, cmd\r
2114                 assemble cmd\r
2115         end calminstruction\r
2116 \r
2117 A symbol made local is initially assigned a defined but unusable value.\r
2118   If a pattern in CALM instruction has a "?" character immediately following the name\r
2119 of a wildcard, it does not affect how the symbol is identified (whether the used symbol\r
2120 is case-insensitive depends on what is present in the local scope at the time\r
2121 the instruction is defined). Instead, modifying the name of a wildcard with "?" allows it\r
2122 to be matched with an empty text.\r
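  A sketch of this rule, reusing the earlier "please" instruction (the wildcard
marked with "?" accepts a line where nothing follows the "//" characters):

        calminstruction please? cmd&
                local comment
                match cmd // comment?, cmd
                assemble cmd
        end calminstruction

        please display 'Hi!' //

Without the "?" modifier the pattern would not match this line and "cmd" would
remain unchanged, still containing the "//" characters.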
2123   Since the source text for "match" is in this variant given by just a single identifier,\r
2124 this syntax leaves room for more optional arguments. A third argument to "match" may\r
2125 contain a pair of bracket characters. Any wildcard element must then be matched with\r
2126 a text that has this kind of brackets properly balanced.\r
2127 \r
2128         calminstruction please? cmd&\r
2129                 local first, second\r
2130                 match first + second, cmd, ()\r
2131                 jyes split\r
2132                 assemble cmd\r
2133                 exit\r
2134             split:\r
2135                 assemble first\r
2136                 assemble second\r
2137         end calminstruction\r
2138 \r
2139         please display 'H',('g'+2) + display '!'\r
2140 \r
2141 The brackets selected by the third argument must not be used anywhere in the pattern.\r
2142   The "arrange" command is like an inverse of "match": it can build up a text\r
2143 containing the values of one or more variables. The first argument defines a variable\r
2144 where the constructed text is going to be stored, while the second argument is\r
2145 a pattern formed in the same way as for "match" (except that it does not need\r
2146 to precede a comma with "=" to have it included in the argument).\r
2147 All non-name tokens other than "=" and tokens preceded with "=" are copied literally\r
2148 into the constructed text and they do not carry any recognition context with them.\r
2149 The name tokens that are not made literal with "=" are treated as names of variables\r
2150 whose symbolic values are put in their place into the constructed text.\r
2151 \r
2152         calminstruction addr? arg\r
2153                 local base, index\r
2154                 match base[index], arg\r
2155                 local cmd\r
2156                 arrange cmd, =dd base + index\r
2157                 assemble cmd\r
2158         end calminstruction\r
2159 \r
2160         addr 8[5]                       ; dd 8 + 5\r
2161 \r
2162 With suitably selected patterns, "arrange" can be used to copy a symbolic value\r
2163 from one variable to another or to assign it a fixed value (even an empty one).\r
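  These uses may be sketched as follows (the instruction "twice" and its local
variables are hypothetical names introduced only for this example):

        calminstruction twice? cmd&
                local copy, newline
                arrange copy, cmd               ; copy the text of "cmd"
                arrange newline, =display 10    ; assign a fixed text
                assemble cmd
                assemble newline
                assemble copy
                assemble newline
                arrange copy,                   ; assign an empty text
        end calminstruction

        twice display 'Hi!'
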
2164   If a variable used in pattern turns out to have a numeric value instead of symbolic,\r
2165 as long as it is a non-negative number with no additional terms, it is converted\r
2166 into a decimal token stored into the constructed symbolic value (an operation\r
2167 that outside of CALM instructions would require use of a "repeat 1" trick):\r
2168 \r
2169         digit = 4 - 1\r
2170 \r
2171         calminstruction demo\r
2172                 local cmd\r
2173                 arrange cmd, =display digit#0h\r
2174                 assemble cmd\r
2175         end calminstruction\r
2176 \r
2177         demo                            ; display 3#0h\r
2178 \r
2179   The "compute" command evaluates expressions and assigns numeric results to\r
2180 variables. The first argument to "compute" defines a target where the result should\r
2181 be stored, while the second argument can be any numeric expression, which\r
2182 becomes pre-compiled at the time of definition. When the expression is evaluated\r
2183 and any of the symbols it refers to turns out to have symbolic value, this text\r
2184 is parsed as a new sub-expression, and its calculated value is then used in the\r
2185 computation of the main expression.\r
2186   A "compute" therefore can be used not only to evaluate a pre-defined expression,\r
2187 but also to parse and compute an expression from a text of a symbolic variable\r
2188 (like one coming from an argument to the instruction), or a combination of both:\r
2189 \r
2190         a = 0\r
2191 \r
2192         calminstruction low expr*\r
2193                 compute a, expr and 0FFh\r
2194         end calminstruction\r
2195 \r
2196         low 200 + 73                    ; a = 11h\r
2197 \r
2198 Because a symbolic variable is evaluated as a sub-expression, its use here has no\r
2199 side-effects that would be caused by a straightforward text substitution.\r
2200   The "check" command is analogous to "if". It evaluates a condition defined by\r
2201 the logical expression that follows it and accordingly sets up the result flag which\r
2202 may be tested with "jyes" or "jno" command. The values of symbolic variables\r
2203 are treated as numeric sub-expressions (they may not contain any operators specific\r
2204 to logical expression).\r
2205 \r
2206         calminstruction u8range? value\r
2207                 check value >= 0 & value < 256\r
2208                 jyes ok\r
2209                 local cmd\r
2210                 arrange cmd, =err 'value out of range'\r
2211                 assemble cmd\r
2212             ok:\r
2213         end calminstruction\r
2214 \r
2215         u8range -1\r
2216   \r
2217   The "publish" command assigns a value to a symbol identified by the text\r
2218 held in a variable. This makes it possible to define a symbol with a name constructed with\r
2219 a command like "arrange", or a name that was passed in an argument to an instruction.\r
2220 The first argument needs to be the symbolic variable containing the identifier\r
2221 of the symbol to define, the second argument should be the variable holding\r
2222 the value to assign (either symbolic or numeric). The first argument may be\r
2223 followed by ":" character to indicate that the symbol should be made constant, \r
2224 or it can be preceded by ":" to make the value stacked on top of the previous one\r
2225 (so that the previous one can be brought back with "restore" directive).\r
2226 \r
2227         calminstruction constdefine? var\r
2228                 local val\r
2229                 arrange val,\r
2230                 match var=  val, var\r
2231                 publish var:, val\r
2232         end calminstruction\r
2233 \r
2234         constdefine plus? +\r
2235 \r
2236 The above instruction allows to define a symbolic constant, something that is not\r
2237 possible with standard directives of the assembler.\r
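  The variant with ":" preceding the first argument may be sketched in a similar
way (the names "stackdefine" and "mode" are assumptions made for this example):

        calminstruction stackdefine? name*, value&
                publish :name, value
        end calminstruction

        define mode first
        stackdefine mode, second        ; "mode" now has the value "second"
        restore mode                    ; "mode" has the value "first" again
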
2238   The purpose of "transform" command is to replace identifiers of symbolic variables\r
2239 (or constants) with their values in a given text, which is the same operation as done\r
2240 by "equ" directive when it prepares the value to assign. The argument to "transform"\r
2241 should be a symbolic variable whose value is going to be processed this way and then\r
2242 replaced by the transformed text.\r
2243 \r
2244         calminstruction (var) constequ? val\r
2245                 transform val\r
2246                 publish var:, val\r
2247         end calminstruction\r
2248 \r
2249 A "transform" command updates the result flag to indicate whether any replacement\r
2250 has been done.\r
2251 \r
2252         calminstruction prepasm? cmd&\r
2253             loop:\r
2254                 transform cmd\r
2255                 jyes loop               ; warning: may hang on cyclic references\r
2256                 assemble cmd\r
2257         end calminstruction\r
2258 \r
2259 The result flag is modified only by some of the commands, like "check", "match" \r
2260 or "transform". Other commands keep it unchanged.\r
2261   Optionally, "transform" can have two arguments, with second one specifying\r
2262 a namespace. Identifiers in the text given by the first argument are then interpreted\r
2263 as symbols in this namespace regardless of their original context.\r
2264   The "stringify" is a command that converts text of a variable into a string\r
2265 and writes it into the same variable (specified by the only argument). This operation\r
2266 is similar to one performed by "`" operator in preprocessing.\r
2267 \r
2268         calminstruction (var) strcalc? val\r
2269                 compute val, val        ; compute expression\r
2270                 arrange val, val        ; convert result to a decimal token\r
2271                 stringify val           ; convert decimal token to string\r
2272                 publish var, val\r
2273         end calminstruction\r
2274                      \r
2275         p strcalc 1 shl 1000\r
2276         display p\r
2277 \r
2278   While most commands available to CALM instructions replace the values of variables\r
2279 when writing to them, "take" is a command that works with stacks of values.\r
2280 It removes the topmost value of the source symbol (specified by the second argument)\r
2281 and gives it to the destination symbol (the first argument), placing it on top of any\r
2282 existing values. The destination argument may be empty, in such case the value is\r
2283 removed completely and the operation is analogous to "restore" directive. This command\r
2284 updates the result flag to indicate whether there was any value to remove.\r
2285 If the destination symbol is the same as source, the result flag can be used to check\r
2286 whether there is an available value without affecting it.\r
2287 \r
2288         calminstruction reverse? cmd&\r
2289                 local tmp, stack\r
2290             collect:\r
2291                 match tmp=,cmd, cmd\r
2292                 take stack, tmp\r
2293                 jyes collect\r
2294             execute:\r
2295                 assemble cmd\r
2296                 take cmd, stack\r
2297                 jyes execute\r
2298         end calminstruction\r
2299 \r
2300         reverse display '!', display 'i', display 'H' \r
2301 \r
2302 A symbol accessed as either destination or source by a "take" command can never be\r
2303 forward-referenced even if it could otherwise.\r
2304   Defining macroinstructions in the namespace of case-insensitive "calminstruction"\r
2305 adds customized commands to the language of CALM instructions. However,\r
2306 they must be defined as case-insensitive to be recognized as such.\r
2307 \r
2308         macro calminstruction?.asmarranged? variable*, pattern&\r
2309                 arrange variable, pattern\r
2310                 assemble variable\r
2311         end macro\r
2312 \r
2313         calminstruction writeln? text&\r
2314                 asmarranged text, =display text,10\r
2315         end calminstruction\r
2316 \r
2317         writeln 'Next!'\r
2318 \r
2319 Such additional commands may even be defined as CALM instructions themselves:\r
2320 \r
2321         calminstruction calminstruction?.initsym? variable*,value&\r
2322                 publish variable, value\r
2323         end calminstruction\r
2324 \r
2325         calminstruction show? text&\r
2326                 local command\r
2327                 initsym command, display text\r
2328                 stringify text\r
2329                 assemble command\r
2330         end calminstruction\r
2331 \r
2332         show :)\r
2333 \r
2334 The command "initsym" in this example is used to assign text to the local\r
2335 symbolic variable at the time when "show" instruction is defined.\r
2336 Similarly to "local" (and unlike "stringify" and "assemble") it does not produce\r
2337 any actual code that would be executed when the "show" instruction is called.\r
2338 The arguments to "initsym" retain their original context, therefore symbols\r
2339 in the text assigned to the "command" variable are interpreted as in the local\r
2340 namespace of the "show" instruction. This allows the "display" command to access\r
2341 the "text" even though it is local to the CALM instruction and therefore normally\r
2342 visible only in the scope of the definition of "show". This is similar to the use\r
2343 of "define" to form symbolic links.\r