3 { I'm a bit ashamed but I really got into Forth quite recently, it's possible I spread some misinformation here, please let me know if I do, thanks <3 ~drummyfish }
Forth ("fourth generation" shortened to five characters due to technical limitations) is a very [elegant](beauty.md), extremely [minimalist](minimalism.md) [stack](stack.md)-based, untyped [programming language](programming_language.md) (and a general computing environment) that uses [postfix](notation.md) (reverse Polish) notation -- it is one of the very best programming languages ever conceived. Forth's vanilla form is super simple, much simpler than [C](c.md), its design is ingenious and a compiler/interpreter can be made with relatively little effort, giving it high [practical freedom](freedom_distance.md) (that is to say Forth can really be in the hands of the people). As of writing this the smallest Forth implementation, [milliforth](milliforth.md), has just **340 bytes** (!!!) of [machine code](machine_code.md), that's just incredible (the size is very close to that of a compiler for [Brainfuck](brainfuck.md), a language whose primary purpose was to have the smallest compiler possible). Forth finds use for example in [space](space.md) computers (e.g. [RTX2010](rtx2010.md), a radiation hardened space computer directly executing Forth) and [embedded](embedded.md) systems as a way to write efficient [low level](low_level.md) programs that are, unlike those written in [assembly](assembly.md), [portable](portability.md). Forth stood as the main influence for [Comun](comun.md), the [LRS](lrs.md) programming language, it is also used by [Collapse OS](collapseos.md) and [Dusk OS](duskos.md) as the main language. In minimalism Forth competes a bit with [Lisp](lisp.md), however, to Lisp fans' dismay, Forth seems to come out as superior, especially in performance, but ultimately probably even in its elegance (while Lisp may be more mathematically elegant, Forth appears to be the most elegant fit for real hardware).
7 Not wanting to invoke a fanboy mentality, the truth still has to be left known that **Forth may be one of best [programming](programming.md) systems yet conceived**, it is a pinnacle of programming genius. While in the realm of "normal" programming languages we're used to suffering tradeoffs such as sacrificing performance for flexibility, Forth dodges this seemingly inevitable mathematical curse and manages to beat virtually all such traditional languages at EVERYTHING at once: [simplicity](minimalism.md), [beauty](beauty.md), memory compactness, flexibility, performance and [portability](portability.md). It's also much more than a programming language, it is an overall system for computing, a calculator, programming language and its own debugger but may also serve for example as a [text editor](text_editor.md) and even, without exaggeration, a whole [operating system](os.md) (that is why e.g. DuskOS is written in Forth -- it is not as much written in Forth as it actually IS Forth). Understandably you may ask: if it's so great, why isn't it very much used "in the business"? Once someone summed it up as follows: Forth gives us unprecedented freedom and that allows [retards](soydev.md) to come up with bad design and unleash destruction -- [capitalism](capitalism.md) needs languages for monkeys, that's why [bad languages](rust.md) prosper. Remember: popularity has never been a measure of quality -- the best art will never be mainstream, it can only be understood and mastered by a few.
Forth is unique in its philosophy, we might almost go as far as calling Forth a programming [paradigm](paradigm.md) of its own. It can hardly be compared to traditional languages such as [C++](cpp.md) or [Java](java.md) -- while the "typical language" is always more or less the same thing from the programmer's point of view, providing a few predefined, hardwired, usually complex but universal constructs that are simply there and cannot be changed in any way (such as an [OOP](oop.md) system, template system, macro language, control structures, primitive types, ...), **Forth adopts [Unix philosophy](unix_philosophy.md)** (and dare we say probably better than Unix itself) by defining just the concept of a *word*, maybe providing a handful of simple words for the start, and then letting the programmer extend the language (that is even the compiler/interpreter itself) by creating new words out of the simpler ones, and this includes even things such as control structures (branches, loops, ...), variables and constants. For instance: in traditional languages we find a few predefined formats in which numbers may be written -- let's say C lets us use decimal numbers as `123` or hexadecimal numbers as `0x7b` -- in Forth you may change the base at any time to any value by assigning to the `base` variable, which will change how Forth parses and outputs numbers (while a number is considered any word that's not been found in the dictionary), and it is even possible to completely rewrite the number parsing procedure itself. Almost everything in Forth can be modified this way, so pure Forth without any words is not much more than a description of a [data structure](data_structure.md) and a simple parser of space-separated words, it plainly dictates a format of how words will be represented and handled on a very basic level (that's on the simplicity level of, let's say, [lambda calculus](lambda_calculus.md)) and only a *Forth system* (i.e. one with a specific dictionary of defined words, such as that defined by the ANS Forth standard) provides a basic "practically usable" language. The point is this can still be extended yet further, without any end or limitation.
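To make the number base example concrete, here is a small sketch to try in an interactive session (assuming a standard system such as gforth; the letter case of printed hexadecimal digits may differ between systems):

```forth
decimal            \ make sure we start in base 10
123 hex .          \ parsed as decimal 123, printed in base 16: 7B
decimal
2 base !           \ any base can be set by storing to the base variable
1111011 decimal .  \ parsed as binary, printed in decimal: 123
```

Note that the base is plain global state: whatever was stored last affects both parsing and printing until it's changed again.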
{ Since Forth adopts a kind of unique philosophy, there is some discussion about how low level Forth really is, if it really is a language or something like a "metalanguage", or an "environment" to create your own language by defining your own words. Now this is not a place to go very deep on this but kind of a sum up may be this: Forth in its base version is very low level, however it's very extensible and many Forth systems extend the base language to some kind of much higher level language, hence the debates. ~drummyfish }
Being somewhat of a misfit in terms of classification, the language is probably more often presented as [interpreted](interpreter.md), but that's a tiny bit misleading: it may perfectly well be [compiled](compiler.md) to pure machine code too (it's actually very easy and natural to turn Forth source code into assembly), and even interpreting Forth happens on such a low level that it's almost native code execution -- any newly defined word is immediately compiled into a list of addresses of other words (i.e. in C terms function pointers) and the most basic words are typically written directly in [machine code](machine_code.md), so when executing a word the interpreter doesn't perform any search for word names or anything like that (like a typical scripting language would), it just jumps between memory addresses, pushes numbers on stack and sometimes runs a native piece of code. For this Forth may be seen as a kind of "wrapper for assembly" as well, one that helps it be [portable](portability.md) (to port a program one will just have to replace the machine code of the basic words).
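The idea that a word is, at run time, just an address can be poked at directly from the language. A minimal sketch, assuming the standard words `'` and `execute` (`square` is a made-up word for illustration, and the comment about its compiled form describes a classic threaded implementation, not every system):

```forth
: square ( n -- n*n ) dup * ;   \ classic threaded system: a list of the addresses of dup and *
' square                        \ push the address (execution token) of square
5 swap execute .                \ run the word through its address: prints 25
```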
15 Forth systems traditionally include not just a compiler/interpreter but also an **interactive environment** in which one is defining and compiling new words on the go (by this it's similar to [Lisps](lisp.md) that are usually interactive too). Again -- this is not just some kind of extra killer feature, an interactive environment naturally comes as a byproduct of Forth's design, it costs nothing to have such environment. This environment can serve for example as a debugger or even an operating system.
17 There are several Forth standards, most notably ANS Forth from 1994 (the document is [proprietary](proprietary.md), sharing is allowed, 640 kB as txt). Besides others it also allows Forth to include optional [floating point](float.md) support, however Forth programmers highly prefer [fixed point](fixed_point.md) (as stated in the book *Starting Forth*). Then there is a newer Forth 2012 standard, but it's probably better to stick with the older one.
19 A [free](free_software.md) Forth implementation is e.g. GNU Forth ([gforth](gforth.md)) or [pforth](pforth.md) (a possibly better option by LRS standards, favors [portability](portability.md) over performance).
21 There is a book called **Starting Forth** that's freely downloadable and quite good at teaching the language.
23 { There used to be a nice Forth wiki at wiki.forthfreak.net, now it has to be accessed via archive as it's dead. Also some nice site here: https://www.taygeta.com/forth/dpans.html. ~drummyfish }
25 Forth was invented by [Charles Moore](charles_moore.md) (NOT the one of the [Moore's Law](moores_law.md) though) in 1968, for programming radio telescopes.
29 Forth is usually case-insensitive.
The language operates on an evaluation **[stack](stack.md)** with postfix notation: for example the operation `+` takes the two values at the top of the stack, adds them together and pushes the result back on the stack (i.e. for example `1 2 +` in Forth is the same as `1 + 2` in C). Besides this there are also some "advanced" features like variables living outside the stack, if you want to use them.
In fact there are two global stacks in Forth: the **parameter stack** (also data stack) and **return stack**. Parameter stack is the "normal" stack on which we do most computations and on which we pass parameters and return values. Return stack is the stack on which return addresses from functions are stored (remember that this is needed e.g. for [recursion](recursion.md)), BUT it is also used as a temporary stack so that we can let's say put aside a few values to dive deeper on the main stack, however this has to be done carefully -- before the end of a word ("function") is reached, the return stack must of course be restored to its original state.
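A small sketch of the "put a value aside" trick, assuming the standard words `>r`, `r>`, `over` and `swap` (the word name `third-to-top` is made up for this example):

```forth
: third-to-top ( a b c -- a b c a )
  >r        \ move c to the return stack, data stack: a b
  over      \ copy a over b, data stack: a b a
  r>        \ bring c back, return stack is restored, data stack: a b a c
  swap ;    \ data stack: a b c a

1 2 3 third-to-top . . . .   \ prints 1 3 2 1 (top of the stack first)
```

The `>r` and `r>` are balanced inside the definition, which is exactly the discipline described above.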
The stack is composed of **cells**: the size of the cell is implementation defined but must have at least 16 bits. The values stored in cells are just binary, they don't have any data type, so whether a value in given cell is considered signed or unsigned is up to the programmer -- some operators treat numbers as signed and some as unsigned (just like in [comun](comun.md) and [assembly](assembly.md) languages); note that with many operators the distinction doesn't matter (e.g. addition doesn't care if the numbers are signed or not, but comparison does). Forth programmers also often work with double numbers, i.e. numbers that take two cells (and so have double the width of the normal number) -- the words that do arithmetic with these are prefixed with *d* (e.g. *d+*), while words prefixed with *2* (e.g. *2dup*, *2swap*) just move pairs of cells around.
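Since cells are untyped, the same bit pattern prints and compares differently depending on which word looks at it. A small sketch (the exact unsigned value depends on the cell size, so it's left out here; the double-number words `d+` and `d.` are optional in the standard, this assumes a system that has them, such as gforth):

```forth
-1 .         \ signed view: prints -1
-1 u.        \ unsigned view: prints the largest unsigned cell value
-1 1 < .     \ signed comparison, -1 < 1 is true: prints -1
-1 1 u< .    \ unsigned comparison, all ones is not below 1: prints 0
1. 2. d+ d.  \ a trailing dot makes a double-cell literal: prints 3
```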
The basic [abstraction](abstraction.md) in Forth is the so called **word**: a word is simply a string without spaces like `abc` or `1mm#3`. A word represents some action, which may include running native code, pushing numbers on stack or executing other words, for example the word `+` performs addition on top of the stack, `dup` duplicates the top of the stack etc. The programmer can define his own words -- so words are basically kind of "[functions](function.md)" or rather procedures or routines (however words don't return anything or take any arguments in traditional way, they all just invoke some operations -- arguments and return values are passed using the stack). Defining new words expands the current **dictionary**, so Forth basically extends itself as it's running. Part of Forth philosophy is to try to define many small words rather than writing big walls of code. A word is defined like this:
```
: myword operation1 operation2 ... ;
```
For example a word that computes an average of the two values on top of the stack can be defined as:
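A minimal sketch of such a definition, assuming integer division with the standard `/` word (so the result is rounded, and the sum may overflow for very large inputs):

```forth
: average ( a b -- avg ) + 2 / ;   \ add the two values, then halve the sum

10 20 average .   \ prints 15
```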
49 Note that even the `:` and `;` characters that serve to define new words are words themselves.
The dictionary constitutes one of the most important concepts in Forth, it usually stores the words as a [linked list](list.md) that's searched starting with the newest word -- this allows for example temporary shadowing of previously defined words with the same name.
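Shadowing can be tried out in a few lines. This is only a sketch with made-up word names (`two`, `twice-two`); some systems, gforth among them, print a warning when a name is redefined:

```forth
: two 2 ;
: twice-two two two * ;   \ compiled against the first definition of two
: two 3 ;                 \ newer entry with the same name shadows the old one
two .                     \ lookup finds the newest entry first: prints 3
twice-two .               \ already compiled words keep the old one: prints 4
```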
Forth programmers utilize what's called a **stack notation** to document the "prototype" of a function, i.e. what it does with the stack (this is important since the language doesn't have the traditional system of named, counted and checked function parameters) -- they write this notation in a comment above a defined word to communicate to others what the word will do. Stack notation has the format `( before -- after )`, for example the effect of the above mentioned `average` word would be written as `( a b -- avg )` in this notation.
55 Some predefined words usually present in Forth systems include:
```
+           add                       ( a b -- [a+b] )
-           subtract                  ( a b -- [a-b] )
*           multiply                  ( a b -- [a*b] )
/           divide                    ( a b -- [a/b] )
=           equals                    ( a b -- [-1 if a = b else 0] )
<>          not equals                ( a b -- [-1 if a != b else 0] )
<           less than (signed)        ( a b -- [-1 if a < b else 0] )
>           greater than (signed)     ( a b -- [-1 if a > b else 0] )
u<          less than (unsigned)      ( a b -- [-1 if a u< b else 0] )
u>          greater than (unsigned)   ( a b -- [-1 if a u> b else 0] )
0=          equals zero               ( a -- [-1 if a = 0 else 0] )
and         bitwise and               ( a b -- [a&b] )
or          bitwise or                ( a b -- [a|b] )
mod         modulo                    ( a b -- [a % b] )
dup         duplicate                 ( a -- a a )
drop        pop stack top             ( a -- )
swap        swap items                ( a b -- b a )
rot         rotate 3                  ( a b c -- b c a )
pick        push Nth item             ( xN ... x0 N -- xN ... x0 xN )
.           pop & print number as signed
u.          pop & print number as unsigned

emit        pop & print top as char

cells       times cell width          ( a -- [a * cell width in bytes] )
depth       gets stack size           ( ... -- ... [previous stack size] )
quit        stop execution, return to the interpreter without printing "ok"

>r          pops value, pushes it to return stack
r>          pops value from return stack, pushes it
r@          pushes value from return stack (doesn't pop it)
i           pushes innermost loop index (traditionally the return stack top, without pop)
i'          pushes second value from return stack (without pop, non-standard)
j           pushes index of the next outer loop (traditionally third value on return stack)

variable X              creates var named X (X will be a word that pushes its addr.), allocates 1 cell
create X                creates word X that pushes the address of its data (without allocating memory)
N X !                   stores value N to variable X
N X +!                  adds value N to variable X
X @                     pushes value of variable X to stack
N constant C            creates constant C with value N (C will be a new word)
C                       pushes the value of constant C

\                       comment (until newline)
." S"                   print string S (compiles in the string)
s" S"                   create string S (don't print, pushes pointer and length)
type                    print string (expects pointer and length)
X if C then             if X, execute C (only in word def., X is popped)
X if C1 else C2 then    if X, execute C1 else C2 (only in word def.)
do C loop               loops from stack top value up to (not including) the second
                        from top, special word "i" will hold the iteration val.
begin C until           repeats C, keeps looping as long as top = 0 (top is popped each time)
begin C while repeat    like begin/until but loops as long as top != 0
begin C again           infinite loop
begin C1 while C2 repeat  loop with middle condition
leave                   loop break (only for counted loops)
N allot                 allocates N bytes of memory (moves end-of-mem ptr), e.g. for arrays
here                    returns current end-of-mem address ("H" pointer)
exit                    exits from current word
recurse                 recursively call the word currently being defined
see W                   shows (decompiles) the definition of word W
' W                     get address (execution token) of word W
MARKER W                creates word W, executing W will delete W and all later words
```
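Forth **strings** carry their length explicitly (unlike [C](c.md) which uses zero terminated strings), i.e. a string is typically passed around as an address pointing to the string start plus a number saying the length of the string (older code also uses so called counted strings, which store the length in the first byte at that address).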
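A small sketch of working with the address and length pair, assuming the standard words `s"`, `type` and `nip` (the word names `greet` and `greet-length` are made up for this example):

```forth
: greet ( -- ) s" hello" type ;           \ s" pushes address and length, type prints them
: greet-length ( -- n ) s" hello" nip ;   \ drop the address, keep only the length

greet cr          \ prints hello
greet-length .    \ prints 5
```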
136 TODO: local variables, addresses, arrays, compile-time behavior of words, strings, double words, format of the word in memory
140 These are some tiny example programs:
143 100 1 2 + 7 * / . \ computes and prints 100 / ((1 + 2) * 7)
147 cr ." hey bitch" cr \ prints: hey bitch
151 : myloop 5 0 do i . loop ; myloop \ prints 0 1 2 3 4
154 And here is our standardized **[divisor tree](divisor_tree.md)** program written in Forth:
157 \ takes x, pops it and recursively prints its divisor tree
160 0 swap 1 swap \ stack now: 0 1 x
162 >r 0 1 r> \ stack now: a b x
164 dup 2 / 1 + 2 do \ find the closest divisors (a, b)
165 dup i mod 0 = if \ i divides x?
166 2 pick 2 pick < if \ a < b?
169 >r \ use return stack for tmp storage
181 2 pick 0 <> if \ divisors found?
195 dup dup 48 >= swap 57 <= and if
203 begin \ main loop, read numbers from user
210 dup 13 <> while \ newline?
238 Source code files usually have `.fs` extension. We can use mentioned gforth to run our files. Let's create file `my.fs`; in it we write: { Hope the code is OK, I never actually programmed in Forth before. ~drummyfish }
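As a stand-in for the file content, here is a minimal sketch consistent with the output mentioned below, under the assumption that the program computes the factorial of 5 (the word name `factorial` and the final `bye`, which makes gforth exit instead of staying in interactive mode, are likewise assumptions):

```forth
\ my.fs -- hypothetical content: prints 5! = 120

: factorial ( n -- n! )
  dup 1 > if
    dup 1- recurse *    \ n * (n - 1)!
  else
    drop 1              \ 0! = 1! = 1
  then ;

5 factorial . cr
bye
```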
We can run this simply with `gforth my.fs`, the program should print `120`.
256 ## A Bit More Details
260 The first, immediate glance of elegance of Forth lies in the stack paradigm -- we don't need any brackets in expressions, no operator precedence, there is no distinction between operators and procedures and we don't need a complex expression parser. It's not hard to see the beauty of it, but Forth is not the only stack-based language.
The true, deeper genius of Forth is in the "everything is a word" abstraction and how it allows a very elegant implementation, but this is more difficult to see, it resides under the hood -- to appreciate Forth one has to study the internal workings and see how it all ultimately ties together. So let's start here with some very basic overview of the internals.
264 There are several regions of memory, most importantly the parameter stack (the main kind of stack), the return stack and dictionary memory. Dictionary obviously stores the words. **Format of the word in memory** may differ between implementations, but typically a word record has the following fields:
266 - *flags*: Flags specifying the type of word (some words may be "special", e.g. those that have compile time behavior). Valid words have the highest bit also set to 1; 0 here means end of the dictionary (terminating the linked list).
267 - *name length*: Length of the word's name, e.g. 6 for "myword". Some systems limit the name length, there may be a fixed size for the name (even as few as 3) and this field may be omitted. This field may also be merged into a single byte with the flags etc.
268 - *name*: Characters of the word name. Note that this serves for looking up words during compilation but is NOT needed for executing the code.
269 - *link* (LFA): Link to previous word in dictionary (this creates the linked list of words).
270 - *code pointer* (CFA): Pointer to the native (machine) code that's executed by this word. For example words that represent constants have a pointer to the (same) piece of machine code that pushes the constant's value -- this code is the same for all constants but, of course, the values of the constants are different -- that's what PFA is for; before executing the code, address of the PFA is pushed on stack so that the code can access the word's specific parameters. Notable case here is the colon definition (words defined with the `: ... ;` syntax) -- here the code traverses through PFA, which stores addresses of the words in the definition, and just executes each address (also pushing the return addresses on stack etc.).
- *parameter field* (PFA): This is a variable-length piece of memory that holds the data, the parameters for the code of this specific word -- e.g. the value of the constant for words that represent a constant, the value of a variable for words representing variables etc. Arrays and strings also store their data here, the field is just longer. Colon definitions have the addresses of the words they contain here (notice that once the addresses are compiled here, we no longer need the word names).
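The split between shared code (CFA) and per-word data (PFA) can be seen from within the language when defining a new kind of word. A minimal sketch, assuming the standard words `create`, `,`, `does>` and `>body`; `my-constant` and `answer` are made-up names and real systems may lay out their built-in constants differently:

```forth
: my-constant ( n "name" -- )
  create ,      \ new dictionary entry, n is stored into its parameter field
  does> @ ;     \ shared run-time code: gets the parameter field address, fetches the value

42 my-constant answer
answer .             \ prints 42
' answer >body @ .   \ peek into the parameter field directly: prints 42
```

Every word made with `my-constant` shares the same run-time code and differs only in what sits in its parameter field.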
Then there is a special pointer called *H* which points to the end of dictionary memory, i.e. at the end of the latest added word; adding a new word will happen here. This pointer is important e.g. for allocation: the word *ALLOT* (that allocates more memory for a previously created word) just advances the *H* pointer, making more room in the PFA. Quite clever, isn't it?
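The effect on *H* can be observed with the word `here`. A small sketch for an interactive session, assuming the standard words `create`, `here`, `allot` and `cells` (`buf` is a made-up name):

```forth
create buf                 \ new word buf, its parameter field is empty so far
here                       \ remember the current value of H
10 cells allot             \ advance H by 10 cells: buf is now an array of 10 cells
here swap - 1 cells / .    \ H moved by this many cells: prints 10
7 buf 3 cells + !          \ store 7 into the fourth cell of buf
buf 3 cells + @ .          \ prints 7
```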
275 Forth system looks up words simply by traversing the linked list, i.e. out of words that share the same name the one created later will be found. If the system is given a word and it doesn't find it in the dictionary, it considers it a number; then it tries to parse the word as a number (using a special number parsing word which, of course, may also be redefined). This is another beautiful thing -- there is no hardwired format of a number, a number is simply anything that's not a word in the dictionary, and if for some reason we want to see say *123* as a special word rather than a number, we CAN.
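The "dictionary first, number second" rule is easy to try. A minimal sketch for an interactive session (some systems may print a warning when a word with a number-like name is defined):

```forth
123 .           \ not in the dictionary, so parsed as a number: prints 123
: 123 456 ;     \ define a word named 123 (456 is still parsed as a number)
123 .           \ dictionary lookup wins now: prints 456
```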
277 TODO: compile time behavior, control structures, ...
282 - [Scheme](scheme.md)