Advent of Languages 2024, Day 3: Forth
Saturday, the seventh of December, A.D. 2024
M y original plan was to stick with the “systems language” theme for Day 3 and go with Zig, but the more I thought about it the more I started to think, you know, Zig is nice and clean and modern. It hasn’t had time to get all warty and ugly with bolted-on afterthoughts and contentious features that divide the community into warring tribes, and it has things like common datastructures in its standard library. I should probably save it for one of the later days, when I anticipate spending more time fighting the problem and less time fighting the language. Also I looked at Day 3 and it (the first part at least) looked very simple, which makes me even less inclined to use a big honkin’ heavy-duty language like Zig. Instead, today I’m going to take a look at Forth!
May the Forth be with you
Forth is an old language, older even than C (by a few years at least), so you know right away it’s going to be lacking a lot of modern conveniences like local variables or even, you know, structs. With named fields and all? Yep, not here.
It’s also aggresively simple, both syntactically and conceptually. Syntactially, you could fit the grammar on the back of your hand. It’s so slim that even comparison operators like <
and =
(that’s a comparison operator in Forth, like in SQL) are implemented as part of the standard library. Conceptually it’s (if anything) even simpler.
Forth is—well, I think it’s a bit of a stretch to describe it as “common” in any circumstance, but perhaps we can say “best-represented” in embedded systems contexts, where resources are often very heavily constrained and you’re typically operating very close to the hardware. In fact it thrives in an environment where there’s just a single contiguous block of not-very-much memory, because of the whole stack paradigm.
So of course, I’m going to use it for Advent of Code, where I have 32GB of RAM and a full-fat multi-process OS with extremely sophisticated thread scheduling, memory management, and so on. I wonder about myself sometimes.
Where get?
Like a lot of older languages, there isn’t a single official implementation of Forth which is your one-stop shop for compilers, linters, formatters, editor plugins, language servers, and all the various and sundry paraphernelia that have become de rigeur for a new language these days.
The front page of the Gforth website has instructions for adding the Debian repository, but unfortunately that repository doesn’t seem to be online anymore, so let’s try building it from source, for which there are also instructions on the front page. The Git repository at least is hosted at a subdomain of gnu.org
, so it might still be online?
a brief interlude
Okay, I’m back, and what do you know? It seems to have worked. There was a warning about Swig without -forth feature, library interfaces will not be generated.
, but hopefully that’s not important.
Okay, so we’re good to go, right? At least, I can run gforth
now and get a REPL, so I think we’re good.
Wait a second, I forgot to run make install
. But it’s on my PATH already! What?
$ ls -l /usr/bin/gforth
>>> lrwxrwxrwx 1 root root 12 Sep 10 2021 /usr/bin/gforth -> gforth-0.7.3
…It was already installed?!
Well, paint me purple and call me a grape if that isn’t the most embarrassing thing I’ve done all day.
The real treasure was the friends we made along the way
I hope you enjoyed this brief exercise in futility. Let’s just move on and pretend it never happened, hmmm?
Let’s start Forth
highlight.js
has no syntax for Forth, so you’ll just have to imagine the colors this time. ." hello world "
$ gforth 03.fs
>>> hello world Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
Uh. Okay, it looks like Forth doesn’t exit unless you explicitly end your program with the bye
command, it just does whatever you specified and then dumps you into a REPL so you can keep going, if you want. Hey, I wonder if giving it a non-interactive stdin would make a difference?
$ gforth </dev/null 03.fs
>>> hello world Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
Nope. I mean, it doesn’t sit there waiting for input, of course. But it does print the start message. No, the only way to keep it from doing that is to put bye
at the end of the script. Fascinating.
You may be struck by the spaces inside the quote marks here: this is necessary. Without them, for example, ."hello
would be interpreted as a single token, and since there is no such “word” (the Forth term for functions, basically) defined it would error.
Day 3
Okay, so today we have a very minimal parsing challenge. We’re given a long string of characters (just a single line this time) that contains a bunch of garbage and also some valid instructions, we have to parse out the instructions and execute them. There’s only one instruction, mul(x,y)
where x and y are numbers. Everything else is to be discarded, for Part 1 at least. I have a feeling that for Part 2 we may find out some of that garbage wasn’t quite so garbacious after all.
This would be absolutely trivial with even the most basic of regular-expression libraries, but we’re in Forth land here, so—wait. Maybe I should check some things before I make confident assertions.
…Yep, Gforth does in fact include regular expressions. Using them looks pretty wild, though. I think I need to go read some more tutorials.
…
I’m back! It’s been two days. Forth is confusing!
Data loadin’
I’ve figured out how to at least open the file and read the data into memory, though, which is Great Success in my book. Here, let me show you:
create data 20000 allot
variable fileid
s" data/03.txt" r/o open-file
drop
fileid !
variable data-len
data 20000 fileid @ read-file
drop
data-len !
After executing this with gforth
, I am dumped into an interactive Forth REPL, as before, but now I have a 20K buffer at the address specified by data
containing the contents of my puzzle input, which is great!
Forth is weird to anybody coming from the more standard type of language that has, you know, syntactic function calls and operators and all of that fancy stuff. None of that here! Remember how earlier I said that comparison operators were part of the standard library? Well yeah, it turns out that Forth doesn’t even have operators the same way other languages I’ve used do.
add(a, b)
every time you want to add some numbers.>
and =
, but really those are just conveniently-named “words” (again, the Forth name for a function) which operate on values from the stack just like other words with longer names. Crazy, huh? Anyway, I want to go through this line-by-line because it’s so far afield. Like I said, it took me 2 days to get this far. Note that because of the whole stack-oriented thing, Forth is syntactically backwards for most operations - you put the arguments first, thus putting them on the stack, then call the operation, which takes them off the stack and leaves its result (if any) on the stack. So to add 6 and 7 you would do 6 7 +
, which results in a stack containing 13. Wonky, I know.
create data 20000 allot
—Ok, I’m not actually 100% sure what this part is doing, I stole it from the gforth tutorial on files, which is… minimal, shall we say? I think what’s happening is that we’re allocating memory, but we’re “stack-allocating” (In the C sense, not the Forth sense).create data
is actually the rare Forth construct that isn’t backwards, I think because it’s a compile-time operation rather than a runtime operation: we arecreate
-ing a word,data
, whose operation is simply to retun its own address. Then we areallot
-ing 20K bytes, starting immediately after that address, I think? Anyway, the net result is that we end up with 20K of writable memory at the address returned bydata
, which is good enough for me.variable fileid
—Another non-backwards construct here,variable
. Again, I think this is because it creates a new word, which makes it a compile-time operation. I’m pretty surevariable
is actually just a layer on top ofcreate ... allot
, because it does basically the same thing, it just only reserves enough space for a single integer. (Forth’s native integer type is a signed 32-bit integer, by the way.)s" data/03.txt" r/o open-file
—Now we’re cooking with gas!s" ..."
creates a string somewhere in memory (I think it actually allocates, under the hood) and puts the address and length of that string on the stack.r/o
is just a convenience word that puts0
on the stack; it represents the mode in which we’re opening the file.open-file
opens the file, using three values from the stack: 1) the address of the string containing the name of the file, 2) the length of that string, and 3) the mode. It returns two numbers to the stack, 1) the “file id” and 2) a status code.drop
—I’m going to ignore the status code, so Idrop
it from the stack.fileid !
—At this point we have the file ID on the stack, but we want to save it for future reference, so we store it at the address returned byfileid
. The!
word is just a dumb “store” command, it takes a value and an address on the stack and stores the value at that address. The reciprocal operation is@
, which we’ll see in a moment.variable data-len
—Create a variable nameddata-len
(again, just a word nameddata-len
that returns an address where you can write data)data 20000 fileid @ read-file
—This is the goods! We are reading (up to) 20K bytes from the file whose ID is stored in the variablefileid
into the region of memory starting at addressdata
. Note thefileid @
bit, I got hung up here for the longest time because I kept trying to just usefileid
by itself. But that meant that I was givingread-file
the address of the variable itself, not the value contained at that varible. In C terms,@
is like dereferencing a pointer. (In fact, under the hood that’s probably exactly what’s going on, I think gforth is implemented largely in C.)drop
—That last operation returned the number of bytes read, which we want, and a status code, which we don’t. So we drop the status code.data-len !
—Store the number of bytes read in the variabledata-len
.
At this point we have the contents of our file in memory, hooray! Boy, Forth is low-level. In some ways it’s even lower-level than C, which is mind-boggling to me.
Wait, regular expressions solved a problem?
Ok, so now we have our input, we need to process it.
My first attempt at this was an extremely laborious manual scanning of the string, attempting to parse out the digits from valid mul(X,Y)
sequences, multiply them, keep a running total, etc. I got lost in all the complexity pretty quickly, so eventually I decided to just knuckle down and figure out the gforth regular expression library. And you know what? It wasn’t too bad after all! Here’s the regex I ended up coming up with:
require regexp.fs
: mul-instr ( addr u -- flag )
(( =" mul(" \( {++ \d ++} \) ` , \( {++ \d ++} \) ` ) )) ;
: name ... ;
is the syntax for defining a new word, by the way. In this case, the body of the word is the regex. I’m pretty sure this is required, because it’s doing its magic at compile time, again—at least, if I try using a “bare” regex in the REPL, outside of a word, I get nothing. This seems to be the deal for most Forth constructs that have both a start and an end delimiter, like if/then, do/loop, and so on. The exceptions are of course the define-a-word syntax itself, and also string literals for some reason. Not sure why that is, but I’m guessing they’re a special case because inline string literals are just so useful.
Anyway, to elaborate a bit on the regex:
((
and))
mark the start and end of the regex.=" mul("
means “match the literal stringmul(
.\(
and\)
delimit a capturing group.{++
and++}
delimit a sequence repeated one-or-more times (like+
in most regex syntaxes).- A backtick followed by a single character means to match a single occurence of that character.
So this looks for the literal string mul(
, followed by 1 or more digits, followed by the character ,
followed by 1 or more digits, followed by the character )
. Not too bad, once you get your head around it.
Oh, and referencing captured groups is just \1
, \2
, \3
etc. It seems like you can do this at any time after the regex has matched? I guess there’s global state going on inside the regex library somewhere.
Anyway, here’s my full solution to Day 3 Part 1!
create data 20000 allot \ create a 20K buffer
variable fileid \ create a variable `fileid`
s" data/03.txt" r/o open-file \ open data file
drop \ drop the top value from the stack, it's just a status code and we aren't going to bother handling errors
fileid ! \ save the file id to the variable
variable data-len
data 20000 fileid @ read-file \ read up to 20k bytes from the file
drop \ drop the status code, again
data-len ! \ store the number of bytes read in `data-len`
: data-str ( -- addr u )
\ convenience function for putting the address and length of data on the stack
data data-len @ ;
: chop-prefix ( addr u u2 -- addr2 u2 )
\ chop the first `u2` bytes off the beginning of the string at `addr u`
tuck \ duplicate `u2` and store it "under" the length of the string
- \ subtract `u2` from the length of the string
-rot \ stick the new string length underneath the start pointer
+ \ increment the start pointer by `u2`
swap \ put them back in the right order
;
require regexp.fs
: mul-instr ( addr u -- flag )
\\ match a string of the form `mul(x,y)` where x and y are integers and capture those integers
(( =" mul(" \( {++ \d ++} \) ` , \( {++ \d ++} \) ` ) )) ;
: get-product ( addr u -- u2 )
mul-instr \ match the string from `addr u` against the above regex
if \ if the regex matches, then:
\1 s>number drop \ convert the first capture from string to number, drop the status code (we already know it will succeed)
\2 s>number drop \ convert the second capture from string to number, drop the status code
* \ multiply, and leave the answer on the stack
else
0 \ otherwise, leave 0 on the stack
then
;
variable result \ initialize `result` with 0
0 result !
: sum-mul-instrs ( addr u -- u2 )
begin \ start looping
s" mul(" search \ search for the string "mul("
if \ if successful, top 2 values on stack will be start address of "mul(" and remainder of original string
2dup \ duplicate address and remaining length of string
get-product \ pass those to get-product above
result @ + \ load `result` and add to product
result ! \ store this new value back in `result`
4 chop-prefix \ bump the start of the string by 4 characters
else \ if not successful, we have finished scanning through the string
2drop \ dump the string address and length
result @ exit \ put the result on top of the stack and return to caller
then
again
;
data-str sum-mul-instrs .
bye
Not exactly terse, but honestly it could have been a lot worse. And my extremely heavy use of comments makes it look bigger than it really is.
The ( addr u -- u2 )
bits are comments, by the way. By convention when you define a word that either expects things on the stack or leaves things on the stack, you put in a comment with the stack of the stack in ( before -- after )
format.
Part 2
Onward and upward! In Part 2 we discover that yes, in fact, not quite all of the “garbage” instructions were truly garbage. Specifically, there are two instructions, do()
and don't()
which enable and disable the mul(x,y)
instruction. So once you hit a don't()
, you ignore all mul(x,y)
instructions, no matter how well-formed, until you hit a do()
again.
Easy enough, but I’m going to have to change some things. Right now I’m using the search
word to find the start index of every possible mul(
candidate, then using the regex library to both parse and validate at the same time. Obviously I can’t do that any more, since now I have to search for any of three possible constructs rather than just one.
I spent quite a while trying to figure out how to get the regex library to spit out the address of the match it finds, but to no avail. There are some interesting hints about a “loop through all matches and execute arbitrary code on each match” functionality that I could probably have shoehorned into what I needed here, but in the end I decided to just scan through the string the old-fashioned way and test for plain string equality at each position. In the end it came out looking like this:
...
variable enabled
-1 enabled ! \ idiomatically -1 is "true" (really anything other than 0 is true)
: handle-mul ( addr u -- )
get-product \ pass those to get-product above
result @ + \ load `result` and add to product
result ! \ store this new value back in `result`
;
: sum-mul-instrs ( addr u -- u2 )
\ we want to loop from addr to (addr + u - 8), because 8 is the min length of a valid mul(x,y) instruction
\ we also want to have addr + u on the top of the stack when we enter the loop,
\ so that we can use that to compute the remaining length of the string from our current address
over + \ copy addr to top of stack and add to length
dup 8 - \ duplicate, then subtract 8 from the top value
rot \ move original addr to top of stack
( stack at this point: [ addr + u, addr + u - 8, addr ] )
( i.e. [ end-of-string, loop-limit, loop-start ] )
do \ start looping
I 4 s" do()" str= \ compare the length-4 substring starting at I to the string "do()"
if \ if valid do() instruction,
-1 enabled ! \ set enabled=true
then
I 7 s" don't()" str= \ compare length-7 substring to "don't()"
if \ if valid don't() instruction,
0 enabled ! \ set enabled=false
then
I 4 s" mul(" str= \ compare length-4 substring to "mul("
enabled @ and \ combine with current value of `enabled`
if \ if a candidate for `mul(x,y)` instruction, and enabled=true, then
dup I - \ subtract current string pointer from end-of-string pointer to get length of remaining string
I swap handle-mul \ put current pointer onto stack again, swap so stack is ( addr len), and handle
then
loop
drop \ get rid of end-of-string pointer
result @ \ return value of result
;
s" data/03.txt" load-data
sum-mul-instrs .
bye
Oh yeah, I also decided to extract load-data
into a word and pass in the filename, to make it easier to switch between test and live data. The whole thing is here if you’re interested.
I’m actually surprised by how satisfied I am with this in the end. It’s not exactly what I would call pretty, but it’s reasonably comprehensible, and I feel like I’m starting to get the hang of this stack-manipulation business. It could definitely be more efficient - I’m still looping over every index in the string, even when I know I could skip some because, say, they already validated as being the start of a known instruction. Also, I should really avoid testing for subsequent instructions every time one of the prior ones validates. I just couldn’t bring myself to do nested if statements because they look like this, which is horrible.
Nobody asked for my opinion, but here it is anyway
So what do I think of Forth? It’s certainly interesting! It’s a very different way of approaching programming than I’ve encountered before. But I don’t know that I’d want to use it for a serious project, because it’s pretty lacking in the code-organization department.
A lost of older languages have shortcomings with regard to things like namespaces, good facilities for spreading code across multiple files (and complex file hierarchies), and tools for building on other peoples’ code. Forth has those problems too, but they aren’t really fundamental. It’s easy to imagine a version of Forth with some namespacing, a package manager, etc.
More specifically, it’s the fact that there’s only one stack, and it’s shared between every code-unit in the entire program. This might be ok if it were used exclusively for passing data around between different code units, but it isn’t. From my limited experience, I get the impression that the stack is expected to be used for most values. Sure, there are variables, but the amount of ceremony involved in using them makes it feel like Forth doesn’t really want you to use them heavily. Plus, they’re all global anyway, so they’re hardly a help when it comes to code organization.
The problem with the stack is, as I said, that it’s shared, and that everybody has it. That means that if you’re storing something on the stack, and you invoke a word, it might just mess with that something on the stack. Sure, maybe it isn’t supposed to, but you know, bugs happen. The history of computer science is arguably a long and mostly-fruitless quest in search of programming paradigms that result in fewer bugs. “Just trust me bro” is not a useful approach to code encapsulation in complex projects.
Sure, you may say, but a function in any language can return incorrect values, so what’s the big deal? Yes, that’s true, but in most languages a function can’t return too many values, or too few. If it does, that’s either a compile-time error or an immediate runtime error, meaning the error occurs at the point the misbehavior occurred. This is critical, because errors get harder and harder to debug as their “skid distance” increases—i.e. the distance between the root cause of an error and the point at which that error actually manifests.
Even more worrisome, as I alluded to previously, is the fact that the stack makes it possible for words to mess with things that are (from the programmer’s point of view) totally unrelated to them. You could end up with a situation like this:
put-some-memory-address-on-the-stack
17 do-something-unrelated
\ an error in do-something-unrelated causes it to delete the memory address from the stck
do-more-things
...
attempt-to-access-missing-address \ ERROR! What happened? Why isn't that address where I expected it to be?
This is the sort of thing that causes people to get woken up at 3AM to deal with production outages. The entire class of purely-functional languages exists on the thesis data being twiddled with by bits of code that aren’t supposed to is so bad that it’s better to disallow all data-twiddling forever, full stop. I shudder to think what would happen if you dropped a Haskellite into a Forth world. He’d probably just keel over and die on the spot.
So what is it actually good for?
The short answer is, embedded systems. That has traditionally been the wheelhouse of Forth, and as far as I can tell it continues to be so insofar as Forth is still used at all for real live projects.
This makes a lot of sense, when you think about it. Embedded code is often:
- Very tightly focused, concerned with just doing a few specific things under all circumstances (as contrasted with, say, your typical webapp which might have to add file uploads or start interfacing with some remote API any day of the week)
- Resource-constrained, particularly where memory is concerned (CPU time is usually less of a concern)
- Developed by a small team, often just a single person
- Fire-and-forget, there’s no “ongoing maintenance” when you would have to perform a surgery or recall a satellite from orbit to do a firmware upgrade
All of this works to minimize the downsides of Forth’s lack of organizational capabilities. Organizing is easier the less there is to organize, of course, and when there’s only one or a few people doing the organizing. And when it’s all done in one shot—I can’t count the number of times I’ve come back to code I wrote after a year or two or three and spent the day going “What was I thinking?” while trying to unravel the tangled web I wove. That sort of thing is much less likely when your code gets deployed as part of Industrial Sewage Pump Controller #BS94A, because who wants to go swimming in sewage if they don’t have to?
The other reason I suspect that Forth has been so successful in embedded contexts is that it’s a lot easier to implement than most languages. This is true of stack-oriented languages in general, I think—there’s a reason that a lot of VMs are stack-based as well—and in an embedded context, if you want a runtime of any kind you often have to build it yourself. I don’t know this for sure, but I wouldn’t be surprised if Forth got used for a lot of embedded stuff back in the day because it was the most feasible language to implement in raw assembly for the target architecture.
Of course, I’ve constructed this tottering tower of linguistic philosophy on the basis of a few days’ playing with Forth when I’ve had time, so take what I say with a few spoonfuls of salt. There are people out there who will claim that they used to prefer C to Forth, but after enough time with Forth their eyes were opened and their mind ascended to the heights, and they spoke with the tongues of the cherubim and painted with the colors of the wind—
Sorry, that got a little out of hand. My point is, some people like Forth even for relatively complex tasks like an OS kernel.