Brian Connors
Last Updated 6 June 2000
This isn't a formal code annotation, really; just more of a gentle introduction to the high-demand art of implementing an RPN language in Perl. It assumes at least a passing familiarity with the idioms of RPN programming and likes to use funny FORTH words in the same context as funny other-programming-language words, sometimes interchangeably.
What you'll find here is a grab bag of interesting implementation details and backreferences to var'aq's unholy cousin bearfood, a particularly gooey mutation of Forth that was created primarily to give var'aq a coherent implementation of procedures and has since developed into an exploration of the limits of computer language grammar. Bearfood was written by Chris Pressey and named after a particularly inscrutable code snippet in the original Magenta specification, and was vital enough to the effort that it got Chris a coauthor credit on the project. You'll also see references to False by Wouter van Oortmerssen and Adobe's PostScript language, both of which have rather a lot in common with var'aq structurally.
In short: Lisp and PostScript threw a Star Trek party.
In detail, well...
That's not that easy a question. The simple answer is that var'aq was conceived as stated: a programming language that would be typical of what might be used in the Klingon culture from Star Trek. That said, I've got that point documented elsewhere. What you're really concerned with in reading this document is the internals that actually execute programs written in the language; Klingonisms like the iftrue/iffalse conditionals, the log3 operator, and the like are explained, but only because they directly impact the mechanics of the language.
In plain old Earth language terms, var'aq is probably most like PostScript: an RPN-based, mostly-functional programming language. All coding revolves around the same transparent-stack paradigm that Forth and PostScript both use; variables are supported, but tend to be best limited to parameters and constants. It's very fond of anonymous lambda closures (i.e. procedures), which it uses to implement most of its control structures (no gotos -- mapping a PC to a Perl array is a waste of time) as well as its procedures.
Its spiritual cousins reside in the realm of the obfuscated programming languages (aka esoteric languages or Turing tarpits) like False or (shudder) Unlambda (the nastiest functional programming language in existence and the leading cause of keyboard apostrophe wear among French guys named David). I do believe that, as a speculative programming language belonging to a science-fiction culture, var'aq is the very first of its kind.
(See defn() in the code.)
var'aq procedures are a fairly simple implementation: take tokens off STDIN and pack them into an array, then return a reference to the array and do something useful with it. This is Chris' biggest contribution, but it's not exactly what he originally gave me.
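As a rough illustration of the "pack tokens into an array and hand back a reference" idea, here is a minimal sketch in Python rather than Perl. It is not the actual defn() from the interpreter: the { and } delimiters are borrowed from PostScript as an assumption, and the token source is a plain iterator standing in for STDIN.

```python
def defn(tokens):
    """Collect tokens up to the matching '}' and return them as a list.

    Assumes the caller has already consumed the opening '{'.
    The '{' / '}' delimiters are an assumption for this sketch.
    """
    tokens = iter(tokens)  # a shared iterator lets nested calls continue where we stopped
    body = []
    for tok in tokens:
        if tok == "}":
            return body                   # the "reference" the caller can push on the stack
        if tok == "{":
            body.append(defn(tokens))     # nested anonymous procedure
        else:
            body.append(tok)
    raise SyntaxError("unterminated procedure")


# Usage: the caller saw '{' and now asks for the body.
stream = iter("+ + } 3".split())
add_body = defn(stream)   # ['+', '+']
```

Because the body comes back as an ordinary value, the caller is free to push it on the stack, bind it to a name, or throw it away, which is the flexibility the rest of this section relies on.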
There are essentially two ways to handle a procedure, best demonstrated using C and Scheme. The key difference between these two languages is that C is a procedural language, handling everything in discrete steps and giving everything a name, while Scheme (a hothouse-flower dialect of Lisp, for those of you who don't know) aims to be a more strongly functional language. This means that a C program is set up something like a checklist, while a Scheme program more often than not will remind one of a series of components (functions) connected together like sections of an assembly line or piped together like commands in a Unix shell script. A simple hello, world function would look rather different in each language:
// hello in C
#include <stdio.h>

void hello(void) {
    printf("hello, world!\n");
}
;; hello in Scheme
(define hello
  (lambda ()
    (display "hello, world!")))
The key difference is Scheme's lambda clause, which has no equivalent in standard C. A lambda closure is the technical way of describing a self-contained code unit (that's a computer geek simplification, by the way; a mathematician would have a more abstract definition); it has an entry point, perhaps arguments, and a return value. Essentially, the lambda closure is the same as the mathematical idea of a function.
Bearfood (and FORTH, by extension) followed the C paradigm; my favorite test function, add3, would be rendered in Forth as
: add3 + + . ;
which creates a FORTH word called add3 that adds the top three numbers on the stack and prints the result. This would be sufficient except that I wanted a more flexible structure that would be in line with the Klingon language and that I could reuse to simplify implementation of control structures such as teHchugh/ngebchugh (iftrue/iffalse).
My original var'aq was a prime example of how not to do this; essentially, the way such a clause was executed was that the interpreter would find an iftrue or iffalse token and, depending on the condition on the top of the stack, either execute or ignore tokens until it hit a do keyword. This was a rather nasty procedure. Chris to the rescue; I took his code out of bearfood, abstracted the part that defines the procedure array, and simply passed back an array reference that could be pushed on the stack rather than storing it directly in %{$proc}, the hash table that serves as (in FORTH terms) the system's dictionary. Presto: one freshly defined anonymous procedure. Now iftrue/iffalse didn't need their do keyword anymore; I could just make them look for a reference to a procedure on the stack and execute it if appropriate. Function bindings were done with the pong/name keyword, which popped a reference and a name off the stack and stored the reference in %{$proc}.
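To make the mechanics concrete, here is a small sketch in Python (not the Perl source) of how procedure references on the stack can drive both the conditionals and name binding. Several things here are assumptions for illustration only: programs are pre-parsed lists in which a nested list is an anonymous procedure, a ~ prefix marks a quoted name, the procedure sits on top of the stack with the condition or name beneath it, and the English keywords iftrue, iffalse and name stand in for the Klingon ones.

```python
def run(program, stack, proc):
    """Execute a pre-parsed program; nested lists are anonymous procedures.

    `proc` plays the role of the dictionary (a plain dict in this sketch).
    """
    for tok in program:
        if isinstance(tok, list):
            stack.append(tok)                 # push the reference, do not execute it
        elif isinstance(tok, (int, bool)):
            stack.append(tok)                 # literal value
        elif tok.startswith("~"):
            stack.append(tok[1:])             # quoted name (assumed syntax)
        elif tok == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif tok in ("iftrue", "iffalse"):
            body = stack.pop()                # procedure reference on top (assumed order)
            cond = stack.pop()                # condition beneath it
            if cond == (tok == "iftrue"):
                run(body, stack, proc)
        elif tok == "name":
            body = stack.pop()
            proc[stack.pop()] = body          # bind the reference in the dictionary
        elif tok in proc:
            run(proc[tok], stack, proc)       # call a named procedure
        else:
            raise NameError(tok)


stack, proc = [], {}
run(["~add3", ["+", "+"], "name", 1, 2, 3, "add3"], stack, proc)
run([True, [10], "iftrue", False, [30], "iffalse"], stack, proc)
```

Note that the conditionals need no closing keyword: by the time iftrue or iffalse runs, the whole body is already a single value on the stack, so skipping it is just a matter of not calling it.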
(Incidentally, reference management in Perl is a bitch. For all that you can do quite easily in Perl, data structures become an extremely hairy issue in a nontrivial situation, and var'aq is nothing if not nontrivial.)
I don't have a problem with the way FORTH does it; it's a readability thing. FORTH is meant first and foremost for interactive control of embedded systems (I think of it as a greasemonkey language), and to some extent constructing it the way it is constructed makes it somewhat more readable. But following the FORTH paradigm too closely would have made more work for me, and there is little honor in wasted effort. This is where var'aq has more in common with PostScript; go read the PostScript manual if you want more information.
Raw STDIN, baby. Perl is meant for batch-processing, so its native input facilities are a bit (harrumph) primitive. Mistakes mean scrambled input; type carefully. I haven't tried piping a prewritten file into the interpreter yet, since there isn't enough language there yet to justify trying to write something nontrivial, but this will be gotten around to eventually, as will a nice clean user-friendly commandline interface. But it'll be a while; string handling is a somewhat more important priority.
The best way to do it would actually be a talk-style front end that treats the interpreter like a server daemon (which will happen way down the road eventually anyway to facilitate var'aq Beowulf clusters).
I did make up the Klingon Defense Force Programming Style Guide, but Mark did make a point of adding -'a' (interrogative) to the relational operators. And he was right. That was right about when I noticed that your average relational operator is by definition what Lispers would call a predicate function.
Now, as I write this, I'm not sure how Lispy to get with var'aq. However, it does seem that this is essentially identical to the -p convention in Lisp, where any predicate (i.e. yes/no function) is named with a -p on the end (or a ? in Scheme and Dylan). So wherever you see a function in var'aq that asks a question, you'll see the -'a' convention. I suggest that you consider it a matter of good style to follow that lead when writing your own var'aq code.
Marc Okrand's biggest omission in the Klingon corpus is a working scientific vocabulary; I (Brian) had to create a lot of these terms from scratch. The basic premise is that though Klingon and Earth mathematics might be essentially isomorphic, the likelihood that metaphors and concepts coincide above a basic level is rather small. This section explains a few of the more obscure metaphors for the confused...
The answer is that there might be a different view of what exactly is being done when the stack is flagged like that. PostScript is literalist: a marker is being put on the stack. var'aq looks at the action more in terms of why the marker is being put there: a certain level in the stack is being staked out, like a string wrapped around a finger. Thus, remember/forget.
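For readers who think better in code, here is one way the remember/forget pair could work, sketched in Python. This is an assumption modeled on PostScript's mark and cleartomark, not a description of the actual interpreter: remember stakes out the current level, and forget discards everything above that level along with the marker itself.

```python
MARK = object()  # unique sentinel; cannot collide with any user value


def remember(stack):
    """Stake out the current stack level."""
    stack.append(MARK)


def forget(stack):
    """Discard everything back to (and including) the most recent marker."""
    while stack:
        if stack.pop() is MARK:
            return
    raise IndexError("forget without a matching remember")


stack = [1, 2]
remember(stack)
stack.extend([3, 4])
forget(stack)       # stack is back to the remembered level
```

Whether the marker lives on the data stack (as here) or in a separate list of remembered depths is an implementation detail; the visible behavior is the same either way.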
The idea is that var'aq is intended to have primitive support for an MPI-like interface (anyone know a good Perl-MPI binding to play with?) to simplify distributed computing designs. This is eventually intended to be part of the standard var'aq package, but as of 7 June 2000 this is so far off (think at least six months) that I have no idea what form it will take nor whether it will be built into the language or come as a separate library. Either way, a plausibly powerful network interface needs to come first, and that will not be in the initial versions either.