So just how do you go about porting BCPL to a new platform?
The Bootstrap Paradox…
The BCPL compiler is written in BCPL and compiles to cintcode. Cintcode is interpreted (at least on my Linux desktop) by a C program. This C program also handles the “system” calls that cintcode can make for things outside the BCPL environment, such as character and file I/O and memory allocation.
The memory allocation is the paradox. BCPL has a function called getvec(), which is essentially the same as malloc() in the C/Unix world. In the desktop BCPL system this function does nothing but make a system call to the C interpreter, which has its own C version of getvec() that ultimately uses malloc(). Ruby doesn’t have malloc() and I want to do it all in BCPL. However, to load the very first piece of BCPL cintcode that the system needs (the code to make a system call), something needs to call getvec() to allocate the memory that the cintcode for the BCPL version of getvec() is loaded into …
So where do you start…
Right now I have a C program with a C version of getvec() that initialises store, calls its own getvec() to allocate memory, and loads the compiled modules needed to bootstrap BCPL from the filing system. This is somewhat ironic, given that C came after BCPL and now we’re using C to bootstrap BCPL…
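To make the shape of that shim a little more concrete, here is a minimal sketch in C. It is not my actual loader: the names store, init_store and load_module and the store size are made up for illustration, and the allocator is a simple bump allocator that never frees, which is an assumption rather than how the real getvec() works. The one thing it does take from BCPL is that getvec(upb) hands back a vector with subscripts 0 to upb, or 0 on failure.

```c
#include <stdint.h>

/* Hypothetical size and names, chosen for illustration only. */
#define STORE_WORDS 4096            /* size of the BCPL "store" in words */

static int32_t store[STORE_WORDS];  /* word-addressed memory seen by BCPL */
static int32_t next_free = 1;       /* word 0 stays unused so 0 can mean "failed" */

/* Reset the allocator: every word after word 0 is free again. */
void init_store(void)
{
    next_free = 1;
}

/* Allocate a vector with subscripts 0..upb (upb + 1 words) and return its
 * word address in store, or 0 if the request cannot be met.
 * Assumption: a bump allocator with no freevec, just enough to bootstrap. */
int32_t getvec(int32_t upb)
{
    if (upb < 0 || upb >= STORE_WORDS) return 0;
    int32_t words = upb + 1;
    if (words > STORE_WORDS - next_free) return 0;
    int32_t addr = next_free;
    next_free += words;
    return addr;
}

/* Copy a compiled module image into freshly allocated store and return its
 * word address, or 0 if there was no room. */
int32_t load_module(const int32_t *image, int32_t words)
{
    if (words <= 0) return 0;
    int32_t addr = getvec(words - 1);
    if (addr == 0) return 0;
    for (int32_t i = 0; i < words; i++)
        store[addr + i] = image[i];
    return addr;
}
```

The point of the sketch is the ordering: the C side owns a getvec() that works before any BCPL code exists in memory, and uses it to place the first modules into store.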
Ultimately I will get rid of this C “shim” and do the bare minimum in ‘816 assembler before jumping into BCPL land, then do it all from BCPL, but for now the C program remains. It has compiled to about 16KB in size – which is at least 10x bigger than most of the BCPL library combined.
I started with the cintcode interpreter with nothing but the instruction fetch and decode. I pointed all 256 instructions to a generic “unimplemented opcode” routine that prints the opcode and program counter… Then I just started to feed it small compiled BCPL programs and implemented the opcodes one at a time. Fortunately some opcodes come in groups, such as the “load a small constant” opcodes: there are 11 of those, loading the values 0 to 10, and so on. The most complex opcode appears to be one of the two opcodes that deal with the switch command. For that one the compiler outputs a set of pairs of values and relative labels ordered in an efficient manner (a binary tree), which is a bit of a challenge to work through in ‘816 assembly code. Fortunately macros have made a lot of things easier.
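The “point everything at unimplemented, then fill in the gaps” approach looks roughly like this when sketched in C rather than ‘816 assembler. Treat it as an illustration only: the opcode numbers, the Machine struct and the halt opcode are hypothetical, the machine has a single accumulator for simplicity, and stopping on an unimplemented opcode is my assumption for the sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical opcode numbering, for illustration only. */
enum { OP_L0 = 16, OP_L10 = 26, OP_HALT = 255 };

typedef struct {
    const uint8_t *code;   /* compiled byte stream */
    uint32_t pc;           /* index of the next byte to fetch */
    int32_t a;             /* accumulator */
    int running;           /* cleared to stop the fetch loop */
    int unimplemented;     /* number of unknown opcodes hit */
} Machine;

typedef void (*Handler)(Machine *m, uint8_t op);

/* Default handler: report the opcode and where it was fetched from. */
static void op_unimplemented(Machine *m, uint8_t op)
{
    fprintf(stderr, "unimplemented opcode %u at pc %u\n",
            (unsigned)op, (unsigned)(m->pc - 1));
    m->unimplemented++;
    m->running = 0;        /* assumption: stop rather than carry on */
}

/* One handler covers the whole group: the constant is opcode - OP_L0. */
static void op_load_const(Machine *m, uint8_t op)
{
    m->a = op - OP_L0;
}

static void op_halt(Machine *m, uint8_t op)
{
    (void)op;
    m->running = 0;
}

static Handler table[256];

/* Start with every slot unimplemented, then fill in what exists so far. */
void init_table(void)
{
    for (int i = 0; i < 256; i++) table[i] = op_unimplemented;
    for (int i = OP_L0; i <= OP_L10; i++) table[i] = op_load_const;
    table[OP_HALT] = op_halt;
}

/* Fetch and dispatch until a handler stops the machine or code runs out. */
void run(Machine *m, uint32_t len)
{
    m->running = 1;
    while (m->running && m->pc < len) {
        uint8_t op = m->code[m->pc++];
        table[op](m, op);
    }
}
```

Each newly implemented opcode is then just one more entry in the table, and any program that strays outside what is implemented reports exactly which opcode to write next.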
Anyway, once the BCPL run-time system is going, it more or less takes over and the C program is left as nothing more than a bootstrap loader which can be overwritten. (It was, and still is, also useful as a test harness while I write the code to handle all the cintcode opcodes.)
I now have enough of the cintcode opcodes implemented to run the compiler, and here is a short video demonstrating it compiling and running a simple Hello World program: