Another software update here – this time thoughts about implementing a filing system.
It would be far too easy to get carried away and start looking at modern systems – e.g. Unix/Linux/VMS, where there is a plethora of good filing systems. The downside is that they need CPU cycles and memory, and use more disk space for metadata than I would like. At the same time, the main filing system of the ’70s was the CP/M one, followed in the late ’70s by Apple DOS 3.2/3.3. The initial BBC Micro filing system (DFS) in the early ’80s was, in some respects, actually worse than CP/M and Apple DOS, although the later ADFS on the Beeb did fare a little better.
FAT
The other contender from the early ’80s is Microsoft’s FAT. This is now an open standard, but the limitation on filename length (8+3) irritates me, and while there are ways to have long filenames, I really wanted something that did it natively.
Dots and Dollars?
As my system is accidentally becoming more BBC Micro-like by the day, it might make sense to implement something that at least has an interface that suits the BBC Micro software of the time. However, I really did not like Acorn’s idea of $ dot DRIVE dot directory dot filename. (Really? Yes, so today’s hello.c was c.hello, where c was a directory!) Anyway, by the ’80s I’d already been using Unix, so …
Enter Apple ProDOS.
Round about 1983 Apple developed ProDOS as an upgrade from their DOS 3.3. On the Apple II it was more than just a filing system – it had hooks into BASIC and some other stuff. However, the underlying filing system part was interesting, and I had looked at it back in the mid-’80s when I had thoughts of implementing it on my BBC Micro at the time. So I dug out my old notes and books (Beneath Apple ProDOS) and refreshed the old grey cells with its on-disk data format/metadata.
One thing it does reasonably well is to store files relatively efficiently – if your file size is under a block (normally 512 bytes) then it needs no extra on-disk storage blocks, just a single pointer in the file’s catalog data. I also worked out that it would be relatively easy to change the block size too – right down to 64 bytes, which would favour using the 4KB NVRam in the ATmega host processor as a storage device too.
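To make that storage efficiency concrete, here is a minimal sketch (in Python, purely for readability) of how many blocks a file needs under a ProDOS-style layout: a direct pointer for a one-block file, a single index block for medium files, and a master index block above that. ProDOS documentation calls these seedling, sapling and tree files. The function name, the assumption that an index block holds block_size/2 16-bit pointers at any block size, and the rule that even an empty file owns one data block are all illustrative guesses, not details of my actual implementation.

```python
def blocks_needed(file_size: int, block_size: int = 512) -> dict:
    """Estimate on-disk blocks for one file in a ProDOS-style layout.

    Assumptions (hypothetical): 16-bit block pointers, so an index block
    holds block_size // 2 of them, and every file has at least one data block.
    Sparse files are ignored.
    """
    ptrs_per_index = block_size // 2
    data_blocks = max(1, -(-file_size // block_size))  # ceiling division

    if data_blocks == 1:
        # Catalog entry points straight at the single data block.
        kind, index_blocks = "seedling", 0
    elif data_blocks <= ptrs_per_index:
        # One index block lists every data block.
        kind, index_blocks = "sapling", 1
    elif data_blocks <= ptrs_per_index * ptrs_per_index:
        # One master index block plus as many index blocks as required.
        kind = "tree"
        index_blocks = 1 + -(-data_blocks // ptrs_per_index)
    else:
        raise ValueError("file too large for a two-level index")

    return {
        "kind": kind,
        "data_blocks": data_blocks,
        "index_blocks": index_blocks,
        "total_blocks": data_blocks + index_blocks,
    }


if __name__ == "__main__":
    for size in (100, 513, 128 * 1024, 128 * 1024 + 1):
        print(size, blocks_needed(size))
```

With 64-byte blocks the same function gives 32 pointers per index block, so the thresholds drop accordingly.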
I sketched out some ideas and decided that the on-disk format would be similar to ProDOS although the actual catalog/directory structure would more reflect the BBC Micro requirements and allow me to have up to 20 characters for the filename, although the interface would be more along the Unix open/read/write/close style with a small translation layer to accommodate any existing BBC Micro software I wanted to run. I feel that this might let me run something a little more modern than that old BBC Micro software in the future.
Sizes, shapes, ?
ProDOS uses a 16-bit pointer to address the underlying disk blocks. This gives us a maximum of 65536 blocks per device, or a raw capacity of 32MB when using 512-byte blocks. The file length limitation for ProDOS is 16MB, as it stores the file length in a 24-bit value. I removed that limitation, not that it really matters.
By today’s standards, 32MB is pitifully small. However, this is an 8-bit device with 64KB of RAM, so the reality is that it’s going to be just fine. (And it’ll also be OK if/when I move to a 16-bit 65816 system.) I’ve decided to use a standard DOS MBR partition table, giving me up to 4 partitions of 32MB each. I think that’ll do for now.
I could increase the block size – going from 512 to 1024 bytes would double the capacity to 64MB, but it would also double the RAM requirements for buffers and so on, and in an 8-bit micro with 64KB of RAM, every byte counts.
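The arithmetic behind those numbers is simple enough to check in a few lines. This is only a sketch; the names are made up for illustration, and the only inputs are the 16-bit block pointer and the block sizes mentioned above.

```python
POINTER_BITS = 16  # width of a block pointer, as described above


def raw_capacity(block_size: int, pointer_bits: int = POINTER_BITS) -> int:
    """Raw bytes addressable per device: number of blocks times block size."""
    return (1 << pointer_bits) * block_size


def blocks_on_device(device_bytes: int, block_size: int) -> int:
    """Whole blocks that fit on a small device such as the 4KB NVRam."""
    return device_bytes // block_size


if __name__ == "__main__":
    for block_size in (64, 512, 1024):
        megabytes = raw_capacity(block_size) // (1024 * 1024)
        print(f"{block_size:5d}-byte blocks: {megabytes} MB raw capacity")
    # 64-byte blocks -> 4 MB, 512 -> 32 MB, 1024 -> 64 MB
    print("NVRam blocks at 64 bytes:", blocks_on_device(4 * 1024, 64))
```

Each block buffer held in RAM costs one block_size of memory, which is why doubling the block size doubles the buffer cost as well as the capacity.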
Media?
Did I want a real floppy drive? Not really. I didn’t want to use one of the many emulators (e.g. GoTek) either. USB is too complex to implement (for now), so I looked at CF cards which essentially have an IDE interface or SD cards which have an SPI interface and settled on SD cards – also because I have many of them and have used them frequently in Raspberry Pi projects over the past 7 years or so.
SD?
One thing I’ve learned from the Raspberry Pi systems is that SD cards are actually quite fragile. The worst thing you can do is power them off during a write. Then there’s the issue of them (potentially) wearing out. For now, I’m going to treat them as disposable items and keep good backups. This is something I advise all Raspberry Pi users to do anyway.
How do you write a filing system?
What bits do you do first? It’s been an interesting journey so far (and it’s not complete as I write this). I started with the device block drivers, which seemed simple enough, although SD cards are not quite as straightforward as they may seem. However, I now have basic block read/write working well, and the native block size on an SD card is 512 bytes, which fits well with the block size for the filing system. For SD cards I am also using a standard DOS MBR partition table – this gives me 4 partitions of 32MB each. It’s still laughably small compared to the 8GB SD card I’m using, but it will do for now.
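For reference, reading a DOS MBR partition table needs very little code. The sketch below uses the standard layout (four 16-byte entries starting at byte 446, with the 0x55 0xAA signature in the last two bytes of the sector). The helper names, the dictionary keys and the partition type byte 0x7F are placeholders of mine, not what my firmware actually uses.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # start of the four partition entries
ENTRY_SIZE = 16
# status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), LBA start, sector count
ENTRY_FORMAT = "<B3xB3xII"


def parse_mbr(sector: bytes) -> list:
    """Return the non-empty partition entries found in an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        offset = TABLE_OFFSET + i * ENTRY_SIZE
        status, ptype, lba_start, sectors = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append(
                {"index": i, "type": ptype, "lba_start": lba_start, "sectors": sectors}
            )
    return partitions


def make_mbr(partitions) -> bytes:
    """Build a test MBR from (type, lba_start, sectors) tuples."""
    sector = bytearray(SECTOR_SIZE)
    for i, (ptype, lba_start, sectors) in enumerate(partitions):
        struct.pack_into(ENTRY_FORMAT, sector, TABLE_OFFSET + i * ENTRY_SIZE,
                         0, ptype, lba_start, sectors)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    # Four partitions of 65536 sectors (32MB) each; type 0x7F is a placeholder.
    mbr = make_mbr([(0x7F, 1 + n * 65536, 65536) for n in range(4)])
    for part in parse_mbr(mbr):
        print(part)
```

The block driver then only has to add lba_start to every filing system block number before talking to the card.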
I also wrote some code to let me use the 4KB of internal NVRam in the ATmega host processor as a drive, and up to 8KB of RAM as a small/fast RAM disk. The RAM disk was also very handy for quick testing, although I may lose it later to free up some RAM if I decide to start caching reads/writes to the SD card for a performance boost.
Directories, drives, devices, volumes?
How do you represent a filing system with multiple devices (or SD card partitions)? CP/M and MS-DOS typically use a drive letter, e.g. A or C followed by a colon, then the path to the filename. Apple DOS simply uses the drive number, so filename,d1. The Unix way is to mount devices so they look like another subdirectory and appear transparent to the user. This is the approach I want, although I’ll do it all at the very top level. I’ll also use forward slashes (/) as the directory separator, so a filename will look like: /sd0/path/to/filename or /ram/filename and so on. The very top level is simply a virtual directory with the device names, so /nvr for the NV Ram in the ATmega, /ram for the ramdisk, /rom for a ROM file system (out of the internal Flash in the ATmega) and /sd0, /sd1, /sd2 and /sd3 for the 4 partitions on the SD card.
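The naming scheme above boils down to peeling the first path component off as the device name and handing the rest to that device’s filing system. A minimal sketch, assuming absolute paths only; the function names and error choices are mine for illustration, and the device list is the one given above.

```python
DEVICES = ("nvr", "ram", "rom", "sd0", "sd1", "sd2", "sd3")


def split_path(path: str):
    """Split '/sd0/path/to/file' into ('sd0', ['path', 'to', 'file']).

    Returns (None, []) for '/', the virtual top-level directory.
    """
    if not path.startswith("/"):
        raise ValueError("only absolute paths are handled in this sketch")
    parts = [p for p in path.split("/") if p]  # drops empty components
    if not parts:
        return None, []
    device, rest = parts[0], parts[1:]
    if device not in DEVICES:
        raise FileNotFoundError(f"no such device: {device}")
    return device, rest


def list_directory(path: str):
    """List the virtual root; real directories would go to the device."""
    device, rest = split_path(path)
    if device is None:
        return list(DEVICES)
    raise NotImplementedError("per-device catalogs are outside this sketch")


if __name__ == "__main__":
    print(split_path("/sd0/path/to/filename"))
    print(split_path("/ram/filename"))
    print(list_directory("/"))
```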
Order…
After testing the block read/write code, the format command/functions seemed naturally the first thing to write, although when developing that, I quickly wrote a block dump command for debugging… The RAM disk was proving to be a boon here too. Format involves creating all the disk allocation bitmap blocks and the first catalog block. I decided I’d create one catalog block to start with, then add more catalog blocks as needed. On a filesystem with 64-byte blocks, there is room for just one file entry per block, but on a filesystem with 512-byte blocks, there is room for 12 entries. When a catalog block is full, a new one is allocated and the first one is linked to it, in a linked-list fashion. Right now there is no facility to delete empty catalog blocks (e.g. after lots of file deletion), but I may come back to that later.
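The format step can be sketched roughly as follows. Everything about the layout here is an assumption made for illustration: the bitmap blocks sit at block 0 onwards, the first catalog block follows them, and a set bit means “free” with the most significant bit first. The entries-per-block figures (1 and 12) are the ones quoted above.

```python
def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def format_volume(total_blocks: int, block_size: int):
    """Build an allocation bitmap and reserve the first catalog block.

    Hypothetical layout: bitmap blocks first, then one catalog block.
    Returns (bitmap, first_catalog_block).
    """
    bitmap_blocks = ceil_div(total_blocks, block_size * 8)
    bitmap = bytearray(bitmap_blocks * block_size)
    # Mark every block that really exists as free (bit set).
    for block in range(total_blocks):
        bitmap[block // 8] |= 0x80 >> (block % 8)
    # Reserve the bitmap blocks themselves plus the first catalog block.
    first_catalog = bitmap_blocks
    for block in range(first_catalog + 1):
        bitmap[block // 8] &= ~(0x80 >> (block % 8)) & 0xFF
    return bitmap, first_catalog


def allocate(bitmap: bytearray, total_blocks: int) -> int:
    """Claim and return the lowest-numbered free block."""
    for block in range(total_blocks):
        mask = 0x80 >> (block % 8)
        if bitmap[block // 8] & mask:
            bitmap[block // 8] &= ~mask & 0xFF
            return block
    raise OSError("disk full")


def catalog_blocks_needed(n_files: int, entries_per_block: int) -> int:
    """Length of the catalog chain; one block always exists after format."""
    return max(1, ceil_div(n_files, entries_per_block))


if __name__ == "__main__":
    # An 8KB RAM disk with 64-byte blocks is 128 blocks.
    bitmap, first_catalog = format_volume(128, 64)
    print("first catalog block:", first_catalog)
    print("next free block:", allocate(bitmap, 128))
    print("catalog blocks for 13 files at 12 per block:", catalog_blocks_needed(13, 12))
```

Extending the catalog is then just a call to allocate() plus writing the new block number into the link field of the full block.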
Format, then the open (create file) and write functions were next, and my disk dumper was very handy here. At this point I started working on the Acorn MOS side of it too. The Acorn MOS has a number of calls; the primary one is OSFILE. This deals with whole files at a time and is the most efficient way (from the old BBC Micro point of view) to get data to/from a disk… However, as I’ve written a Unix-style filesystem with open and write calls, a small bit of 6502 code was needed to translate OSFILE into a combination of open, write(s), and close to write a whole file out. The MOS side is all on the 6502 and the actual filing system side of it runs on the ATmega.
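The shape of that translation is easier to see in a high-level language than in 6502. This sketch models a whole-file save as open, a loop of writes, then close. ToyFS, osfile_save and the 256-byte chunk size are all invented for illustration; the real interface between the 6502 and the ATmega is not shown here.

```python
class ToyFS:
    """A stand-in filing system with Unix-style open/write/close calls."""

    def __init__(self):
        self.files = {}      # name -> bytearray of contents
        self.handles = {}    # handle -> name
        self.next_handle = 1

    def open(self, name: str, create: bool = False) -> int:
        if create:
            self.files[name] = bytearray()  # create, or truncate if present
        elif name not in self.files:
            raise FileNotFoundError(name)
        handle = self.next_handle
        self.next_handle += 1
        self.handles[handle] = name
        return handle

    def write(self, handle: int, data: bytes) -> int:
        self.files[self.handles[handle]] += data
        return len(data)

    def close(self, handle: int) -> None:
        del self.handles[handle]


def osfile_save(fs: ToyFS, name: str, memory: bytes, start: int, end: int,
                chunk: int = 256) -> None:
    """Save memory[start:end] as one file using open, write(s) and close."""
    handle = fs.open(name, create=True)
    try:
        for offset in range(start, end, chunk):
            fs.write(handle, memory[offset:min(offset + chunk, end)])
    finally:
        fs.close(handle)  # always release the handle, even on error


if __name__ == "__main__":
    fs = ToyFS()
    memory = bytes(range(256)) * 4           # pretend 1KB of 6502 memory
    osfile_save(fs, "/ram/test", memory, 0x100, 0x300)
    print(len(fs.files["/ram/test"]), "bytes saved")
```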
It started to be a bit of a grind and my envy and admiration for the people before me who did this for the computers of the 70’s and 80’s was growing….
However, it all started to come together: read was implemented, then a better catalog output, and then came the realisation that this is just the tip of the iceberg. Testing on SD card (and the relatively slow NVRam) shows that there are optimisations to be made, caching to be done and so on. There is also a lot of “hole in my bucket” stuff happening. For example, opening an existing file for writing might truncate that file to zero length under Unix, and that’s more or less expected under Acorn MOS with OSFILE, so the file truncate code needs to be written (which is really part of the file delete code), and that actually leads into the seek code, which is needed for the OSFIND and OSGBPB MOS calls, etc. Then there are subdirectories, traversing up and down the directory paths, and other questions like sparse files. It’s all possible, but just how much do I want (or need!) to do right now? To answer this, I’m doing it “on-demand”. As I test and play with more and more old BBC Micro software, I wait for it to stall and flag up something else that needs implementing, then implement it. This isn’t ideal, but it makes sure I get everything done that needs it and allows me to incrementally test it as I go along.