In 1974 Donald Knuth said:
“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
So I’ve had a few days of diversions due to a bit of brain-fug – playing with LCD displays (more on that later), and trying to speed up my cintcode interpreter…
Mostly that meant taking out a lot of the debug code, which has made a nice speed-up, but also fiddling with some math code…
So in my fixed-point Mandelbrot code I was avoiding divides by powers of 2 by using shifts. So far, so good. One case didn’t work because the value was signed and a logical shift doesn’t sign-extend, so…
The original is:
y := ((x*y) / 2048)+cy
changed to:
y := ((x*y)>> 11)+cy
which didn’t work due to the lack of sign extension, so I hacked in this:
LET q = ?
...
q := x*y
TEST q < 0 THEN
  y := (-(-q >> 11)) + cy
ELSE
  y := (q >> 11) + cy
which worked a treat and was much quicker than my existing divide code which takes some 2000 CPU cycles to execute.
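To see why the plain shift goes wrong and the hack doesn’t, here is a small sketch in C rather than BCPL, assuming 32-bit words. The function names are made up for illustration and the shift count is assumed to be between 1 and 31.

```c
#include <stdint.h>

/* Logical right shift: the vacated high bits are filled with zeros,
   so a negative input comes back as a large positive number. */
int32_t shift_logical(int32_t q, unsigned n)
{
    return (int32_t)((uint32_t)q >> n);
}

/* The workaround from the post: shift the magnitude, then put the sign back.
   This rounds toward zero, like a truncating divide by 2^n. */
int32_t shift_negate(int32_t q, unsigned n)
{
    if (q < 0) {
        uint32_t magnitude = 0u - (uint32_t)q;   /* -q without signed overflow */
        return -(int32_t)(magnitude >> n);
    }
    return (int32_t)((uint32_t)q >> n);
}
```

With these, shift_logical(-4096, 11) gives 2097150 instead of -2, while shift_negate(-4096, 11) gives -2 and shift_negate(-3000, 11) gives -1, the same as -3000 / 2048 in C.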
Then I thought… “wouldn’t it be good to have an extra function that did a correct sign-extending arithmetic shift” … so chasing that 3%, I coded one up via the general purpose sys() interface, so it became:
y := sys (Sys_arsh, x*y, 11) + cy
and while it worked correctly, it was slower.
It seems the overhead of a function call, plus the sys() dispatcher, plus the internals of the sys() function (written in pure 65816 assembly), adds up to more than a simple test and a line or two of generic BCPL code.
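For reference, a sign-extending shift can be sketched in C as below. This only illustrates what an arithmetic shift computes, assuming 32-bit words and a shift count between 1 and 31. It is not the 65816 code behind Sys_arsh, and the function name is hypothetical.

```c
#include <stdint.h>

/* Arithmetic right shift built from a logical one.
   For a negative q, ~q is non-negative, so it can be shifted logically
   and complemented again, which fills the top bits with ones. */
int32_t shift_arith(int32_t q, unsigned n)
{
    if (q < 0)
        return ~(int32_t)((uint32_t)~q >> n);
    return (int32_t)((uint32_t)q >> n);
}
```

One thing to be aware of: an arithmetic shift rounds toward minus infinity, so shift_arith(-3000, 11) is -2, whereas the negate-shift-negate version rounds toward zero and gives -1. The two agree whenever the value is non-negative or an exact multiple of 2048.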
So that’s my morning spoilt. Time for another coffee I think!