FIB in PowerPC assembly and in JONESFORTH (eighty-twenty news)

Recently I’ve been teaching myself PowerPC assembly by porting JONESFORTH to PowerPC on Mac OS X. It struck me to run the same little Fibonacci-sequence microbenchmark that I ran lo these many years past. The results were interesting:

Language	Implementation Detail	Time (per `(fib 29)` call, in milliseconds)	Ops/s	Ratio (opt. C)	Ratio (unopt. C)
PPC assembly	-	24	935983000	0.43	0.205
FORTH	JONESFORTH ported to PPC	277	81096000	4.95	2.37

The hand-coded assembly beats every other entrant from the earlier benchmark (perhaps unsurprisingly). The naive indirect-threaded FORTH is the fastest of the interpreted languages, only about 5 times slower (4.95×) than fully optimised C.

Here’s the JONESFORTH code:

: FIB DUP 2 >= IF 1- DUP RECURSE SWAP 1- RECURSE + ELSE DROP 1 THEN ;

and here’s the PPC assembly (arg and result in r3):

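_SFIB:  cmpwi   r3,2            ; compare n with 2
        bge     1f              ; n >= 2: take the recursive case
        li      r3,1            ; base case: fib(0) = fib(1) = 1
        blr
1:      mflr    r0              ; fetch the return address
        stw     r0,-4(r1)       ; save it just below the stack pointer
        addi    r3,r3,-1        ; r3 = n-1
        stwu    r3,-8(r1)       ; push n-1; r1 -= 8, so the saved LR is at 4(r1)
        bl      _SFIB           ; r3 = fib(n-1)
        lwz     r4,0(r1)        ; r4 = n-1
        stw     r3,0(r1)        ; overwrite the slot with fib(n-1)
        addi    r3,r4,-1        ; r3 = n-2
        bl      _SFIB           ; r3 = fib(n-2)
        lwz     r4,0(r1)        ; r4 = fib(n-1)
        add     r3,r3,r4        ; r3 = fib(n-2) + fib(n-1)
        lwz     r0,4(r1)        ; reload the saved return address
        addi    r1,r1,8         ; pop the 8-byte frame
        mtlr    r0              ; restore the link register
        blr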
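
For comparison, here is a minimal C sketch of the same doubly recursive function. It is not the C code behind the ratios in the table (that code isn't shown in this post); the name `fib` and the choice of `unsigned int` are assumptions made purely for illustration. It follows the same convention as the FORTH and assembly versions above: any argument below 2 yields 1.

```c
/* Hypothetical C rendering of the benchmark function; not the code
 * that was actually timed. Mirrors the FORTH word FIB and the _SFIB
 * routine: fib(0) = fib(1) = 1, otherwise fib(n-1) + fib(n-2). */
unsigned int fib(unsigned int n)
{
    if (n < 2) {
        return 1;                   /* base case, as in "ELSE DROP 1" */
    }
    return fib(n - 1) + fib(n - 2); /* two recursive calls, then add */
}
```

With this definition `fib(29)` evaluates to 832040, which is handy for checking that a port computes the right answer before timing it.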