FIB in PowerPC assembly and in JONESFORTH
Recently I've been teaching myself PowerPC assembly by porting JONESFORTH to PowerPC on Mac OS X. It struck me to run the same little Fibonacci-sequence microbenchmark that I ran lo these many years past. The results were interesting:
| Language | Implementation Detail | Time per (fib 29) call (ms) | Ops/s | Ratio (opt. C) | Ratio (unopt. C) |
|---|---|---|---|---|---|
| PPC assembly | - | 24 | 935983000 | 0.43 | 0.205 |
| FORTH | JONESFORTH ported to PPC | 277 | 81096000 | 4.95 | 2.37 |
The hand-coded assembly beats all the other entrants (perhaps unsurprisingly). The naive indirect-threaded FORTH is the fastest interpreted language, only about five times slower (4.95x) than fully optimised C.
Here's the JONESFORTH code:
: FIB DUP 2 >= IF 1- DUP RECURSE SWAP 1- RECURSE + ELSE DROP 1 THEN ;
and here's the PPC assembly (arg and result in r3):
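_SFIB:  cmpwi   r3,2            ; n >= 2?
        bge     1f              ; yes: take the recursive case
        li      r3,1            ; base case: fib(0) = fib(1) = 1
        blr
1:      mflr    r0              ; fetch our return address
        stw     r0,-4(r1)       ; save it just below the stack pointer
        addi    r3,r3,-1        ; r3 = n-1
        stwu    r3,-8(r1)       ; push n-1, r1 -= 8; saved LR is now at 4(r1)
        bl      _SFIB           ; r3 = fib(n-1)
        lwz     r4,0(r1)        ; r4 = n-1
        stw     r3,0(r1)        ; reuse the n-1 slot to hold fib(n-1)
        addi    r3,r4,-1        ; r3 = n-2
        bl      _SFIB           ; r3 = fib(n-2)
        lwz     r4,0(r1)        ; r4 = fib(n-1)
        add     r3,r3,r4        ; r3 = fib(n-2) + fib(n-1)
        lwz     r0,4(r1)        ; reload the saved return address
        addi    r1,r1,8         ; pop the 8-byte frame
        mtlr    r0
        blr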
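
For comparison, here is a rough C sketch of the same doubly recursive definition that both listings implement, with fib(0) = fib(1) = 1. This is not the C program behind the ratio columns in the table (that code isn't shown here); the names `fib` and `fib_calls` are made up for illustration, and `fib_calls` is only there to show how much work one (fib 29) call involves.

```c
/* Illustrative sketch only: hypothetical names, not the benchmarked C code. */

/* Same recurrence as the FORTH word and _SFIB: fib(0) = fib(1) = 1. */
unsigned long fib(unsigned long n)
{
    if (n >= 2)
        return fib(n - 1) + fib(n - 2);
    return 1;
}

/* Number of calls to fib() (including the outermost one) made by fib(n). */
unsigned long fib_calls(unsigned long n)
{
    if (n >= 2)
        return 1 + fib_calls(n - 1) + fib_calls(n - 2);
    return 1;
}
```

Under this definition `fib(29)` is 832040, and computing it takes `fib_calls(29)` = 1664079 calls, so each timed run exercises call, return and stack traffic roughly 1.66 million times.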