The first thing to be determined about branch delay slots is how long they are.
Rereading our previous experiments with branch instructions in light of our newfound knowledge about bundles strongly suggests that the delay slots are always exactly 1 bundle long. Still, it'd be better to do a proper test.
Like the bundling test, we'll check each combination of instruction types for 8 slots. Now, for each such combination, we'll also try stuffing a call instruction into each of the first 4 slots that got assigned a branch type. Using a call instruction instead of a branch instruction means that the address after the delay slot (the return address) will end up in the 0xf500 MMIO register, so we don't have to do a register-writing dance to check whether any given instruction has been executed.
And well... test passed. That was quick.
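For the curious, the enumeration behind that test can be sketched roughly as follows. This is only an illustration, not the actual test code: the four type names are placeholders, and the sketch reads "the first 4 slots" as slot positions 0 to 3. It only generates the (combination, call position) pairs; working out the expected return address needs the bundling rules and is left out.

```python
from itertools import product

# Placeholder instruction type names (an assumption for this sketch).
TYPES = ("scalar", "vector", "address", "branch")


def call_test_cases(types=TYPES, slots=8, call_window=4):
    """Yield (combo, pos) pairs: a type assignment for all slots, plus the
    index of a branch-type slot among the first `call_window` slots that
    gets turned into a call instruction."""
    for combo in product(types, repeat=slots):
        for pos in range(call_window):
            if combo[pos] == "branch":
                yield combo, pos


cases = list(call_test_cases())
print(len(cases))
```

Each generated case would then be assembled, run, and checked by comparing the value read back from 0xf500 against the address right after the bundle that follows the call.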
The second thing to be determined is what happens if you stuff a branch instruction into the delay slot of a previous branch instruction (assuming both are taken). There are several possibilities:
Let's check that. We'll start by emitting two branches back to back: the first to 0x40, the second to 0x80. After these branches, and at each of the two targets, we'll add a sequence of mov-to-$r mark instructions followed by exit.
The result: none of the instructions right after the branches execute, one instruction at 0x40 executes, all instructions at 0x80 execute. So, VP1 matches option 1.
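The observed behavior is what a simple pc/next_pc model of a single delay slot produces. Here is a toy simulation of that model, a sketch under stated assumptions rather than a description of the real hardware: every instruction is treated as its own 4-byte bundle, and the opcode names (`bra`, `mark`, `exit`) and the program layout are made up for illustration.

```python
def run(program, start=0, step=4, limit=100):
    """Run a toy machine with one delay slot; return the marks executed.

    A taken branch does not change pc directly. It only replaces the
    address that will be fetched after the already-queued next_pc.
    """
    executed = []
    pc, next_pc = start, start + step
    for _ in range(limit):
        op, arg = program[pc]
        if op == "exit":
            break
        target = next_pc + step  # default: keep going sequentially
        if op == "mark":
            executed.append(arg)
        elif op == "bra":
            target = arg
        pc, next_pc = next_pc, target
    return executed


def build_test():
    """Two back-to-back branches, then marks + exit at three locations."""
    program = {0x00: ("bra", 0x40), 0x04: ("bra", 0x80)}
    for base, name in ((0x08, "after"), (0x40, "first"), (0x80, "second")):
        for i in range(3):
            program[base + 4 * i] = ("mark", f"{name}{i}")
        program[base + 12] = ("exit", None)
    return program


marks = run(build_test())
print(marks)  # ['first0', 'second0', 'second1', 'second2']
```

In this model the second branch executes as the delay slot of the first, and the first instruction at 0x40 executes as the delay slot of the second, which matches the result above: nothing after the branches, one mark at 0x40, everything at 0x80.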
Elapsed time: 2h.