Given that we now have a simple and reliable way of executing a single instruction, it's time to start using automated testing to verify we got the semantics right.
We'll integrate the VP1 tests as part of envytools' hwtest framework. The test will work like this:
To start off, we'll consider just scalar instructions. Our context will consist of just $r and $c registers. First, we'll limit our RNG to generate only 0x65 (mov immediate) instructions.
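Restricting the generator to a set of opcodes could be sketched roughly as below. This is an illustration only: `force_opcode` is a hypothetical helper, not part of hwtest, and the sketch assumes the opcode sits in the top 8 bits (24-31) of a 32-bit instruction word.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: take a random 32-bit word and overwrite its opcode
 * field with one of the allowed opcodes.
 * Assumption: the opcode occupies bits 24-31 of the instruction word.
 * `pick` is another random number used to choose among the allowed opcodes.
 */
static uint32_t force_opcode(uint32_t raw, const uint8_t *allowed,
                             size_t n, uint32_t pick)
{
    uint8_t op = allowed[pick % n];
    /* Keep the 24 random operand bits, replace only the opcode byte. */
    return (raw & 0x00ffffffu) | ((uint32_t)op << 24);
}
```

With a single-entry list containing 0x65, every generated word becomes a mov immediate with random operand bits; widening the test later only means adding opcodes to the list.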
Done. Test passed (not surprising: the instruction is dead simple and there aren't even any unknown spaces in the opcode). Let's try 0x64 (immediate or) now.
Getting a perfect score on the $r write is quite easy, but $c is a bit more problematic. After some trial and error, the following rules give a perfect score for 0x64 $c flag setting:
    uint32_t res = 0;
    if (!val) res |= 2;
    if (val & (1 << 18)) res |= 0x80;
    if (val & (1 << 19)) res |= 0x44;
    if (val & (1 << 20)) res |= 0x10;
    if (val & (1 << 21)) res |= 0x20;
We'll extend it to cover ops 0x66 and 0x67 now; let's see what they do.
Apparently nothing at all. Expanding to ops 0x62-0x63 (immediate and/xor) gives no surprises (the $c algorithm is the same as for or). 0x60 seems to be another nop.
0x61 seems interesting: the results are, again, alright, but there are mismatches on $c bits 0 and 3. Bit 0 appears to be the plain old sign bit (apparently bitwise operations don't affect it). Bit 3 appears to be the XOR of bit 20 of the result and bit 20 of the first source.
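Put together, the $c rules observed so far can be sketched as two small functions. The names `c_flags_bitwise` and `c_flags_arith` are made up for this illustration, and the sketch assumes that "sign bit" means bit 31 of a 32-bit result and that the bitwise ops leave bits 0 and 3 clear.

```c
#include <stdint.h>

/* $c rules found for the bitwise immediates (0x62-0x64). */
static uint32_t c_flags_bitwise(uint32_t val)
{
    uint32_t res = 0;
    if (!val) res |= 0x02;
    if (val & (1u << 18)) res |= 0x80;
    if (val & (1u << 19)) res |= 0x44;
    if (val & (1u << 20)) res |= 0x10;
    if (val & (1u << 21)) res |= 0x20;
    return res;
}

/* Same rules plus the two extra bits seen on 0x61. */
static uint32_t c_flags_arith(uint32_t val, uint32_t src1)
{
    uint32_t res = c_flags_bitwise(val);
    /* Bit 0: sign of the result (assumption: bit 31). */
    if (val & (1u << 31)) res |= 0x01;
    /* Bit 3: bit 20 of the result XOR bit 20 of the first source. */
    if ((val ^ src1) & (1u << 20)) res |= 0x08;
    return res;
}
```

Keeping the bitwise rules as a separate function means the later opcodes that share the arithmetic $c behaviour only need the second function.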
Extending again to 0x68 and 0x69 (min/max with immediate). They seem to have the same $c semantics as multiplication.
We'll skip 0x6a and 0x6b for now, to avoid dealing with special registers. They'll get their own test later.
Ops 0x6c and 0x6d have no surprises (same $c semantics as mul again). 0x6f doesn't do anything. 0x6e is a bit more interesting. Turns out the shift works as follows:
And $c semantics are as per add/mul/etc. And 0x7e is the same.
The one known 0x7X instruction remaining is 0x75, sethi. Nothing surprising about it.
Let's take a look at the remaining 0x7X values. 0x71 and 0x78-0x7d appear to do something, others are nops.
0x71, 0x78, 0x79, 0x7c, 0x7d appear to be equivalent to 0x61, 0x68, 0x69, 0x6c, 0x6d respectively. No fun.
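In a simulator, the observed aliases can be folded into a single lookup so each operation is modelled once. A minimal sketch, with `canonical_op` being a hypothetical name, covering only the equivalences reported in this post as far as the tests could tell:

```c
#include <stdint.h>

/* Map a 0x7X opcode to the 0x6X opcode it was observed to behave like. */
static uint8_t canonical_op(uint8_t op)
{
    switch (op) {
    case 0x71: return 0x61;
    case 0x78: return 0x68;
    case 0x79: return 0x69;
    case 0x7c: return 0x6c;
    case 0x7d: return 0x6d;
    case 0x7e: return 0x6e;
    default:   return op;   /* everything else is left untouched */
    }
}
```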
0x7a and 0x7b look more interesting. Running them on all input values in the -5..5 range quickly reveals they're abs(src1) and -src1, respectively. 0x7a has $c semantics like add/mul, while 0x7b has similar semantics, but bit 3 is simply set to bit 20 of the result.
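A rough model of the two ops and their differing bit 3 rule might look like this. All function names are invented for the sketch, registers are assumed to be 32 bits wide, and the behaviour of abs on the most negative value was not stated above, so the wraparound here is simply what unsigned arithmetic gives.

```c
#include <stdint.h>

/* 0x7a: abs(src1), done in unsigned arithmetic to avoid signed overflow. */
static uint32_t model_abs(uint32_t src1)
{
    return (src1 & 0x80000000u) ? 0u - src1 : src1;
}

/* 0x7b: -src1 (two's complement negation). */
static uint32_t model_neg(uint32_t src1)
{
    return 0u - src1;
}

/* $c bit 3 for 0x7a: bit 20 of result XOR bit 20 of source (add/mul rule). */
static uint32_t abs_c_bit3(uint32_t src1)
{
    uint32_t res = model_abs(src1);
    return (((res ^ src1) >> 20) & 1u) << 3;
}

/* $c bit 3 for 0x7b: just bit 20 of the result. */
static uint32_t neg_c_bit3(uint32_t src1)
{
    uint32_t res = model_neg(src1);
    return ((res >> 20) & 1u) << 3;
}
```

Small inputs such as -5..5 are enough to tell the two operations apart, because negative values have bit 20 set while their absolute values do not.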
And that covers all of 0x6X/0x7X block, except 0x6a/0x6b.
Elapsed time: 2.5h