3: $a instructions, pt. 1


No obvious place to go now. Let's try some statistical analysis. We'll count how many times each opcode [as distinguished by the top 8 bits] is used.

op NV44 NV50
0x04 1026 -
0x0b 26 2
0x0c 96 -
0x0d 288 -
0x0f 526 255
0x24 364 336
0x40 - 766
0x41 1439 1091
0x42 4233 3156
0x45 3 3
0x48 147 123
0x4c 4999 3653
0x4d 1878 1217
0x4f 10747 4331 nop?
0x5e 50 8
0x62 3390 2610
0x63 48 10
0x64 348 266
0x65 7634 4929 mov $a simm19
0x68 390 288
0x69 236 192
0x6a 3359 2640
0x6b 3203 2516
0x6c 3045 2424
0x6e 8013 6040
0x75 3323 2415
0x7e 534 450
0x80 33 25
0x81 1 1
0x82 16 16
0x83 26 26
0x84 199 188
0x85 116 92
0x86 111 106
0x87 48 42
0x8a 26 10
0x8b 7 7
0x8c 27 27
0x8d 3 3
0x8f 13 13
0x90 8 -
0x91 48 48
0x92 13 4
0x94 139 139
0x95 272 217
0x97 147 142
0x98 21 5
0x99 21 5
0x9b 740 555
0x9c 43 33
0x9d 14 4
0x9f 76 76
0xa4 10 10
0xa5 3 3
0xaa 14 14
0xab 29 29
0xac 129 129
0xad 936 822
0xae 14 14
0xb1 - 20
0xb4 234 180
0xb5 276 200
0xb6 388 290
0xb7 128 96
0xb8 5 -
0xbc 55 55
0xbe 35 35
0xbf 11118 5148 nop?
0xc0 65 35
0xc1 67 67
0xc2 136 8
0xc3 253 217
0xc4 50 50
0xc5 49 49
0xc6 134 6
0xc7 724 141
0xc8 640 481
0xc9 618 466
0xca 528 502
0xcb 2396 1550
0xcc 2486 1483
0xcd 1960 1058
0xce 449 372
0xcf 507 369
0xd0 928 869
0xd1 655 575
0xd2 176 368
0xd3 359 414
0xd4 1021 844
0xd5 963 694
0xd6 142 409
0xd8 712 484
0xd9 98 93
0xda 4221 2967
0xdb - 37
0xdc 800 552
0xdd 204 116
0xde 7836 26102
0xdf 9333 4072 nop?
0xe0 2897 2330 branch ?
0xe1 1036 913 branch ?
0xe2 1707 1188 branch ?
0xe3 120 89
0xe4 2020 695 branch ?
0xe6 92 84
0xe8 694 552
0xea 248 1 abra
0xef 23932 13521 nop?
0xf0 161 329
0xff 918 642 exit [intr] imm16
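
For reference, a histogram like the one above takes only a few lines to produce. This is a minimal sketch, assuming the microcode is already available as a sequence of 32-bit words; the function name and the sample words are made up for illustration and are not the script used for the table.

```python
from collections import Counter

def opcode_histogram(words):
    """Count instruction words by their top 8 bits (the opcode byte)."""
    return Counter((w >> 24) & 0xff for w in words)

# Hypothetical sample input; real input would be the dumped microcode words.
sample = [0x6edeadbe, 0x6e000000, 0x62deadbe, 0xff000000]
for op, count in sorted(opcode_histogram(sample).items()):
    print(f"0x{op:02x} {count}")
```

Running it once per chipset's microcode and joining the two results on the opcode byte gives the two-column layout used above.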

We'll try out some of the most common opcodes. Since it's very likely that the high 3 or 4 bits of an opcode determine its execution unit, we'll aim for opcodes 0x6X and maybe 0x7X, which will likely operate on the $a register file.

The first one is 0x6e.

First, let's replace the entire test code by 0x6edeadbe, exit 0, and lots of nops. We'll also put the test code only in microcode slot 0 [464], filling the others with just exit+nops, so that the instruction executes only once.

Result:

0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 deadbe16 deadbe17
0000f7e0: deadbe18 deadbe19 deadbe1a 5b7c3400
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

Register $a27 got modified, consistent with the destination field. No idea how it came up with the value, though. If our previous analysis is any good, the instruction will have source bitfields at positions 14 and 9, ie. $a26 and $a22. The obvious candidate for the most common opcode is add, which would give 0xbd5b7c30. While this is far from the mark, it does have a large common bitfield. Could be an add with a shift and a constant offset. Weird, but the remaining unknown bitfield is large enough for that [9 bits].

Let's try the same, but with the low 9 bits of the instruction set to 0 (0x6edeac00).

0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 deadbe16 deadbe17
0000f7e0: deadbe18 deadbe19 deadbe1a deadbe1a
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

Looks like a simple mov from $a26. Maybe $a22 isn't used then. Actually, 0x5b7c3400 is just 0xdeadbe1a << 9... ok, so that'd be a shift-left instruction with shift amount given in the low bits. Let's try to pinpoint that.

6edeac00: deadbe1a
6edeac01: deadbe1a
6edeac02: deadbe1a
6edeac04: deadbe1a
6edeac08: ef56df0d
6edeac10: f7ab6f86
6edeac20: fdeadbe1
6edeac80: ffffdead
6edead00: deadbe1a
6edead08: 00000000
6edead10: 80000000
6edead20: a0000000
6edead40: 1a000000
6edead80: be1a0000
6edeadbe: 5b7c3400

That'd be an arithmetic (signed) shift right by a *signed* 6-bit immediate in bits 3-8 (with negative values effectively resulting in shifts left). What are the other bits for, then? Let's mess with them.

6ede81be: 5b7c3400
6ede81b8: 5b7c3400
6edebfbc: 5b7c3400
6edebfbf: 5b7c3400
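
As a sanity check, the shift behaviour seen so far can be written down as a small model. This is only a sketch of what the results suggest, not a confirmed description of the hardware: the amount is taken from bits 3-8 (inclusive) as a signed 6-bit value, everything else in the low 14 bits is ignored, and treating an amount of -32 as "shift left by 0" is an assumption resting on the single 6edead00 data point. The function name is made up.

```python
def sar_imm(value, insn):
    """Model of opcode 0x6e as observed: shift `value` by the signed
    6-bit amount found in bits 3-8 of `insn`."""
    raw = (insn >> 3) & 0x3f
    amount = raw - 0x40 if raw & 0x20 else raw  # sign-extend 6 bits
    if amount >= 0:
        # Arithmetic shift right: reinterpret value as signed 32-bit first.
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & 0xffffffff
    # Negative amount: shift left. Wrapping -32 to 0 is an assumption.
    return (value << (-amount & 31)) & 0xffffffff

print(f"{sar_imm(0xdeadbe1a, 0x6edeadbe):08x}")
```

With $a26 = 0xdeadbe1a this reproduces every row of the shift-amount table as well as the four results just listed.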

So they don't affect the result. It's likely the instructions set some sort of condition flags, however. Let's look for them.

6ede81b8:
0000f680: 000080fc 00008000 00008000 00008000
6edebfbf:
0000f680: 00008000 00008000 00008000 00008000

There they are, apparently. For some reason there appear to be four of them, even! We'll call them $c0-$c3. Let's try some more combinations.

6ede81b8: 000080fc 00008000 00008000 00008000
6ede81b9: 00008000 000080fc 00008000 00008000
6ede81ba: 00008000 00008000 000080fc 00008000
6ede81bb: 00008000 00008000 00008000 000080fc
6ede81bc: 00008000 00008000 00008000 00008000
6ede81bd: 00008000 00008000 00008000 00008000
6ede81be: 00008000 00008000 00008000 00008000
6ede81bf: 00008000 00008000 00008000 00008000

At this point, one thing is obvious: bits 0-1 select the $c destination, while bit 2 likely controls the $c write (the write only happens when it's clear). Let's have a shot at figuring out the $c bitfields. There are 6 bits set, which is more than the usual 4: zero flag, sign flag, carry flag, signed overflow flag.

First, the zero flag, sign flag, and any other flags that get set depending only on the arithmetic result (as opposed to eg. the carry-out flag). Let's write various values to $a0 and run opcode 0x6e000000 (effectively a nop with flag setting).

Results:

00000000: 00008002
00000001: 00008000

So far, so good. Bit 1 is the zero flag.

ffffffff: 000080f5

Here is where it gets complicated.

80000000: 00008001

Ok, if anything is a sign flag, it's bit 0.

aaaaaaaa: 00008065

Strange. Let's tread carefully.

00000002: 00008000
00000003: 00008000
00000004: 00008000
00000008: 00008000
00000010: 00008000
00000020: 00008000
00000040: 00008000
00000080: 00008000
00000100: 00008000
00000200: 00008000
0000dead: 00008000
0000ffff: 00008000

Hm, nothing.

000fffff: 000080c4
00ffffff: 000080f4
0fffffff: 000080f4
7fffffff: 000080f4
deadbeef: 000080e5

Bits 16-23?

ff00ffff: 00008001
00010000: 00008000
00020000: 00008000
00040000: 00008080
00080000: 00008044
00100000: 00008010
00200000: 00008020
00400000: 00008000
00800000: 00008000
5555aaaa: 00008090

I'm not sure I want to know how it works anymore... bits 2 and 6 are set to bit 19, bit 4 to bit 20, bit 5 to bit 21, bit 7 to bit 18. Maybe it'll make more sense later.
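
Weird or not, the mapping is at least easy to write down. Here's a minimal sketch of the result-dependent $c bits as observed with 0x6e000000; it says nothing about bit 3 or about what the bits actually mean, and the constant 0x8000 is only there because bit 15 was set in every dump so far. The function name is made up.

```python
def result_flags(result):
    """$c value observed after 0x6e000000, as a function of the result."""
    flags = 0x8000                   # bit 15: set in every dump so far
    if result & 0x80000000:
        flags |= 0x01                # bit 0: sign
    if result == 0:
        flags |= 0x02                # bit 1: zero
    if result & (1 << 19):
        flags |= 0x44                # bits 2 and 6 follow result bit 19
    if result & (1 << 20):
        flags |= 0x10                # bit 4 follows result bit 20
    if result & (1 << 21):
        flags |= 0x20                # bit 5 follows result bit 21
    if result & (1 << 18):
        flags |= 0x80                # bit 7 follows result bit 18
    return flags

print(f"{result_flags(0xdeadbeef):08x}")
```

This reproduces all of the test values listed above.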

The last flag to find is the carry flag, commonly used for the shifted-out bit on shift instructions. Easy, let's change the $a0 value to 0xdeadbeee and 0xdeadbeef, and the instruction to 0x6e000008.

deadbeee >> 1: ef56df77, flags 00008099
deadbeef >> 1: ef56df77, flags 00008099

No luck, but let's see what that bit 3 is about.

deadbee0 >> 1: ef56df70, flags 00008099
dead0000 >> 1: ef568000, flags 00008099
00ad0000 >> 1: 00568000, flags 00008098
00a00000 >> 1: 00500000, flags 00008018
00200000 >> 1: 00100000, flags 00008018
00400000 >> 1: 00200000, flags 00008020
00800000 >> 1: 00400000, flags 00008000

Screw it. I'm definitely leaving it for later.

This leaves bits 9-13. Lots of testing reveals no changes. Let's see the stats in real microcode:

mwk@nightmare ~/microcode/vp1/nv50 $ cat p* | grep 0x6e | sort | uniq | envydis -m vp1 -w
00000000: 6e000007 sar $a0 $a0 0
00000000: 6e0000c7 sar $a0 $a0 0x18
00000000: 6e0000e7 sar $a0 $a0 0x1c
00000000: 6e003ff7 sar $a0 $a0 -0x2 [unknown: 00003e00]
00000001: 6e008007 sar $a0 $a2 0
00000001: 6e00bf47 sar $a0 $a2 -0x18 [unknown: 00003e00]
00000001: 6e00bfdf sar $a0 $a2 -0x5 [unknown: 00003e00]
00000001: 6e00bfe7 sar $a0 $a2 -0x4 [unknown: 00003e00]
00000002: 6e00bfef sar $a0 $a2 -0x3 [unknown: 00003e00]
00000002: 6e00bff7 sar $a0 $a2 -0x2 [unknown: 00003e00]
00000002: 6e00bfff sar $a0 $a2 -0x1 [unknown: 00003e00]
00000002: 6e0100c7 sar $a0 $a4 0x18
00000003: 6e027fdf sar $a0 $a9 -0x5 [unknown: 00003e00]
00000003: 6e084007 sar $a1 $a1 0
00000003: 6e08407f sar $a1 $a1 0xf
00000003: 6e087f17 sar $a1 $a1 -0x1e [unknown: 00003e00]
00000004: 6e087f8f sar $a1 $a1 -0xf [unknown: 00003e00]
00000004: 6e087ff7 sar $a1 $a1 -0x2 [unknown: 00003e00]
00000004: 6e0940c7 sar $a1 $a5 0x18
00000004: 6e1080c7 sar $a2 $a2 0x18
00000005: 6e10bf87 sar $a2 $a2 -0x10 [unknown: 00003e00]
00000005: 6e10bfc7 sar $a2 $a2 -0x8 [unknown: 00003e00]
00000005: 6e10bff7 sar $a2 $a2 -0x2 [unknown: 00003e00]
00000005: 6e10bfff sar $a2 $a2 -0x1 [unknown: 00003e00]
00000006: 6e1180c7 sar $a2 $a6 0x18
00000006: 6e13002f sar $a2 $a12 0x5
00000006: 6e157fd7 sar $a2 $a21 -0x6 [unknown: 00003e00]
00000006: 6e15ffd7 sar $a2 $a23 -0x6 [unknown: 00003e00]
00000007: 6e18c087 sar $a3 $a3 0x10
00000007: 6e18c0c7 sar $a3 $a3 0x18
00000007: 6e18fff7 sar $a3 $a3 -0x2 [unknown: 00003e00]
00000007: 6e18ffff sar $a3 $a3 -0x1 [unknown: 00003e00]
00000008: 6e1b402f sar $a3 $a13 0x5
00000008: 6e210007 sar $a4 $a4 0
00000008: 6e314037 sar $a6 $a5 0x6
00000008: 6e318007 sar $a6 $a6 0
00000009: 6e31800f sar $a6 $a6 0x1
00000009: 6e318037 sar $a6 $a6 0x6
00000009: 6e318047 sar $a6 $a6 0x8
00000009: 6e318087 sar $a6 $a6 0x10
0000000a: 6e31bfdf sar $a6 $a6 -0x5 [unknown: 00003e00]
0000000a: 6e33c037 sar $a6 $a15 0x6
0000000a: 6e387f17 sar $a7 $a1 -0x1e [unknown: 00003e00]
0000000a: 6e39800f sar $a7 $a6 0x1
0000000b: 6e39bfff sar $a7 $a6 -0x1 [unknown: 00003e00]
0000000b: 6e39c087 sar $a7 $a7 0x10
0000000b: 6e39c0a7 sar $a7 $a7 0x14
0000000b: 6e39c0c7 sar $a7 $a7 0x18
0000000c: 6e39c0e7 sar $a7 $a7 0x1c
0000000c: 6e39ffdf sar $a7 $a7 -0x5 [unknown: 00003e00]
0000000c: 6e3b3fc7 sar $a7 $a12 -0x8 [unknown: 00003e00]
0000000c: 6e41800f sar $a8 $a6 0x1
0000000d: 6e41ff47 sar $a8 $a7 -0x18 [unknown: 00003e00]
0000000d: 6e420047 sar $a8 $a8 0x8
0000000d: 6e423f87 sar $a8 $a8 -0x10 [unknown: 00003e00]
0000000d: 6e423ff7 sar $a8 $a8 -0x2 [unknown: 00003e00]
0000000e: 6e433fc7 sar $a8 $a12 -0x8 [unknown: 00003e00]
0000000e: 6e44ffc7 sar $a8 $a19 -0x8 [unknown: 00003e00]
0000000e: 6e457fc7 sar $a8 $a21 -0x8 [unknown: 00003e00]
0000000e: 6e45ffc7 sar $a8 $a23 -0x8 [unknown: 00003e00]
[...]

... and so on. So left shifts have 11111 in this bitfield, right shifts have 0. Ok, we'll try both left and right shifts now, separately.

No, still nothing.

Enough for the shifts, let's try some other common-looking opcodes. 0x62, 0x6a, 0x6b, 0x6c, 0x75 have about equal statistics. Let's begin with 0x62, then. 0x62deadbe:

0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 deadbe16 deadbe17
0000f7e0: deadbe18 deadbe19 deadbe1a deadbc12
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

Strange enough. Destination works, but who knows what the operation or sources were. Let's write 0 to $a26:

0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 deadbe16 deadbe17
0000f7e0: deadbe18 deadbe19 00000000 00000000
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

... and to $a22:

0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 00000000 deadbe17
0000f7e0: deadbe18 deadbe19 deadbe1a deadbc12
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

One source again? Let's check for $c writes too. 0x62deadba:

0000f680: 00008000 00008000 000080e4 00008000

Sure enough. Ok, let's try varying the source value now:

deadbeef -> deadbca7 flags 000080e4
00000000 -> 00000000 flags 00008002
ffffffff -> fffffdb7 flags 000080f4

Looks like a bitwise and operation with 0xfffffdb7, which is both the sign-extension and the one-extension of bits 3-13. Let's just verify it's a sign-extension and not a one-extension, changing the op to 0x62000db8:

ffffffff -> 000001b7 flags 00008000

Okay, that's it. This explains the unknown bits in the shift instruction too - the signed immediate field is likely the same size in both instructions, the shift just happens to ignore the higher bits. But we can also notice that the and instruction doesn't set the sign flag for some reason. Not good.
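
Putting the fields found so far together, the instruction format can be sketched as a small decoder. This is a guess at the layout based purely on the experiments above: destination in bits 19-23, source in bits 14-18, an 11-bit signed immediate in bits 3-13 (inclusive), and the $c selector plus write control in bits 0-2. The function and key names are made up.

```python
def sign_extend(value, bits):
    """Sign-extend a `bits`-wide field to a Python int."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def decode_a_imm(insn):
    """Split a 32-bit word into the fields inferred so far."""
    return {
        "opcode": (insn >> 24) & 0xff,
        "dst": (insn >> 19) & 0x1f,                      # $a destination
        "src": (insn >> 14) & 0x1f,                      # $a source
        "simm11": sign_extend((insn >> 3) & 0x7ff, 11),  # bits 3-13
        "c_write": not (insn & 4),   # observed: $c written when bit 2 is clear
        "c_dst": insn & 3,           # which of $c0-$c3
    }

fields = decode_a_imm(0x62deadbe)
print(fields["dst"], fields["src"], hex(fields["simm11"] & 0xffffffff))
```

For 0x62deadbe this gives $a27, $a26 and an immediate of 0xfffffdb7, matching the and mask seen above.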

Moving on, 0x6a. 0x6adeadbe:

0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 deadbe16 deadbe17
0000f7e0: deadbe18 deadbe19 deadbe1a deadbe1b
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

No change... Let's try with $c write, 0x6adeadba:

0000f680: 00008000 00008000 00008000 00008000

Nothing. A likely explanation is that 0x6a is the store instruction. Let's try 0x6b instead. 0x6bdeadba:

0000f680: 00008000 00008000 00008000 00008000
...
0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 deadbe16 deadbe17
0000f7e0: deadbe18 deadbe19 deadbe1a 00000000
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

A zero. Let's try zeroing out both $a26 and $a22:

0000f680: 00008000 00008000 00008000 00008000
...
0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 00000000 deadbe17
0000f7e0: deadbe18 deadbe19 00000000 00000000
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

Um, 1 and 2?

0000f780: deadbe00 deadbe01 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 00000001 deadbe17
0000f7e0: deadbe18 deadbe19 00000002 00000000
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

Maybe the unknown bits are disturbing the operation? Let's try 0x6b004000 on arguments 1, 2:

0000f680: 00008000 00008000 00008000 00008000
...
0000f780: deadbea0 00000002 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 deadbe16 deadbe17
0000f7e0: deadbe18 deadbe19 deadbe1a deadbe1b
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

Ok, something happened. Note that the test program writes deadbea0 to component 1 of [suspected] $v0. Let's modify the component to something else to verify:

0000f000: deadbe80 12345678 deadbec0 deadbee0
[...]
0000f780: 12345678 00000002 deadbe02 deadbe03
0000f790: deadbe04 deadbe05 deadbe06 deadbe07
0000f7a0: deadbe08 deadbe09 deadbe0a deadbe0b
0000f7b0: deadbe0c deadbe0d deadbe0e deadbe0f
0000f7c0: deadbe10 deadbe11 deadbe12 deadbe13
0000f7d0: deadbe14 deadbe15 deadbe16 deadbe17
0000f7e0: deadbe18 deadbe19 deadbe1a deadbe1b
0000f7f0: deadbe1c deadbe1d deadbe1e 00000000

Good enough, I guess. So what if I change the source regs?

$a0 $a1
00000001 00000002 -> deadbea0
00000003 00000002 -> deadbea0
00000003 00000003 -> deadbea0
deadbeef deadbeef -> deadbea0

Guess they don't select $a regs, then. Let's try changing some bits in the instruction instead.

0x6b000000: deadbe80
0x6b004000: deadbea0
0x6b008000: deadbec0
0x6b00c000: deadbee0
0x6b010000: deadbe81
0x6b020000: deadbe82
0x6b040000: deadbe84

Ah, so I got it wrong... 0xf000:0xf080 is component 0 of $v0-$v31, 0xf080:0xf100 is component 1, and so on.

0x6b002000: deadbe80
0x6b001000: deadbe80
0x6b000800: deadbe80
0x6b000400: deadbe80
0x6b000200: deadbe80
0x6b000100: deadbe80
0x6b000080: deadbe00
0x6b000040: 00000000
0x6b000020: deadbe00
0x6b000010: deadbe90
0x6b000008: deadbe88
0x6b000004: deadbe80
0x6b000002: deadbe80
0x6b000001: deadbe80

Bits 3-4 are clearly component selection. Bits 5-7 are more interesting. Bits 5 and 7 cause the register to keep its old value. Either they disable the write entirely, or cause the old value to be written back [ie. mov $a0 $a0]. By changing the destination register, we learn it's the former.

How about bit 6?

0x6b000040: 00000000
0x6b07ff5f: 00000000
0x6b040040: 00000000
0x6b020040: 00000000
0x6b010040: 00000000
0x6b008040: 00000000
0x6b004040: 00000000
0x6b002040: 00000000
0x6b001040: 00000000
0x6b000840: 00000000
0x6b000440: 00000000
0x6b000240: 00000000
0x6b000140: 00000000
0x6b0000c0: 18e5e4f1
0x6b000060: deadbe20
0x6b000050: 00000000
0x6b000048: 00000000
0x6b000044: 00000000
0x6b000042: 00000000
0x6b000041: 00000000

Ok, looks like we managed to read $r too. So 0x6b seems to be a more general "read something" instruction. A load instruction. Let's try some other values at bits 5-7:

0x6b000000: deadbe80
0x6b000020: deadbe00
0x6b000040: 00000000
0x6b000060: deadbe20
0x6b000080: deadbe00
0x6b0000a0: deadbe40
0x6b0000c0: 18e5e4f1
0x6b0000e0: deadbe00

Seems they're source selection, with values 001, 100, 111 being invalid. 000 is $v, 010 seems to be something initialised to 0, 011 is $r, 101 is whatever is at 0xfa00+, 110 is something entirely unknown [that's not cleared by a reset]. First, let's mess a bit with 011:

0x6b000060: deadbe20
0x6b004060: deadbe21
0x6b008060: deadbe22
0x6b004068: 00008000
0x6b008068: 00008000
0x6b010068: 00000000
0x6b020068: 00000000
0x6b040068: 00000000
0x6b000070: deadbe00
0x6b004070: deadbe00
0x6b000078: deadbe00
0x6b004078: deadbe00

Nice, there's a $c read there too. Maybe it's in fact worth checking all bit 3-7 combinations:

0x6b000000: deadbe80
0x6b000008: deadbe88
0x6b000010: deadbe90
0x6b000018: deadbe98
0x6b000020: deadbe00
0x6b000028: deadbe00
0x6b000030: deadbe00
0x6b000038: deadbe00
0x6b000040: 00000000
0x6b000048: 00000000
0x6b000050: 00000000
0x6b000058: 00000000
0x6b000060: deadbe20
0x6b000068: 00008000
0x6b000070: deadbe00
0x6b000078: deadbe00
0x6b000080: deadbe00
0x6b000088: deadbe00
0x6b000090: deadbe00
0x6b000098: deadbe00
0x6b0000a0: deadbe40
0x6b0000a8: deadbe60
0x6b0000b0: 00012204
0x6b0000b8: 00000000
0x6b0000c0: 18e5e4f1
0x6b0000c8: deadbe00
0x6b0000d0: deadbe00
0x6b0000d8: deadbe00
0x6b0000e0: deadbe00
0x6b0000e8: deadbe00
0x6b0000f0: deadbe00
0x6b0000f8: deadbe00

Not that many are valid. 0x12204 can be matched to the value that MMIO reg 0xfb00 has, so that's what it probably loads. As a side note, I suspect 0xfb00+ registers to be the DMA object slots - they have the right amount of bits for that [16-bit instance + 1-bit valid flag], and they got introduced on NV50.

We can notice that the selectors are suspiciously close to the MMIO addresses that the values are available at. That's not a perfect correspondence, though.

One thing I want to get out of the way is the exact addressing of $v registers: am I really selecting the components with b3-4 and index by b14-18, or was my initial guess about MMIO addresses correct? Let's try out the instructions I suspected to be $v/$r clears a few days ago:

0xad07c000:

0000f000: 00000000 deadbea0 deadbec0 deadbee0
0000f010: deadbe81 deadbea1 deadbec1 deadbee1
0000f020: deadbe82 deadbea2 deadbec2 deadbee2
0000f030: deadbe83 deadbea3 deadbec3 deadbee3
0000f040: deadbe84 deadbea4 deadbec4 deadbee4
0000f050: deadbe85 deadbea5 deadbec5 deadbee5
0000f060: deadbe86 deadbea6 deadbec6 deadbee6
0000f070: deadbe87 deadbea7 deadbec7 deadbee7
0000f080: 00000000 deadbea8 deadbec8 deadbee8
0000f090: deadbe89 deadbea9 deadbec9 deadbee9
0000f0a0: deadbe8a deadbeaa deadbeca deadbeea
0000f0b0: deadbe8b deadbeab deadbecb deadbeeb
0000f0c0: deadbe8c deadbeac deadbecc deadbeec
0000f0d0: deadbe8d deadbead deadbecd deadbeed
0000f0e0: deadbe8e deadbeae deadbece deadbeee
0000f0f0: deadbe8f deadbeaf deadbecf deadbeef
0000f100: 00000000 deadbeb0 deadbed0 deadbef0
0000f110: deadbe91 deadbeb1 deadbed1 deadbef1
0000f120: deadbe92 deadbeb2 deadbed2 deadbef2
0000f130: deadbe93 deadbeb3 deadbed3 deadbef3
0000f140: deadbe94 deadbeb4 deadbed4 deadbef4
0000f150: deadbe95 deadbeb5 deadbed5 deadbef5
0000f160: deadbe96 deadbeb6 deadbed6 deadbef6
0000f170: deadbe97 deadbeb7 deadbed7 deadbef7
0000f180: 00000000 deadbeb8 deadbed8 deadbef8
0000f190: deadbe99 deadbeb9 deadbed9 deadbef9
0000f1a0: deadbe9a deadbeba deadbeda deadbefa
0000f1b0: deadbe9b deadbebb deadbedb deadbefb
0000f1c0: deadbe9c deadbebc deadbedc deadbefc
0000f1d0: deadbe9d deadbebd deadbedd deadbefd
0000f1e0: deadbe9e deadbebe deadbede deadbefe
0000f1f0: deadbe9f deadbebf deadbedf deadbeff

0xcb0801c0:

0000f600: deadbe20 bd5b7c40 deadbe22 deadbe23
0000f610: deadbe24 deadbe25 deadbe26 deadbe27
0000f620: deadbe28 deadbe29 deadbe2a deadbe2b
0000f630: deadbe2c deadbe2d deadbe2e deadbe2f
0000f640: deadbe30 deadbe31 deadbe32 deadbe33
0000f650: deadbe34 deadbe35 deadbe36 deadbe37
0000f660: deadbe38 deadbe39 deadbe3a deadbe3b
0000f670: deadbe3c deadbe3d deadbe3e deadbe3f
0000f680: 00008100 00008000 00008000 00008000

The notable things in here:

  1. Bits 3-4 are indeed the component, bits 14-18 the index
  2. Opcodes 0xa... are likely $v opcodes, 0xc... are $r opcodes
  3. $r opcodes apparently affect $c bit 8, which we haven't seen used yet. Perhaps the $c registers are split into several parts, for $a/$r/$v?

Since 0x6b is getting quite complicated and touches a lot of things we don't know yet, perhaps it's a good idea to move on to opcode 0x6c for now.

6cdeadbe: deadbbd1

A value fairly close to our initial $a values. Since add is a very common opcode we don't know yet, it's a good idea to check for it first. The value is wrong for a $a+$a opcode, but about right for $a+simm11. Let's see:

0xdeadbbd1 - 0xdeadbe1a == -0x249 == 0xfffffdb7

... which is exactly what we have in the "signed immediate" bitfield. Nice. Let's check the $c behavior now.

0xdeadbeef+0xfffffdb7: deadbbd1 flags 000080e5
0x00000000+0x00000000: 00000000 flags 00008002
0x00000001+0x00000000: 00000001 flags 00008000
0xffffffff+0x00000000: ffffffff flags 000080f5
0x80000000+0x00000000: 80000000 flags 00008001
0x7fffffff+0x00000001: 80000000 flags 00008009
0xffffffff+0x00000001: 00000000 flags 0000800a
0xffffffff+0xffffffff: fffffffe flags 000080f5
0x80000000+0xffffffff: 7fffffff flags 000080fc
0x00000000+0xffffffff: ffffffff flags 000080fd
0x00010000+0xffffffff: 0000ffff flags 00008000
0x01000000+0xffffffff: 00ffffff flags 000080fc
0x00100000+0xffffffff: 000fffff flags 000080cc
0x00040000+0xffffffff: 0003ffff flags 00008000
0x00080000+0xffffffff: 0007ffff flags 00008080
0x000fffff+0xffffffff: 000ffffe flags 000080cc
0x000fffff+0xffffffff: 000ffffe flags 000080c4
0x000fffff+0x00000000: 000fffff flags 000080c4

... well, damn. No sane carry flag. No sane overflow flag. Though bit 3 looks like it could be some sort of carry/overflow around bit 19 of the result, but it's really too crappy for me to justify REing precisely right now.

On to 0x75. 0x75deadbe gives 0xadbebe1b. Setting both $a22 and $a26 to 0 doesn't change that. Setting $a27 to 0 changes it to 0xadbe0000. Sounds obvious - it's the sethi instruction. A few tests on various values verify that.
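
A minimal sketch of what sethi appears to do, going by the two results above: the low 16 bits of the instruction replace the high half of the destination, and the low half is kept. That the immediate is exactly bits 0-15 is an inference from those results, and the function name is made up.

```python
def sethi(old_value, insn):
    """Replace the high 16 bits of old_value with the low 16 bits of insn."""
    imm16 = insn & 0xffff
    return (imm16 << 16) | (old_value & 0xffff)

print(f"{sethi(0xdeadbe1b, 0x75deadbe):08x}")
print(f"{sethi(0x00000000, 0x75deadbe):08x}")
```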

Let's try some less common opcodes. Next most common one is 0x7e. 0x7edeadbe gives 0x5b7c3400. Setting $a26 to 0 gives a zero result. Setting $a22 to 0 doesn't change the results. Using opcode 0x7e000022 on 0xdeadbe00 results in 0x0deadbe0. I'm just going to guess it's the same opcode as 0x6e, except it does a logical (unsigned) shift.

0x68. 0x68deadbe gives 0xdeadbe1a. 0x68deadbe with $a26 set to 0 gives 0xfffffdb7. Changing $a22 doesn't affect the result. Bits 0-2 select $c output like in the previous instructions. So, another instruction of $a, $c, $a, simm11 form.

0x00000000 op 0x00000000: 0x00000000 flags 00008002
0x00000000 op 0x00000001: 0x00000000 flags 00008002
0x00000001 op 0x00000000: 0x00000000 flags 00008002
0x00000001 op 0x00000001: 0x00000001 flags 00008000
0x000000aa op 0x00000055: 0x00000055 flags 00008000
0x00000055 op 0x000000aa: 0x00000055 flags 00008000
0x00000234 op 0x00000123: 0x00000123 flags 00008000
0x00000122 op 0x00000123: 0x00000122 flags 00008000
0x00000123 op 0x00000123: 0x00000123 flags 00008000
0x00000124 op 0x00000123: 0x00000123 flags 00008000
0xffff0000 op 0x00000123: 0xffff0000 flags 000080f5

Clearly a signed min operation.

0x64. Again, we easily conclude a $a, $c, $a, simm11 form after a few tests.

0x00000000 op 0x00000000: 0x00000000 flags 00008002
0x00000001 op 0x00000000: 0x00000001 flags 00008000
0x00000000 op 0x00000001: 0x00000001 flags 00008000
0x00000001 op 0x00000001: 0x00000001 flags 00008000
0x00000456 op 0x00000123: 0x00000577 flags 00008000

Ok, an or instruction.

Let's try 0x63 now, it might be the xor instruction... it has opcode close to and/or. Sure enough:

0x00000000 op 0x00000000: 0x00000000 flags 00008002
0x00000001 op 0x00000000: 0x00000001 flags 00008000
0x00000000 op 0x00000001: 0x00000001 flags 00008000
0x00000001 op 0x00000001: 0x00000000 flags 00008002
0x00000456 op 0x00000123: 0x00000575 flags 00008000

0x69. Same instruction format again.

0x00000000 op 0x00000000: 0x00000000 flags 00008002
0x00000000 op 0x00000001: 0x00000001 flags 00008000
0x00000001 op 0x00000000: 0x00000001 flags 00008000
0x00000001 op 0x00000001: 0x00000001 flags 00008000
0x00000001 op 0x00000002: 0x00000002 flags 00008000
0x000000aa op 0x00000055: 0x000000aa flags 00008000
0x00000055 op 0x000000aa: 0x000000aa flags 00008000
0x00000234 op 0x00000123: 0x00000234 flags 00008000
0x00000122 op 0x00000123: 0x00000123 flags 00008000
0x00000123 op 0x00000123: 0x00000123 flags 00008000
0x00000124 op 0x00000123: 0x00000124 flags 00008000
0xffff0000 op 0x00000123: 0x00000123 flags 00008008

Signed max.
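
To keep track, here are the $a, simm11 operations identified so far as a tiny model. It only covers the result value; the $c flags are left out, since they don't make sense yet. This is a sketch based on the tables above, with made-up names, not a verified description of the ALU.

```python
MASK = 0xffffffff

def to_signed(x):
    """Reinterpret a 32-bit value as a signed integer."""
    return x - (1 << 32) if x & 0x80000000 else x

# Opcode byte -> operation on (source register, sign-extended immediate).
ALU_OPS = {
    0x62: lambda a, b: a & b,                      # and
    0x63: lambda a, b: a ^ b,                      # xor
    0x64: lambda a, b: a | b,                      # or
    0x68: lambda a, b: min(a, b, key=to_signed),   # signed min
    0x69: lambda a, b: max(a, b, key=to_signed),   # signed max
    0x6c: lambda a, b: (a + b) & MASK,             # add
}

def alu(opcode, a, imm):
    """Apply the modelled operation to two 32-bit values."""
    return ALU_OPS[opcode](a & MASK, imm & MASK)

print(f"{alu(0x68, 0xffff0000, 0x123):08x}")
print(f"{alu(0x69, 0xffff0000, 0x123):08x}")
```

The 0xffff0000 case is the one that tells signed from unsigned: min picks it and max rejects it.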

Time to go back to 0x6a, I suppose. It's highly likely that it's the inverse of 0x6b. Easy enough to check. 0x6a000000:

0000f000: deadbe00 deadbea0 deadbec0 deadbee0
0000f010: deadbe81 deadbea1 deadbec1 deadbee1
0000f020: deadbe82 deadbea2 deadbec2 deadbee2
0000f030: deadbe83 deadbea3 deadbec3 deadbee3
0000f040: deadbe84 deadbea4 deadbec4 deadbee4
[...]

Good. To check the source/destination bitfields, let's try 0x6a080000.

0000f000: deadbe80 deadbe00 deadbec0 deadbee0
0000f010: deadbe81 deadbea1 deadbec1 deadbee1
0000f020: deadbe82 deadbea2 deadbec2 deadbee2
0000f030: deadbe83 deadbea3 deadbec3 deadbee3
[...]

So destination is still at 19-23, source at 14-18. Good. Let's try to write other stuff, too. 0x6a004060:

0000f600: deadbe01 deadbe21 deadbe22 deadbe23

$r works. 0x6a004068:

0000f680: 00008000 00008000 00008000 00008000

Writing $c file apparently doesn't work, just like for MMIO. Too bad. 0x6a0040a0:

0000fa00: deadbe01 deadbe41 deadbe42 deadbe43

0x6a0040a8:

0000fa80: deadbe01 deadbe61 deadbe62 deadbe63

0x6a0040b0:

0000fb00: 0001be01 00012204 00012204 00017391

0x6a0040c0 followed by a corresponding load likewise confirms that writing that space works, with full 32-bit range. So does 0x6a0040b8. As for 010XX spaces... hmm, they look irregular.

Change of plans, I'll just scan the *whole* space - all combinations of bits 3-7 and the source/destination fields. Also, since we found a correspondence between MMIO addresses and the selector/register pair, let's print the corresponding MMIO reg while we're at it. The first big column is the initial value, the second column is the value after writing 0xffffffff, the third column is the value after writing 0. The value before the slash is obtained by an in-program read, the value after the slash by an MMIO read.

Here comes the test code: test2.c

6a000000/f000: deadbe80/deadbe80 ffffffff/ffffffff 00000000/00000000
6a080000/f004: deadbea0/deadbea0 ffffffff/ffffffff 00000000/00000000
6a100000/f008: deadbec0/deadbec0 ffffffff/ffffffff 00000000/00000000
6a180000/f00c: deadbee0/deadbee0 ffffffff/ffffffff 00000000/00000000
[...]
6ad80018/f1ec: deadbefe/deadbefe ffffffff/ffffffff 00000000/00000000
6ae00018/f1f0: deadbe9f/deadbe9f ffffffff/ffffffff 00000000/00000000
6ae80018/f1f4: deadbebf/deadbebf ffffffff/ffffffff 00000000/00000000
6af00018/f1f8: deadbedf/deadbedf ffffffff/ffffffff 00000000/00000000
6af80018/f1fc: deadbeff/deadbeff ffffffff/ffffffff 00000000/00000000

Vector registers. Nothing to see here.
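
The MMIO address printed next to each store instruction in this scan follows directly from the instruction fields. A minimal sketch of that mapping, inferred only from the instruction/address pairs in the scan output (the function name is made up, and this is not an excerpt from test2.c):

```python
def store_target_mmio(insn):
    """MMIO address matching the target of a 0x6a store, as seen in the scan."""
    selector = (insn >> 3) & 0x1f   # bits 3-7: space/component selector
    reg = (insn >> 19) & 0x1f       # bits 19-23: register index
    return 0xf000 + selector * 0x80 + reg * 4

for insn in (0x6a000000, 0x6a080000, 0x6ad80018, 0x6af80018):
    print(f"{insn:08x}/{store_target_mmio(insn):04x}")
```

So each selector value covers a 0x80-byte window of 32 registers, which is why the component number ends up as the major index for $v.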

6a000020/f200: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a080020/f204: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a100020/f208: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a180020/f20c: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
[...]
6ae00038/f3f0: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6ae80038/f3f4: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6af00038/f3f8: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6af80038/f3fc: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000

Just invalid.

6a000040/f400: 00000000/00000000 00000000/00000000 00000000/00000000
6a080040/f404: 00000000/00000000 00000000/00000000 00000000/00000000
6a100040/f408: 80001000/80001000 80001000/80001000 80001000/80001000
6a180040/f40c: 00000000/00000000 ffffffff/ffffffff 00000000/00000000
6a200040/f410: 00000000/00000000 00000003/00000003 00000000/00000000
6a280040/f414: 00000000/00000000 00000000/00000000 00000000/00000000
6a300040/f418: 00000000/00000000 00000000/00000000 00000000/00000000
6a380040/f41c: 00000000/ffffffff 00000000/ffffffff 00000000/ffffffff
6a400040/f420: 00000000/00000001 00000000/00000001 00000000/00000001
6a480040/f424: 00000000/000051d2 deadbe00/0000ffff deadbe00/00000000
6a500040/f428: 00000000/00000000 00000000/00000000 00000000/00000000
6a580040/f42c: 00000004/00000004 00000004/00000004 00000004/00000004
6a600040/f430: 00000000/00000044 00000000/00000044 00000000/00000044
6a680040/f434: 00001000/00001000 00001000/00001000 00001000/00001000
6a700040/f438: 00000000/00000111 00000000/00000111 00000000/00000111
6a780040/f43c: 00000000/00000000 00000000/00000000 00000000/00000000
6a800040/f440: 00000000/00000000 00000000/ffffffff 00000000/00000000
6a880040/f444: 00000000/c0000000 00000000/c0000000 00000000/c0000000
6a900040/f448: 00000000/df000000 00000000/df000000 00000000/df000000
6a980040/f44c: 00000000/4f000000 00000000/4f000000 00000000/4f000000
6aa00040/f450: 00000000/bf000000 00000000/bf000000 00000000/bf000000
6aa80040/f454: 00000000/ef000000 00000000/ef000000 00000000/ef000000
6ab00040/f458: 00000000/00000000 00000000/00000000 00000000/00000000
6ab80040/f45c: 00000000/00000000 00000000/00000000 00000000/00000000
6ac00040/f460: 00000000/00000000 00000000/00000000 00000000/00000000
6ac80040/f464: 00000000/01200000 00000000/01200000 00000000/01200000
6ad00040/f468: 00000000/01203000 00000000/01203000 00000000/01203000
6ad80040/f46c: 00000000/01202000 00000000/01202000 00000000/01202000
6ae00040/f470: 01202000/01202000 01202000/01202000 01202000/01202000
6ae80040/f474: 00000000/00000011 00000000/00000011 00000000/00000011
6af00040/f478: 01563505/016020b6 00000003/0009eb62 00000004/0009eafa
6af80040/f47c: 00000000/00000000 00000000/00000000 00000000/00000000

Compare this with the MMIO scan:

...
00f404: 00000000 ffffffff 00000000 *
00f408: 00000000 ffffffff 00000000 *
00f40c: 00000000 ffffffff 00000000 *
00f410: 00000000 00000003 00000000 *
...
00f41c: 00000000 ffffffff 00000000 *
00f420: 00000000 00000001 00000000 *
00f424: 00000000 0000ffff 00000000 *
...
00f430: 00000044 00000044 00000044
...
00f438: 00000111 00000111 00000000 *
...
00f440: 00000000 ffffffff 00000000 *
00f444: 8d110fdc ffffffff 00000000 *
00f448: df000000 ffffffff 00000000 *
00f44c: 4f000000 ffffffff 00000000 *
00f450: bf000000 ffffffff 00000000 *
00f454: ef000000 ffffffff 00000000 *
...
00f464: 00000000 ffffffe3 00000000 *
00f468: 00000000 ffffffe3 00000000 *
00f46c: 00000000 ffffffe3 00000000 *
...
00f474: 00000000 00000011 00000000 *
00f478: 3722c767 0000003d 0000003e *
...

So, f408-f410 are obviously common between ISA and MMIO, though f408 isn't ISA-writable for some reason. f41c-f420 aren't accessible by ISA. f424, which we know to be the exit status reg, is writable by ISA but causes an immediate exit if written - makes sense, sort of. f42c and f434 appear to be RO and readable by both. f430, f438, f440-f46c and f474 are again MMIO-only. f470, which is apparently the current code base, is readable by both. f478, the clock, is RW by both MMIO and ISA, which gives us a way to get an idea of the passage of time inside the VP. It seems about 4 clocks pass between the write and the read, which are 15 words [3 bundles] apart. That gives a throughput of roughly 1 bundle per cycle [3 bundles in 4 clocks], but it may be wrong for a number of reasons. Definitely something to look at later.
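The dump labels pair each 0x6a word with an MMIO-style address (e.g. 6a600040/f430), so the mapping between the two can be sketched in a few lines. This is only what the labels suggest: the bit positions are inferred from the dumps, and the names `mmio_alias`, `make_insn` and `MMIO_BASE` are made up for this illustration.

```python
# Assumed layout, read off labels such as "6a600040/f430":
#   bits 19-23: register index within a 32-word block
#   bits 3-7:   block number, each block covering 0x80 bytes
MMIO_BASE = 0xF000  # assumption: block 0, index 0 lines up with f000


def mmio_alias(insn: int) -> int:
    """Return the address the dumps print next to a 0x6a/0x6b word."""
    index = (insn >> 19) & 0x1F
    block = (insn >> 3) & 0x1F
    return MMIO_BASE + block * 0x80 + index * 4


def make_insn(opcode: int, addr: int) -> int:
    """Inverse direction: build the scan word for a given address."""
    off = addr - MMIO_BASE
    if not 0 <= off < 0x1000 or off % 4:
        raise ValueError("address outside the f000-fffc window")
    block, index = divmod(off // 4, 32)
    return (opcode << 24) | (index << 19) | (block << 3)


if __name__ == "__main__":
    for word in (0x6A600040, 0x6AF80040, 0x6A000048):
        print(f"{word:08x} -> {mmio_alias(word):04x}")
```

The $a register field and the low 3 bits are left at zero here, the same way they appear in the scan words.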

6a000048/f480: 00000000/00000000 00000000/00000000 00000000/00000000
6a080048/f484: 00000000/00000000 00000000/00000000 00000000/00000000
6a100048/f488: 00000000/00000000 00000000/00000000 00000000/00000000
6a180048/f48c: 00000000/00000000 00000000/00000000 00000000/00000000
6a200048/f490: 00000000/00000000 00000000/00000000 00000000/00000000
6a280048/f494: 00000000/00000000 00000000/00000000 00000000/00000000
6a300048/f498: 00000000/00000000 00000000/00000000 00000000/00000000
6a380048/f49c: 00000000/00000000 00000000/00000000 00000000/00000000
6a400048/f4a0: 00000000/00000000 00000000/00000000 00000000/00000000
6a480048/f4a4: 00000000/00000000 00000000/00000000 00000000/00000000
6a500048/f4a8: 00000000/00000000 00000000/00000000 00000000/00000000
6a580048/f4ac: 00000000/00000000 00000000/00000000 00000000/00000000
6a600048/f4b0: 00000000/00000000 00000000/00000000 00000000/00000000
6a680048/f4b4: 00000000/00000000 00000000/00000000 00000000/00000000
6a700048/f4b8: 00000000/00000000 00000000/00000000 00000000/00000000
6a780048/f4bc: 00000000/00000000 00000000/00000000 00000000/00000000
6a800048/f4c0: 00000000/00000000 00000000/00000000 00000000/00000000
6a880048/f4c4: 00000000/00000000 00000000/00000000 00000000/00000000
6a900048/f4c8: 01202000/01202000 ffffffe3/ffffffe3 00000000/00000000
6a980048/f4cc: 00000000/00000000 00000000/00000000 00000000/00000000
6aa00048/f4d0: 00000000/00000000 00000000/00000000 00000000/00000000
6aa80048/f4d4: 00000000/00000000 00000000/00000000 00000000/00000000
6ab00048/f4d8: 00000000/00000000 00000000/00000000 00000000/00000000
6ab80048/f4dc: 00000000/00000000 00000000/00000000 00000000/00000000
6ac00048/f4e0: 00000000/00000000 00000000/00000000 00000000/00000000
6ac80048/f4e4: 00000000/00000000 00000000/00000000 00000000/00000000
6ad00048/f4e8: 00000000/00000000 00000000/00000000 00000000/00000000
6ad80048/f4ec: 00000000/00000000 00000000/00000000 00000000/00000000
6ae00048/f4f0: 00000000/00000000 00000000/00000000 00000000/00000000
6ae80048/f4f4: 00000000/00000000 00000000/00000000 00000000/00000000
6af00048/f4f8: 00000000/00000000 00000000/00000000 00000000/00000000
6af80048/f4fc: 00000000/00000000 00000000/00000000 00000000/00000000

Compared with:

...
00f4c8: 00000000 ffffffe3 00000000 *
...

Well... yeah. Not sure what that reg is, but it's pretty clear it can be accessed by both MMIO and ISA.

6a000050/f500: 00000000/00000000 00000000/00000000 00000000/00000000
6a080050/f504: 00000000/00000000 00000000/00000000 00000000/00000000
6a100050/f508: 00000000/00000000 00000000/00000000 00000000/00000000
6a180050/f50c: 00000000/00000000 00000000/00000000 00000000/00000000
6a200050/f510: df000000/df000000 df000000/df000000 df000000/df000000
6a280050/f514: 4fffffff/4f000000 4fffffff/4f000000 4fffffff/4f000000
6a300050/f518: bf000007/bf000000 bf000007/bf000000 bf000007/bf000000
6a380050/f51c: ef0001ff/ef000000 ef0001ff/ef000000 ef0001ff/ef000000
6a400050/f520: 00000014/0000001c deadbe00/000fffff deadbe00/00000000
6a480050/f524: 00000008/00000008 deadbe00/0000000f 00000002/00000002
6a500050/f528: 00000000/00000000 00000000/00000000 00000000/00000000
6a580050/f52c: 00000000/00000000 0000000f/0000000f 00000000/00000000
6a600050/f530: 00000000/00000000 000fffff/000fffff 00000000/00000000
6a680050/f534: 00000000/00000000 000fffff/000fffff 00000000/00000000
6a700050/f538: 00000000/00000000 000fffff/000fffff 00000000/00000000
6a780050/f53c: 00000000/00000000 000fffff/000fffff 00000000/00000000
6a800050/f540: 00000000/00000000 00000111/00000111 00000000/00000000
6a880050/f544: 00000000/00000000 00000000/00000000 00000000/00000000
6a900050/f548: 00000000/00000000 00000000/00000000 00000000/00000000
6a980050/f54c: 00000000/00000000 00000000/00000000 00000000/00000000
6aa00050/f550: 00000000/00000000 00000000/00000000 00000000/00000000
6aa80050/f554: 00000000/00000000 00000000/00000000 00000000/00000000
6ab00050/f558: 00000000/00000000 00000000/00000000 00000000/00000000
6ab80050/f55c: 00000000/00000000 00000000/00000000 00000000/00000000
6ac00050/f560: 00000000/00000000 00000000/00000000 00000000/00000000
6ac80050/f564: 00000000/00000000 00000000/00000000 00000000/00000000
6ad00050/f568: 00000000/00000000 00000000/00000000 00000000/00000000
6ad80050/f56c: 00000000/00000000 00000000/00000000 00000000/00000000
6ae00050/f570: 00000000/00000000 00000000/00000000 00000000/00000000
6ae80050/f574: 00000000/00000000 00000000/00000000 00000000/00000000
6af00050/f578: 00000000/00000000 00000000/00000000 00000000/00000000
6af80050/f57c: 00000000/00000000 00000000/00000000 00000000/00000000
...
00f510: df000000 df000000 df000000
00f514: 4f000000 4f000000 4f000000
00f518: bf000000 bf000000 bf000000
00f51c: ef000000 ef000000 ef000000
00f520: 00000000 000fffff 00000000 *
00f524: 00000000 0000000f 00000000 *
...
00f52c: 00000000 0000000f 00000000 *
00f530: 00000000 000fffff 00000000 *
00f534: 00000000 000fffff 00000000 *
00f538: 00000000 000fffff 00000000 *
00f53c: 00000000 000fffff 00000000 *
00f540: 00000000 00000111 00000000 *
...

Same here. Since f520 is the PC reg, it's quite reasonable that writing it causes execution to screw up. No idea what f524 is, but it's clearly important as well.

6a000058/f580: 00000000/00000000 0000ffff/0000ffff 00000000/00000000
6a080058/f584: 00000000/00000000 0000ffff/0000ffff 00000000/00000000
6a100058/f588: 00000000/00000000 0000ffff/0000ffff 00000000/00000000
6a180058/f58c: 00000000/00000000 0000ffff/0000ffff 00000000/00000000
6a200058/f590: 00000000/00000000 00000000/00000000 00000000/00000000
6a280058/f594: 00000000/00000000 00000000/00000000 00000000/00000000
6a300058/f598: 00000000/00000000 00000000/00000000 00000000/00000000
6a380058/f59c: 00000000/00000000 00000000/00000000 00000000/00000000
6a400058/f5a0: 00000000/00000000 00000000/00000000 00000000/00000000
6a480058/f5a4: 00000000/00000000 00000000/00000000 00000000/00000000
6a500058/f5a8: 00000000/00000000 00000000/00000000 00000000/00000000
6a580058/f5ac: 00000000/00000000 00000000/00000000 00000000/00000000
6a600058/f5b0: 00000000/00000000 00000000/00000000 00000000/00000000
6a680058/f5b4: 00000000/00000000 00000000/00000000 00000000/00000000
6a700058/f5b8: 00000000/00000000 00000000/00000000 00000000/00000000
6a780058/f5bc: 00000000/00000000 00000000/00000000 00000000/00000000
6a800058/f5c0: 00000000/00000000 00000000/00000000 00000000/00000000
6a880058/f5c4: 00000000/00000000 00000000/00000000 00000000/00000000
6a900058/f5c8: 00000000/00000000 00000000/00000000 00000000/00000000
6a980058/f5cc: 00000000/00000000 00000000/00000000 00000000/00000000
6aa00058/f5d0: 00000000/00000000 00000000/00000000 00000000/00000000
6aa80058/f5d4: 00000000/00000000 00000000/00000000 00000000/00000000
6ab00058/f5d8: 00000000/00000000 00000000/00000000 00000000/00000000
6ab80058/f5dc: 00000000/00000000 00000000/00000000 00000000/00000000
6ac00058/f5e0: 00000000/00000000 00000000/00000000 00000000/00000000
6ac80058/f5e4: 00000000/00000000 00000000/00000000 00000000/00000000
6ad00058/f5e8: 00000000/00000000 00000000/00000000 00000000/00000000
6ad80058/f5ec: 00000000/00000000 00000000/00000000 00000000/00000000
6ae00058/f5f0: 00000000/00000000 00000000/00000000 00000000/00000000
6ae80058/f5f4: 00000000/00000000 00000000/00000000 00000000/00000000
6af00058/f5f8: 00000000/00000000 00000000/00000000 00000000/00000000
6af80058/f5fc: 00000000/00000000 00000000/00000000 00000000/00000000
00f580: 00000000 0000ffff 00000000 *
00f584: 00000000 0000ffff 00000000 *
00f588: 00000000 0000ffff 00000000 *
00f58c: 00000000 0000ffff 00000000 *
...

Same here. The whole above range probably deserves a name though. Let's call them $sr0-$sr127.

6a000060/f600: deadbe20/deadbe20 ffffffff/ffffffff 00000000/00000000
6a080060/f604: deadbe21/deadbe21 ffffffff/ffffffff 00000000/00000000
6a100060/f608: deadbe22/deadbe22 ffffffff/ffffffff 00000000/00000000
6a180060/f60c: deadbe23/deadbe23 ffffffff/ffffffff 00000000/00000000
6a200060/f610: deadbe24/deadbe24 ffffffff/ffffffff 00000000/00000000
[...]
6ae00060/f670: deadbe3c/deadbe3c ffffffff/ffffffff 00000000/00000000
6ae80060/f674: deadbe3d/deadbe3d ffffffff/ffffffff 00000000/00000000
6af00060/f678: deadbe3e/deadbe3e ffffffff/ffffffff 00000000/00000000
6af80060/f67c: deadbe3f/deadbe3f ffffffff/ffffffff 00000000/00000000
6a000068/f680: 00008000/00008000 00008000/00008000 00008000/00008000
6a080068/f684: 00008000/00008000 00008000/00008000 00008000/00008000
6a100068/f688: 00008000/00008000 00008000/00008000 00008000/00008000
6a180068/f68c: 00008000/00008000 00008000/00008000 00008000/00008000
6a200068/f690: 00000000/00000000 00000000/00000000 00000000/00000000
6a280068/f694: 00000000/00000000 00000000/00000000 00000000/00000000
6a300068/f698: 00000000/00000000 00000000/00000000 00000000/00000000
6a380068/f69c: 00000000/00000000 00000000/00000000 00000000/00000000
6a400068/f6a0: 00000000/00000000 00000000/00000000 00000000/00000000
[...]
6ad80078/f7ec: deadbe00/deadbe1b deadbe00/deadbe1b deadbe00/deadbe1b
6ae00078/f7f0: deadbe00/deadbe1c deadbe00/deadbe1c deadbe00/deadbe1c
6ae80078/f7f4: deadbe00/deadbe1d deadbe00/deadbe1d deadbe00/deadbe1d
6af00078/f7f8: deadbe00/deadbe1e deadbe00/deadbe1e deadbe00/deadbe1e

$r, $c. Nothing to say here.

6a000080/f800: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a080080/f804: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a100080/f808: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a180080/f80c: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
[...]
6ae00098/f9f0: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6ae80098/f9f4: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6af00098/f9f8: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6af80098/f9fc: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000

Another invalid range.

6a0000a0/fa00: deadbe40/deadbe40 ffffffff/ffffffff 00000000/00000000
6a0800a0/fa04: deadbe41/deadbe41 ffffffff/ffffffff 00000000/00000000
6a1000a0/fa08: deadbe42/deadbe42 ffffffff/ffffffff 00000000/00000000
6a1800a0/fa0c: deadbe43/deadbe43 ffffffff/ffffffff 00000000/00000000
[...]
6ad800a8/faec: deadbe7b/deadbe7b ffffffff/ffffffff 00000000/00000000
6ae000a8/faf0: deadbe7c/deadbe7c ffffffff/ffffffff 00000000/00000000
6ae800a8/faf4: deadbe7d/deadbe7d ffffffff/ffffffff 00000000/00000000
6af000a8/faf8: deadbe7e/deadbe7e ffffffff/ffffffff 00000000/00000000
6af800a8/fafc: deadbe7f/deadbe7f ffffffff/ffffffff 00000000/00000000

Another register file or what... well, at least it's small. $x0-$x63 for now.

6a0000b0/fb00: 0001be01/0001be01 0001ffff/0001ffff 00000000/00000000
6a0800b0/fb04: 00012204/00012204 0001ffff/0001ffff 00000000/00000000
6a1000b0/fb08: 00012204/00012204 0001ffff/0001ffff 00000000/00000000
6a1800b0/fb0c: 00017391/00017391 0001ffff/0001ffff 00000000/00000000
6a2000b0/fb10: 0000a0bd/0000a0bd 0001ffff/0001ffff 00000000/00000000
6a2800b0/fb14: 0000d72e/0000d72e 0001ffff/0001ffff 00000000/00000000
6a3000b0/fb18: 0001c59f/0001c59f 0001ffff/0001ffff 00000000/00000000
6a3800b0/fb1c: 0001473f/0001473f 0001ffff/0001ffff 00000000/00000000
6a4000b0/fb20: 00000000/00000000 0001ffff/00000000 00000000/00000000
6a4800b0/fb24: 00000000/00000000 0001ffff/00000000 00000000/00000000
6a5000b0/fb28: 00000000/00000000 0001ffff/00000000 00000000/00000000
6a5800b0/fb2c: 00000000/00000000 0001ffff/00000000 00000000/00000000
[...]
6ae000b0/fb70: 00000000/00000000 0001ffff/00000000 00000000/00000000
6ae800b0/fb74: 00000000/00000000 0001ffff/00000000 00000000/00000000
6af000b0/fb78: 00000000/00000000 0001ffff/00000000 00000000/00000000
6af800b0/fb7c: 00000000/00000000 0001ffff/00000000 00000000/00000000

The suspected DMA objects... ok, $d0-$d7. The rest are probably just aliases and can be ignored.

6a0000b8/fb80: 00000000/00000000 00000000/00000000 00000000/00000000
6a0800b8/fb84: 00000000/00000000 00000000/00000000 00000000/00000000
6a1000b8/fb88: 00000000/00000000 00000000/00000000 00000000/00000000
6a1800b8/fb8c: 00000000/00000000 00000000/00000000 00000000/00000000
6a2000b8/fb90: 00000000/00000000 00000000/00000000 00000000/00000000
6a2800b8/fb94: 00000000/00000000 00000000/00000000 00000000/00000000
6a3000b8/fb98: 00000000/00000000 00000000/00000000 00000000/00000000
6a3800b8/fb9c: 00000000/00000000 00000000/00000000 00000000/00000000
6a4000b8/fba0: 00000000/00000000 00000000/00000000 00000000/00000000
6a4800b8/fba4: 00000000/00000000 00000000/00000000 00000000/00000000
6a5000b8/fba8: 00000000/00000000 00000000/00000000 00000000/00000000
6a5800b8/fbac: 00000000/00000000 00000000/00000000 00000000/00000000
6a6000b8/fbb0: 00000000/00000000 00000000/00000000 00000000/00000000
6a6800b8/fbb4: 00000000/00000000 00000000/00000000 00000000/00000000
6a7000b8/fbb8: 00000000/00000000 00000000/00000000 00000000/00000000
6a7800b8/fbbc: 00000000/00000000 00000000/00000000 00000000/00000000
6a8000b8/fbc0: 00000000/00000000 00000000/00000000 00000000/00000000
6a8800b8/fbc4: 00000000/00000000 00000000/00000000 00000000/00000000
6a9000b8/fbc8: 00000000/00000000 00000000/00000000 00000000/00000000
6a9800b8/fbcc: 00000000/00000000 00000000/00000000 00000000/00000000
6aa000b8/fbd0: 00000000/00000000 00000000/00000000 00000000/00000000
6aa800b8/fbd4: 00000000/00000000 00000000/00000000 00000000/00000000
6ab000b8/fbd8: 00000000/00000000 00000000/00000000 00000000/00000000
6ab800b8/fbdc: 00000000/00000000 00000000/00000000 00000000/00000000
6ac000b8/fbe0: 00000000/00000000 00000000/00000000 00000000/00000000
6ac800b8/fbe4: 00000000/00000000 00000000/00000000 00000000/00000000
6ad000b8/fbe8: 00000000/00000000 00000000/00000000 00000000/00000000
6ad800b8/fbec: 00000000/00000000 00000000/00000000 00000000/00000000
6ae000b8/fbf0: 00000000/00000000 00000000/00000000 00000000/00000000
6ae800b8/fbf4: 00000000/00000000 00000000/00000000 00000000/00000000
6af000b8/fbf8: 00000000/00000000 00000000/00000000 00000000/00000000
6af800b8/fbfc: 00000000/00000000 00000000/00000000 00000000/00000000

Strange.

6a0000c0/fc00: 18e5e4f1/00000000 ffffffff/00000000 00000000/00000000
6a0800c0/fc04: deadbe01/00000000 ffffffff/00000000 00000000/00000000
6a1000c0/fc08: dc04ac4b/00000000 ffffffff/00000000 00000000/00000000
6a1800c0/fc0c: d75de592/00000000 ffffffff/00000000 00000000/00000000
6a2000c0/fc10: dd0c5699/00000000 ffffffff/00000000 00000000/00000000
6a2800c0/fc14: 2858df40/00000000 ffffffff/00000000 00000000/00000000
6a3000c0/fc18: 775a8fbb/00000000 ffffffff/00000000 00000000/00000000
6a3800c0/fc1c: 46c77e55/00000000 ffffffff/00000000 00000000/00000000
6a4000c0/fc20: 961983e7/00000000 ffffffff/00000000 00000000/00000000
6a4800c0/fc24: 29973d93/00000000 ffffffff/00000000 00000000/00000000
6a5000c0/fc28: 5638f188/00000000 ffffffff/00000000 00000000/00000000
6a5800c0/fc2c: 37488c0d/00000000 ffffffff/00000000 00000000/00000000
6a6000c0/fc30: 3b050db0/00000000 ffffffff/00000000 00000000/00000000
6a6800c0/fc34: 0b73c171/00000000 ffffffff/00000000 00000000/00000000
6a7000c0/fc38: da501dfc/00000000 ffffffff/00000000 00000000/00000000
6a7800c0/fc3c: 78f95b5a/00000000 ffffffff/00000000 00000000/00000000
6a8000c0/fc40: 00000000/00000000 ffffffff/00000000 00000000/00000000
6a8800c0/fc44: 00000000/00000000 ffffffff/00000000 00000000/00000000
6a9000c0/fc48: 00000000/00000000 ffffffff/00000000 00000000/00000000
6a9800c0/fc4c: 00000000/00000000 ffffffff/00000000 00000000/00000000
6aa000c0/fc50: 00000000/00000000 ffffffff/00000000 00000000/00000000
6aa800c0/fc54: 00000000/00000000 ffffffff/00000000 00000000/00000000
6ab000c0/fc58: 00000000/00000000 ffffffff/00000000 00000000/00000000
6ab800c0/fc5c: 00000000/00000000 ffffffff/00000000 00000000/00000000
6ac000c0/fc60: 00000000/00000000 ffffffff/00000000 00000000/00000000
6ac800c0/fc64: 00000000/00000000 ffffffff/00000000 00000000/00000000
6ad000c0/fc68: 00000000/00000000 ffffffff/00000000 00000000/00000000
6ad800c0/fc6c: 00000000/00000000 ffffffff/00000000 00000000/00000000
6ae000c0/fc70: 00000000/00000000 ffffffff/00000000 00000000/00000000
6ae800c0/fc74: 00000000/00000000 ffffffff/00000000 00000000/00000000
6af000c0/fc78: 00000000/00000000 ffffffff/00000000 00000000/00000000
6af800c0/fc7c: 00000000/00000000 ffffffff/00000000 00000000/00000000

Hmm, rather few of these... $y0-$y15, whatever that is.

6a0000c8/fc80: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a0800c8/fc84: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a1000c8/fc88: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a1800c8/fc8c: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a2000c8/fc90: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6a2800c8/fc94: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
[...]
6ae000f8/fff0: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6ae800f8/fff4: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6af000f8/fff8: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000
6af800f8/fffc: deadbe00/00000000 deadbe00/00000000 deadbe00/00000000

And that'd be it.

Or would it? Let's redo the scan, using the channel switch code instead of method code. I'll only list the nonobvious changes.

6a100040/f408: 00000000/80001000 00000000/80001000 00000000/80001000
6a180040/f40c: 00000000/00000000 ffffffff/3fffffff 00000000/00000000

These registers may change as a result of the channel switch code completing. We'll figure them out when we get to REing the exact channel switch and method call sequences.

6af80040/f47c: 00000004/00000000 00000004/00000000 00000004/00000000

Another RO reg that changes after channel switch is done?

6a0000b8/fb80: 00000000/00000000 7fffffff/00000000 00000000/00000000
6a0800b8/fb84: 00000000/00000000 ffffffff/00000000 00000000/00000000
6a1000b8/fb88: 00000000/00000000 7fffffff/00000000 00000000/00000000
6a1800b8/fb8c: 00000000/00000000 ffffffff/00000000 00000000/00000000
6a2000b8/fb90: 00000000/00000000 7fffffff/00000000 00000000/00000000
[...]
6ae000b8/fbf0: 00000000/00000000 7fffffff/00000000 00000000/00000000
6ae800b8/fbf4: 00000000/00000000 ffffffff/00000000 00000000/00000000
6af000b8/fbf8: 00000000/00000000 7fffffff/00000000 00000000/00000000
6af800b8/fbfc: 00000000/00000000 ffffffff/00000000 00000000/00000000

... ok, that was really unexpected. Let's see if there's any aliasing between them. This time I'll always write to reg 0/1.

6a0000b8/fb80: 00000000/00000000 7fffffff/00000000 00000000/00000000
6a0800b8/fb84: 00000000/00000000 ffffffff/00000000 00000000/00000000
6a1000b8/fb88: 00000000/00000000 7fffffff/00000000 00000000/00000000
6a1800b8/fb8c: 00000000/00000000 ffffffff/00000000 00000000/00000000
6a2000b8/fb90: 00000000/00000000 7fffffff/00000000 00000000/00000000
[...]
6ae000b8/fbf0: 00000000/00000000 7fffffff/00000000 00000000/00000000
6ae800b8/fbf4: 00000000/00000000 ffffffff/00000000 00000000/00000000
6af000b8/fbf8: 00000000/00000000 7fffffff/00000000 00000000/00000000
6af800b8/fbfc: 00000000/00000000 ffffffff/00000000 00000000/00000000

So that's really just two registers, aliased till the end. $z0 and $z1.
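The aliasing test above boils down to: clear everything, write a marker to one slot, then see which slots read it back. Here's a minimal sketch of that idea, run against a simulated register window rather than real hardware. `FakeBank` and `find_aliases` are hypothetical names, and the 32-slot/2-register shape is just modelled on the fb80-fbfc dump.

```python
class FakeBank:
    """Simulated window of n_slots addresses backed by n_real registers."""

    def __init__(self, n_slots=32, n_real=2):
        self.n_slots = n_slots
        self.regs = [0] * n_real

    def write(self, slot, value):
        # Only the low bits of the slot number are decoded, hence aliasing.
        self.regs[slot % len(self.regs)] = value & 0xFFFFFFFF

    def read(self, slot):
        return self.regs[slot % len(self.regs)]


def find_aliases(bank, probe_slots, marker=0x12345678):
    """Map each probed slot to the list of slots that mirror a write to it."""
    aliases = {}
    for slot in probe_slots:
        for s in range(bank.n_slots):
            bank.write(s, 0)  # start from a known state
        bank.write(slot, marker)
        aliases[slot] = [s for s in range(bank.n_slots)
                         if bank.read(s) == marker]
    return aliases


if __name__ == "__main__":
    result = find_aliases(FakeBank(), [0, 1])
    print(result[0])  # slots mirroring slot 0
    print(result[1])  # slots mirroring slot 1
```

On real hardware the read and write calls would be the 0x6b/0x6a instructions themselves, and the marker would have to fit in the narrower of the two registers.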

That'd give us the complete set of regs accessible by 0x6a/0x6b, though we still have no idea what they are in many cases.
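To keep the naming straight, here's a rough lookup from dump address to the provisional register names. It only covers the ranges that can be read directly off the dumps in this part ($r, $x, $d, $y, and the two aliased $z regs); everything else returns None. The function name and table layout are made up for this sketch, and the register names themselves are still just placeholders.

```python
# (name, first address, register count), taken from the dumps above.
FILES = [
    ("$r", 0xF600, 32),
    ("$x", 0xFA00, 64),
    ("$d", 0xFB00, 8),   # fb20-fb7c are treated as aliases and skipped
    ("$y", 0xFC00, 16),
]


def name_register(addr: int):
    """Return a provisional name for a dump address, or None if unknown."""
    if addr % 4:
        return None
    if 0xFB80 <= addr <= 0xFBFC:
        # Two registers repeated across the block: even words are $z0,
        # odd words are $z1 (only seen from channel switch code).
        return f"$z{(addr >> 2) & 1}"
    for name, base, count in FILES:
        if base <= addr < base + count * 4:
            return f"{name}{(addr - base) // 4}"
    return None


if __name__ == "__main__":
    for a in (0xF600, 0xFAEC, 0xFB1C, 0xFBF4, 0xFC3C, 0xF800):
        print(f"{a:04x}: {name_register(a)}")
```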

That's almost all. There remains the issue of unknown bits in 0x6a/0x6b opcodes. Let's see how they're used in nvidia microcode:

00000000: 6a00005f mov $sr48 $a0 [unknown: 00000007]
00000000: 6a000067 mov $r0 $a0 [unknown: 00000007]
00000000: 6a00405f mov $sr48 $a1 [unknown: 00000007]
00000000: 6a00805f mov $sr48 $a2 [unknown: 00000007]
00000001: 6a00c05f mov $sr48 $a3 [unknown: 00000007]
00000001: 6a01405f mov $sr48 $a5 [unknown: 00000007]
[...]
0000008e: 6afcc067 mov $r31 $a19 [unknown: 00000007]
0000008e: 6afd8007 mov $v31 0 $a22 [unknown: 00000007]
0000008e: 6afd800f mov $v31 0x1 $a22 [unknown: 00000007]
0000008f: 6afd8017 mov $v31 0x2 $a22 [unknown: 00000007]
0000008f: 6afd801f mov $v31 0x3 $a22 [unknown: 00000007]
0000008f: 6afdc067 mov $r31 $a23 [unknown: 00000007]
0000008f: 6afe4067 mov $r31 $a25 [unknown: 00000007]

And the same for 0x6b. Everything has 111 in the low 3 bits and 0 in the remaining bits. Looks like $c setting. We can check if these insns really set $c by writing something to a $c register with an add instruction and seeing whether it changes afterwards.

Well, it didn't. For 0x6a as well as for 0x6b. Seems like that field is actually ignored, but nvidia assembler just sets it for consistency.
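For reference, the field split that the disassembly above implies can be written down as a small decoder. The bit positions are inferred from the listed words (e.g. 6afcc067 decoding as $r31, $a19 with unknown bits 7). The field names and `decode_mov` are just labels chosen for this sketch, not anything official.

```python
from typing import NamedTuple


class MovFields(NamedTuple):
    opcode: int  # bits 24-31: 0x6a or 0x6b
    index: int   # bits 19-23: register index in the other file
    areg: int    # bits 14-18: $a register number
    zero: int    # bits 8-13: always 0 in the words seen so far
    block: int   # bits 3-7: selects the other register file / word
    low3: int    # bits 0-2: the "unknown" field, 7 in nvidia's code


def decode_mov(insn: int) -> MovFields:
    """Split a 0x6a/0x6b word into its apparent fields."""
    return MovFields(
        opcode=(insn >> 24) & 0xFF,
        index=(insn >> 19) & 0x1F,
        areg=(insn >> 14) & 0x1F,
        zero=(insn >> 8) & 0x3F,
        block=(insn >> 3) & 0x1F,
        low3=insn & 0x7,
    )


if __name__ == "__main__":
    for word in (0x6A00405F, 0x6AFCC067, 0x6AFD801F):
        print(f"{word:08x}: {decode_mov(word)}")
```

Since the low 3 bits seem to be ignored by the hardware, a scan can leave them at 0, while anything meant to look like nvidia's output would set them to 7.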

Things to note now:

  • we have figured out all 0x6X and 0x7X opcodes used by nvidia. All except the mov to/from other register files have $a, maybe $c, and an immediate as arguments. So the guesses that the high bits select the opcode class and that the top 8 bits determine the opcode were right. However, we haven't seen any instruction with two $a sources, nor a load/store instruction using register-based addressing (I wouldn't rule out 0x6a/0x6b accessing the data store, though). This means there must be another opcode class for that.
  • we have loads and loads of register files now:
      • $a0-$a31, 32-bit
      • $r0-$r31, 32-bit
      • $v0-$v31, 128-bit
      • $sr0-$sr127, 32-bit, though few of the indices are actually valid
      • $c0-$c7, ???-bit [at most 32-bit, at least 16-bit]
      • $x0-$x63, 32-bit
      • $d0-$d7, 17-bit
      • $y0-$y15, 32-bit
      • $z0 31-bit and $z1 32-bit - only available in channel switch code for some reason.
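The widths in the list above come straight from what reads back after writing ffffffff: 0001ffff for $d gives 17 bits, 7fffffff for $z0 gives 31. A quick sketch of that bookkeeping, with `writable_width` being a made-up helper name:

```python
def writable_width(readback: int):
    """Given the value read back after writing 0xffffffff, return
    (width, contiguous): the position of the highest writable bit plus one,
    and whether all bits below it are writable too."""
    mask = readback & 0xFFFFFFFF
    width = mask.bit_length()
    contiguous = mask == (1 << width) - 1
    return width, contiguous


if __name__ == "__main__":
    for name, value in (("$d", 0x0001FFFF), ("$z0", 0x7FFFFFFF),
                        ("$z1", 0xFFFFFFFF), ("f464", 0xFFFFFFE3)):
        print(name, writable_width(value))
```

The second return value flags masks with holes, like the ffffffe3 seen at f464, where calling the register "N-bit" would be misleading. This only works when the register started out as 0, otherwise read-only bits that happen to be set get counted as writable.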

Here comes the updated opcode list:

op NV44 NV50
0x04 1026 -
0x0b 26 2
0x0c 96 -
0x0d 288 -
0x0f 526 255
0x24 364 336
0x40 - 766
0x41 1439 1091
0x42 4233 3156
0x45 3 3
0x48 147 123
0x4c 4999 3653
0x4d 1878 1217
0x4f 10747 4331 nop?
0x5e 50 8
0x62 3390 2610 and $a [$c] $a simm11
0x63 48 10 xor $a [$c] $a simm11
0x64 348 266 or $a [$c] $a simm11
0x65 7634 4929 mov $a simm19
0x68 390 288 min $a [$c] $a simm11
0x69 236 192 max $a [$c] $a simm11
0x6a 3359 2640 mov $a to another register file
0x6b 3203 2516 mov $a from another register file
0x6c 3045 2424 add $a [$c] $a simm11
0x6e 8013 6040 sar $a [$c] $a simm11
0x75 3323 2415 sethi $a imm16
0x7e 534 450 shr $a [$c] $a simm11
0x80 33 25
0x81 1 1
0x82 16 16
0x83 26 26
0x84 199 188
0x85 116 92
0x86 111 106
0x87 48 42
0x8a 26 10
0x8b 7 7
0x8c 27 27
0x8d 3 3
0x8f 13 13
0x90 8 -
0x91 48 48
0x92 13 4
0x94 139 139
0x95 272 217
0x97 147 142
0x98 21 5
0x99 21 5
0x9b 740 555
0x9c 43 33
0x9d 14 4
0x9f 76 76
0xa4 10 10
0xa5 3 3
0xaa 14 14
0xab 29 29
0xac 129 129
0xad 936 822
0xae 14 14
0xb1 - 20
0xb4 234 180
0xb5 276 200
0xb6 388 290
0xb7 128 96
0xb8 5 -
0xbc 55 55
0xbe 35 35
0xbf 11118 5148 nop?
0xc0 65 35
0xc1 67 67
0xc2 136 8
0xc3 253 217
0xc4 50 50
0xc5 49 49
0xc6 134 6
0xc7 724 141
0xc8 640 481
0xc9 618 466
0xca 528 502
0xcb 2396 1550
0xcc 2486 1483
0xcd 1960 1058
0xce 449 372
0xcf 507 369
0xd0 928 869
0xd1 655 575
0xd2 176 368
0xd3 359 414
0xd4 1021 844
0xd5 963 694
0xd6 142 409
0xd8 712 484
0xd9 98 93
0xda 4221 2967
0xdb - 37
0xdc 800 552
0xdd 204 116
0xde 7836 26102
0xdf 9333 4072 nop?
0xe0 2897 2330 branch ?
0xe1 1036 913 branch ?
0xe2 1707 1188 branch ?
0xe3 120 89
0xe4 2020 695 branch ?
0xe6 92 84
0xe8 694 552
0xea 248 1 abra
0xef 23932 13521 nop?
0xf0 161 329
0xff 918 642 exit [intr] imm16

That's all for this session... it's already been dragging on for three days, time to publish. In the next session, we'll aim for the remaining $a opcodes.

Elapsed time: 7h.
