<alyssa>
This strongly suggests Midgard opcodes are actually 6 bits
<alyssa>
(not 8-bit)
<alyssa>
and the bottom 2 bits are a separate field indicating type
<alyssa>
This offers one big insight: the colorbuffer ops
<alyssa>
on T760+, we have 0xB9 and 0xBA to load as fp16 or raw u32 respectively
<alyssa>
If you decode as a 6-bit GL load, that's the same op but fp16 or u32 respectively
<alyssa>
This implies the existence of an 0xB8 op that loads as fp32, of course.
<alyssa>
More to the point, on T720 we only know 0x9D to load as fp16, we don't know how to do raw loads.
<alyssa>
The above implies the existence of a 0x9E opcode for u32
<alyssa>
which, well, all of the above is theory
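The theory above can be sketched as a small decoder: treat the top 6 bits of the 8-bit value as the op and the bottom 2 bits as a type field. This is only an illustration of the hypothesis in the log. The function names and the `TYPE_NAMES` table are hypothetical, and the mapping of type values (0 = fp32, 1 = fp16, 2 = u32) is inferred from the opcodes quoted above, not taken from hardware documentation.

```python
# Assumed type-field meanings, inferred from 0xB9 (fp16) and 0xBA (u32);
# type 0 = fp32 is the implied, unconfirmed case at this point in the log.
TYPE_NAMES = {0: "fp32", 1: "fp16", 2: "u32"}


def split_opcode(op8):
    """Split an 8-bit load/store opcode into (6-bit op, 2-bit type)."""
    return op8 >> 2, op8 & 0x3


def describe(op8):
    """Render the split as text, marking unmapped type values as unknown."""
    base, ty = split_opcode(op8)
    return f"0x{op8:02X}: op6=0x{base:02X} type={TYPE_NAMES.get(ty, 'unknown')}"


if __name__ == "__main__":
    # T760+ colorbuffer loads, then the T720 ones discussed above.
    for op in (0xB8, 0xB9, 0xBA, 0x9D, 0x9E):
        print(describe(op))
```

Under this split, 0xB8/0xB9/0xBA share one 6-bit op and differ only in the type field, and 0x9D/0x9E do the same on T720, which is what the guess about a 0x9E u32 load relies on.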
<alyssa>
Guess I better spin that through CI ;_0
<alyssa>
wait a minute
<icecream95>
alyssa: I remember seeing 0x9e with the offline shader compiler when targeting t720
<alyssa>
icecream95: maybe I'm not going crazy then! :)
<alyssa>
As for ALU ops, they also show a quad-split but not as cleanly as load/store
<icecream95>
0x9C is midgard_op_ld_color_buffer_u8_as_fp32_old, 0xB8 is midgard_op_ld_color_buffer_u8_as_fp32
<alyssa>
icecream95: right, that's implied from the above but I've never actually seen that in the wild since you don't need it
<alyssa>
(AFAIK you'd only want to load as fp32 if you're working with a FP32 render target, but then there's no conversion so you can just use the u32 op)
<alyssa>
for blend shaders, anyway
<alyssa>
Probably this interacts with the framebuffer_fetch extensions
<icecream95>
alyssa: I remember seeing it for ARM_shader_framebuffer_fetch
<alyssa>
there you go then :)
<alyssa>
I'm sorta curious how perf differs between ARM_shader_framebuffer_fetch and blend shaders.
<alyssa>
IIRC vc4 doesn't have any hw blend and does blending in the frag shader with a fb_fetch
<icecream95>
vc4 emulates a lot of features in the shader
<alyssa>
It's way more shader variants, which is probably the bigger issue (for compile-times as well as i-cache)
<alyssa>
yes, that's true :)
<alyssa>
I'm under the impression anything in ES3 at least should be theoretically doable w/o keying but we all know that's not how it plays out in practice :>
<daniels>
icecream95: huh, what's with the build-system changes?
<icecream95>
daniels: could you be more specific?
<daniels>
icecream95: in the script you pasted, why do you need to run sed across the gn files?
<daniels>
(not saying you're wrong at all, just curious)
<icecream95>
daniels: Arch Linux ARM has a different version of gn to what Perfetto downloads, and it seems they differ in whether .. can go below the root directory of the project
<daniels>
oh, weird