Mistah_Darcy has quit [Remote host closed the connection]
vstehle has joined #panfrost
warpme_ has joined #panfrost
NeuroScr has joined #panfrost
MistahDarcy has quit [Ping timeout: 245 seconds]
yann has quit [Ping timeout: 245 seconds]
rcf has joined #panfrost
davidlt has quit [Remote host closed the connection]
davidlt has joined #panfrost
davidlt has quit [Remote host closed the connection]
davidlt has joined #panfrost
guillaume_g has joined #panfrost
raster has joined #panfrost
griffinp- has joined #panfrost
warpme_ has quit [Quit: warpme_]
NeuroScr has quit [Quit: NeuroScr]
rcf has quit [Ping timeout: 245 seconds]
anarsoul has quit [Remote host closed the connection]
anarsoul has joined #panfrost
rcf has joined #panfrost
warpme_ has joined #panfrost
davidlt has quit [Remote host closed the connection]
davidlt has joined #panfrost
warpme_ has quit [Quit: warpme_]
warpme_ has joined #panfrost
chewitt has joined #panfrost
janrinze has quit [Remote host closed the connection]
warpme_ has quit [Quit: warpme_]
guillaume_g has quit [Quit: Konversation terminated!]
guillaume_g has joined #panfrost
guillaume_g has quit [Remote host closed the connection]
guillaume_g has joined #panfrost
guillaume_g has quit [Quit: Konversation terminated!]
guillaume_g has joined #panfrost
warpme_ has joined #panfrost
afaerber has quit [Quit: Leaving]
raster has quit [Ping timeout: 268 seconds]
yann has joined #panfrost
raster has joined #panfrost
warpme_ has quit [Quit: warpme_]
afaerber has joined #panfrost
warpme_ has joined #panfrost
davidlt has quit [Remote host closed the connection]
davidlt has joined #panfrost
chewitt has quit [Remote host closed the connection]
BenG83 has joined #panfrost
rcf has quit [Quit: WeeChat 2.1]
rcf has joined #panfrost
guillaume_g has quit [Quit: Konversation terminated!]
yann has quit [Ping timeout: 244 seconds]
chewitt has joined #panfrost
<chewitt>
Lyude: any news on T820 dEQP tests?
afaerber has quit [Quit: Leaving]
<chewitt>
sorry to nag.. but..
<Lyude>
chewitt: haven't had a chance to finish getting the system ready :s, and it's no problem
<chewitt>
what's needed to get it ready?
<Lyude>
at the latest I can get it ready this weekend
<Lyude>
chewitt: just figuring out the scpi issues
<chewitt>
odd.. I don't see any scpi issues
<chewitt>
although I've no idea if anything scpi related is enabled in our kernel
yann has joined #panfrost
<Lyude>
chewitt: I'll try to check for differences between our configs asap
<Lyude>
maybe I can wfh a bit today and look, not sure
<Lyude>
chewitt: I can get you the backtrace when i get into the office if that would help
<chewitt>
might be optimistic with my knowledge of kernel things :)
<chewitt>
but happy to look
ente has quit [Remote host closed the connection]
<xdarklight>
Lyude: SCPI on your Khadas VIM2? I remember having issues with that occasionally, my workaround was to "reboot harder" (there's a dedicated reset button for that purpose)
* alyssa
isn't sure what to prioritize next
<alyssa>
I'm stuck on memory management stuff
<alyssa>
I.. guess more compilering
<alyssa>
How about some load/store RE, that will help with a lot I guess
<Lyude>
xdarklight: there is definitely a way to fix it though, my older kernel doesn't have the issue at all
<Lyude>
I'm also going to check through my logs when I get the chance and see if I possibly discussed this with narmstrong in the past already
stikonas has joined #panfrost
chewitt has quit [Quit: Zzz..]
<alyssa>
Well, I still haven't cracked how load/store ops work (esp. with indirection and such)
<alyssa>
There doesn't seem to be a single "in/direct?" bit
<alyssa>
I did determine that for OpenCL load/store ops, when they are direct the address comes from:
<alyssa>
- r26/r27 (selected by 0x400 bit)
<alyssa>
- X/Y (selected by 0x200)
<alyssa>
Roughly "register select" and "upper" bits
<alyssa>
But for OpenCL indirect, the address comes from:
<alyssa>
- seemingly always r27? haven't seen one with r26
<alyssa>
- X/Y (selected by 0x2 bit)
<alyssa>
It is of course possible the register select bit is 0x4, which is set in each case I've seen
<alyssa>
But this is uncomfortable since it means symmetry isn't preserved.
<alyssa>
Which means time travel of the second form becomes possible
<alyssa>
("This is #panfrost, not #mlpfanfic" "Sorry, wrong channel.")
<alyssa>
In the indirect load case, the offset register is selected by the bits in 0x700
<alyssa>
So 0x400 is the register select bit and 0x300 is the component select
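A minimal sketch of the split described just above, assuming the selector sits in bits 0x0700 of the 16-bit field. The function name is hypothetical. The mapping of a clear 0x400 bit to r26 and of component values 0..3 to x, y, z, w are assumptions; the log itself only confirms that 0x0700 names r27.w (see the UBO and varying words quoted further down).

```python
def decode_offset_register(field: int) -> str:
    """Name the register and component selected by the 0x0700 bits."""
    selector = (field >> 8) & 0x7
    # 0x400 is the register select bit. Assumption: clear means r26, set means r27.
    register = "r27" if selector & 0x4 else "r26"
    # 0x300 is the component select. Assumption: 0..3 map to x, y, z, w.
    component = "xyzw"[selector & 0x3]
    return f"{register}.{component}"


# Indirect UBO and indirect varying words quoted later in the log.
print(decode_offset_register(0x8700))
print(decode_offset_register(0x079E))
```

Both words carry 0x7 in that position, so they decode to the same register and component.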
<alyssa>
There *is* some symmetry here with how their direct variants are addressed, we can think of the 0x200 select bit as meaning "component z"
<alyssa>
So in that sense, this is somewhat less problematic than initially feared
<alyssa>
But it still doesn't explain the shift, unless you make one ugly, ugly concession: load/store ops just aren't symmetric like texture ops
<alyssa>
They're more like ALU ops with variable numbers of arguments
raster has quit [Remote host closed the connection]
<alyssa>
Now, let's turn back to OpenGL with all that in mind:
<alyssa>
Take UBOs for example.
<alyssa>
Direct: 0x1E00
<alyssa>
Indirect: 0x8700 (offset in r27.w)
<alyssa>
direct: 0x1E00
<alyssa>
We see r27.w represented in the 0x0700 bits, like the CL address case. The first argument, right?
<alyssa>
Same pattern in indirect varyings:
<alyssa>
direct: 0x1E9E
<alyssa>
Indirect: 0x079E (r27.w)
<alyssa>
And then the weird ones that defy all logic, e.g. cubemap stores:
<alyssa>
0x0024 (r27.xyz)
<alyssa>
Although I suppose that's explained as the bottom 0x7 indexing r27.x
<alyssa>
Now, speaking of directs, let's throw immediates into the mix
<alyssa>
The `varying_parameters` field, shifted once, stores a byte offset that is added to every load/store in CL
<alyssa>
(The same mechanism is used with the same ops with register spilling but let's not go there yet)
<alyssa>
So an indirect CL load really has three arguments we don't decode: address, register offset, immediate offset
<alyssa>
So, the unknown field we have is 16-bit
<alyssa>
It _may_ make sense to divide this into two 8-bit fields, one for each source
<alyssa>
That explains the shift
<alyssa>
But we're still left with the question of what the other 5 bits of the 8-bit source are going to, if 3 bits index a register
<alyssa>
Well, it depends, I guess.
<alyssa>
Let's go to UBOs:
<alyssa>
0x87 01 (split for clarity)
<alyssa>
That means "read from UBO 1 with offset in r27.w"
<alyssa>
More generally, that second source is the UBO number in the bottom 4-bits and, uh, zero in the top four
<alyssa>
The 0x87 being bottom 3-bits (or 4?) indexing a register
<alyssa>
We might speculate that each 8-bit source is broken up into a 4-bit type and 4-bit value, but I'm not convinced that makes a *ton* of sense
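A rough sketch of the speculated layout, assuming the 16-bit field really is two 8-bit sources and each source is a pair of 4-bit halves. The names `Source` and `split_sources` are hypothetical, and what the nibbles actually mean is exactly the part still in doubt here.

```python
from typing import NamedTuple, Tuple


class Source(NamedTuple):
    raw: int          # the full 8-bit source
    high_nibble: int  # speculated "type" half
    low_nibble: int   # speculated "value" half


def split_sources(field: int) -> Tuple[Source, Source]:
    """Split the 16-bit unknown field into two 8-bit sources."""
    def make(byte: int) -> Source:
        return Source(byte, (byte >> 4) & 0xF, byte & 0xF)

    return make((field >> 8) & 0xFF), make(field & 0xFF)


# The indirect UBO example from above: 0x87 01
first, second = split_sources(0x8701)
print(hex(first.raw), hex(second.raw))
print("UBO index:", second.low_nibble)
```

For 0x8701 the second source's low nibble is the UBO number (1) and the first source's low nibble is 0x7, whose bottom 3 bits would index r27.w.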
<alyssa>
Recall direct UBO source 0x1E would be a no-op source since there is no special offset. Wha??
<alyssa>
But pressing onwards..
<alyssa>
Take the CL loads:
<alyssa>
Direct: 0X (address), 7E (no-op?)
<alyssa>
Indirect: 4X (offset), EX (address)
<alyssa>
E and 0 don't match, so this is a gaping hole in the theory
<alyssa>
The E's do match... not sure what to think about those shifts
jolan has quit [Quit: leaving]
rhyskidd has quit [Remote host closed the connection]
<alyssa>
So the question is "how does the hw know which argument is which"? or (equivalently?) "what are the other bits for"?
* alyssa
sees what Bifrost does
rhyskidd has joined #panfrost
<alyssa>
;/
<alyssa>
Nothing obvious in lima either
<anarsoul>
what about lima?
<alyssa>
anarsoul: I was hoping Mali-PP had something similar to those mystery bits but it doesn't
<anarsoul>
what bits?
<alyssa>
anarsoul: Read the above rant
<anarsoul>
it's 2 screens, can you summarize it?
<alyssa>
anarsoul: 16-bit field in load/store ops used to encode additional sources (indirect offsets, etc)
<anarsoul>
oh
<anarsoul>
we don't use those in lima, maybe cwabbott can give you a clue
NeuroScr has joined #panfrost
<alyssa>
More rant, just uniform access this time:
<alyssa>
Indirect varyings.. can't have an immediate offset added (since those bits are used for their parameters)
<alyssa>
Funnily enough their unknown is invariant to size
<alyssa>
Gah, I have no idea.
<alyssa>
These fields are very much per-opcode
<alyssa>
and variadic and :/
davidlt has quit [Remote host closed the connection]
davidlt has joined #panfrost
adjtm has quit [Quit: Leaving]
warpme_ has quit [Quit: warpme_]
davidlt has quit [Ping timeout: 245 seconds]
* alyssa
shrugs, we'll revisit I guess
<alyssa>
Back to.. not sure what, um
adjtm has joined #panfrost
stikonas has quit [Read error: Connection reset by peer]
stikonas_ has joined #panfrost
<alyssa>
Z32F_S8, that's what
rhyskidd has quit [Remote host closed the connection]
rhyskidd has joined #panfrost
<HdkR>
woo depth and stencil
<Lyude>
HdkR: on bf?
<HdkR>
nah
<HdkR>
Z32F_S8
<Lyude>
ahh
BenG83 has quit [Quit: Leaving]
MistahDarcy has joined #panfrost
afaerber has joined #panfrost
stikonas_ has quit [Ping timeout: 245 seconds]
stikonas_ has joined #panfrost
rhyskidd has quit [Remote host closed the connection]
rhyskidd has joined #panfrost
<alyssa>
TIL our stenciling implementation is probably terribly broken and nobody noticed because the CTS doesn't really test them when running as a window
<HdkR>
Khronos CTS doesn't test stencil?
stikonas_ has quit [Ping timeout: 246 seconds]
stikonas has joined #panfrost
<alyssa>
HdkR: dEQP, and the tests always pass if you don't use an FBO
<alyssa>
since glReadPixels doesn't work on scanout with stencil