<alyssa>
The regions marked as bad have the same rgb except with non-zero red values, whereas reference has 0 red
<alyssa>
They're very much different; you just need to check in Krita since your eyes don't see the difference
<daniels>
jrb: anything in particular you're interested in?
<HdkR>
alyssa: Interesting uniform indexed output. I didn't believe that to be possible but maybe there was something special for that case
stikonas_ is now known as stikonas
<bbrezillon>
alyssa: hm, am I missing something or are we really leaking the ins and block objects allocated in midgard_compile.c?
<alyssa>
Bwap?
<alyssa>
HdkR: Or maybe it's just dEQP being silly
afaerber has joined #panfrost
<alyssa>
If it is legal and works as it appears, there's a bug higher up in the stack
<alyssa>
I don't think it does.
<alyssa>
bbrezillon: We probably are, yes.
<HdkR>
alyssa: Trying to find the documentation now. It is definitely still mandated to be constant index in ES 320
<HdkR>
alyssa: Wow. Apparently it was never restricted explicitly in the GLSL API
<HdkR>
and there isn't even a note about it in spec that it is allowed
<HdkR>
ESSL is nicer and states: "Fragment shader outputs declared as arrays may only be indexed by a constant integral expression."
<HdkR>
AMD shader compiler generates some very nasty code around it
<alyssa>
HdkR: Allowed by default? Ninth Amendment and all?
* alyssa
or is it the tention
<alyssa>
*tenth
* HdkR
shruggies
<HdkR>
I couldn't find anything explicitly banning it in any of the GLSL desktop specs
<alyssa>
Anyway, compiler woe of the day:
<alyssa>
I need to rewrite all the liveness analysis data structs
<alyssa>
Instead of just a range, it has to be live in/out on a per block basis, with a range within some blocks
stikonas has quit [Remote host closed the connection]
<alyssa>
So, the new algorithm will be as follows:
<alyssa>
On each midgard_block, maintain bitfields (size = number of nodes):
<alyssa>
- defs: Is this node written somewhere in this block?
<alyssa>
- uses: Is this node read somewhere in this block?
<alyssa>
- live_in: Is this node live at the beginning of this block?
<alyssa>
- live_out: Is this node live at the end of this block?
<alyssa>
Also on each midgard_block, maintain arrays live_start and live_end (size = number of nodes), holding the index within the block at which each node is written / finally used.
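[Editor's sketch, not from the channel: one conventional way to fill the per-block bitfields listed above is backward dataflow iteration to a fixed point. This is a minimal illustration in Python, not Panfrost or Mesa code. The `Block` class, the `(dest, sources)` instruction tuples and both function names are hypothetical. It also assumes `uses` means "read before any write in this block" (upward-exposed uses), which is slightly narrower than "read somewhere in this block".]

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Hypothetical IR: each instruction is (dest, sources), where dest is a
    # node index or None and sources is a tuple of node indices.
    instructions: list
    successors: list = field(default_factory=list)  # indices into the block list
    defs: int = 0      # bit n set: node n is written somewhere in this block
    uses: int = 0      # bit n set: node n is read before any write in this block
    live_in: int = 0   # bit n set: node n is live at the start of this block
    live_out: int = 0  # bit n set: node n is live at the end of this block


def compute_local_sets(block):
    """Fill defs and (upward-exposed) uses from the block's instructions."""
    block.defs = 0
    block.uses = 0
    for dest, sources in block.instructions:
        for src in sources:
            # A read only counts if the node was not already written above.
            if not (block.defs >> src) & 1:
                block.uses |= 1 << src
        if dest is not None:
            block.defs |= 1 << dest


def compute_liveness(blocks):
    """Iterate live_in/live_out over all blocks until nothing changes."""
    for block in blocks:
        compute_local_sets(block)
        block.live_in = 0
        block.live_out = 0

    changed = True
    while changed:
        changed = False
        # Liveness flows backwards, so visiting blocks in reverse tends to
        # converge in fewer passes; any order reaches the same fixed point.
        for block in reversed(blocks):
            live_out = 0
            for succ in block.successors:
                live_out |= blocks[succ].live_in
            live_in = block.uses | (live_out & ~block.defs)
            if live_in != block.live_in or live_out != block.live_out:
                block.live_in = live_in
                block.live_out = live_out
                changed = True
```

[Python integers stand in for the bitfields here; a C implementation would more likely use fixed-size bitsets sized to the node count. The loop terminates because the sets only ever grow and are bounded by the number of nodes.]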
<alyssa>
Given two nodes i and j, we can decide if they interfere by checking:
<alyssa>
- For each block:
<alyssa>
- If both nodes are live_in or both nodes are live_out, they interfere.
<alyssa>
--------Ack I'm drowning in not knowing anything about dataflow analysis
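[Editor's sketch, not from the channel: the interference check above breaks off after the live_in/live_out rule. One possible way to finish it is to fall back to the per-block live_start/live_end ranges when neither block-boundary test fires. Everything here is an assumption for illustration: the `BlockLiveness` class, both function names, the use of dicts instead of arrays, the premise that each node is written at most once per block, and the strict overlap test that lets a value dying at instruction k share a register with one written at k.]

```python
from dataclasses import dataclass, field


@dataclass
class BlockLiveness:
    # Hypothetical per-block record; bit n of live_in/live_out refers to node n.
    num_instructions: int
    live_in: int = 0
    live_out: int = 0
    live_start: dict = field(default_factory=dict)  # node -> index of its write
    live_end: dict = field(default_factory=dict)    # node -> index of its final use


def local_range(block, node):
    """Return (start, end) of node inside block, or None if it never appears."""
    is_in = (block.live_in >> node) & 1
    is_out = (block.live_out >> node) & 1
    if not is_in and node not in block.live_start:
        return None
    # A live-in node is treated as starting just before the first instruction,
    # a live-out node as ending just after the last one.
    start = -1 if is_in else block.live_start[node]
    end = block.num_instructions if is_out else block.live_end.get(node, start)
    return (start, end)


def nodes_interfere(blocks, i, j):
    """Conservatively decide whether nodes i and j are ever live together."""
    both = (1 << i) | (1 << j)
    for block in blocks:
        # Rule from the discussion: both live at entry, or both live at exit.
        if block.live_in & both == both or block.live_out & both == both:
            return True
        # Assumed completion: compare the ranges inside the block.
        ri = local_range(block, i)
        rj = local_range(block, j)
        if ri is not None and rj is not None:
            if ri[0] < rj[1] and rj[0] < ri[1]:
                return True
    return False
```

[For example, with node 0 written at instruction 0 and last read at 2, and node 2 written at instruction 2, the two do not interfere under this test, whereas a node written at 1 and read at 3 interferes with both.]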
<alyssa>
HdkR: Help? :f
<HdkR>
Welcome to analysis passes :D
<alyssa>
HdkR: I'm so confused
<alyssa>
I just read v3d's liveness analysis pass top to bottom
<alyssa>
Doesn't seem too hard but still sort of magic
Elpaulo has quit [Ping timeout: 272 seconds]
<HdkR>
I had a friend who had to spend months doing different analysis passes and it made him want to quit his job. It can be mentally deteriorating at times :)
<alyssa>
HdkR: Thanks for the encouragement
<HdkR>
Take your time, expect problems to crop up. It isn't impossible
<alyssa>
Hm
<HdkR>
Complex control flow will break most assumptions
<alyssa>
HdkR: Should I have some sort of better infra built-up to make this tractable?
<alyssa>
I know good utilities/data structures can make a world of difference for preserving sanity with hard algorithms..
<HdkR>
It can do. Tooling to visualize the CF with all this data was really helpful
<alyssa>
Hmm reading freedreno's RA as well, this is slowly clicking
<alyssa>
HdkR: I think I'll get this but definitely not a 5pm task
<HdkR>
definitely
<HdkR>
One of those things you can spend multiple weeks worth of effort on :P
<HdkR>
(Depending on what exactly you need to get)
<alyssa>
HdkR: (Symptom being a Krita shader failing RA because of an unfortunate interaction of texture pipeline registers + complex control flow)
<HdkR>
ah, fun
<alyssa>
HdkR: That's a GL3 krita shader, tho
<alyssa>
The GL2 renderer (which I guess is fixed-function) seems to work :)
<HdkR>
Nice
<alyssa>
Perf is fine but not obviously better than sw, but sw is fast enough half the time?
<alyssa>
I guess it's kinda bw bound anyway
<HdkR>
Lowering CPU load would be a benefit at least
<alyssa>
HdkR: CPU load is still pretty high but not in GL related stuff
<alyssa>
I think the GL2 renderer is still doing all the hard stuff in sw and just using GL for blits and stuff
<HdkR>
Yea, just means you can throw more CPU at something else rather than llvmpipe :D
<HdkR>
Interesting
<alyssa>
HdkR: sw meaning Krita's internal software renderer, which is like 30x faster than Krita's GL renderer with llvmpipe
<HdkR>
aaaah
<HdkR>
I see
<alyssa>
Almost all the Krita I've done has been with that sw renderer on other chromebooks and it's ...
<alyssa>
Krita is drastically faster than Firefox rendering the Discord web client.
<HdkR>
If you're doing GL 3 things then you can start adding panfrost to features.txt :D
<alyssa>
HdkR: For GLES 3.0, we're currently at where Panfrost was for GLES 2.0 at the beginning of 2019
<HdkR>
But with a much more robust shader compiler
<alyssa>
("All the features are there but they're all half-broken but it's good enough for some stuff to work but not anything real world also I forgot about a bunch of minor features and corner - and not so corner - cases of big features")
<alyssa>
That's true, we're doing overwhelmingly well on the shaders.* tests specifically since I designed the shader compiler to be ES3-class from early on
<alyssa>
Also, somebody broke GALLIUM_HUD, it was probably me