jcureton has quit [Remote host closed the connection]
<tomeu>
alyssa: btw, CI runs take much less time now
<tomeu>
guess it's due to the BO cache?
raster has joined #panfrost
pH5 has joined #panfrost
raster has quit [Remote host closed the connection]
raster has joined #panfrost
yann has joined #panfrost
<daniels>
alyssa: eglChooseConfig() generally takes channel sizes as _at least_, so will use a config with wider channels than requested if one is available
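A minimal sketch of the "at least" matching daniels describes, written as a plain-C model rather than real EGL calls. The struct and both function names are hypothetical and exist only for illustration. It shows why a request for narrow channels can be satisfied by a wider config, and how a caller might look for an exact match among the candidates instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an EGL config's channel sizes. */
struct channel_sizes {
    int r, g, b, a;
};

/* "At least" matching: a config qualifies if every channel is at least
 * as wide as requested. This models the behaviour described in the chat;
 * it is not EGL's implementation. */
bool
matches_at_least(struct channel_sizes want, struct channel_sizes have)
{
    return have.r >= want.r && have.g >= want.g &&
           have.b >= want.b && have.a >= want.a;
}

/* Return the index of the first config whose channels equal the request
 * exactly, or -1 if none does. */
int
find_exact(struct channel_sizes want,
           const struct channel_sizes *configs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        const struct channel_sizes *c = &configs[i];
        if (c->r == want.r && c->g == want.g &&
            c->b == want.b && c->a == want.a)
            return (int)i;
    }
    return -1;
}
```

With a request of 5/6/5/0, an 8/8/8/8 config also passes matches_at_least(), so a caller that needs RGB565 specifically has to filter the candidates itself, as find_exact() does here.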
adjtm has joined #panfrost
<tomeu>
alyssa: when marking all BOs except ctx->shaders as non-executable, I get an invalid access when trying to execute at a weird address: http://paste.debian.net/1093867/
<tomeu>
the fault only happens once job_2fb0480_17 gets submitted
<tomeu>
which I guess is just a clear?
herbmillerjr has quit [Ping timeout: 248 seconds]
herbmillerjr has joined #panfrost
<alyssa>
tomeu: I noticed that, wasn't sure what to thank/blame
<alyssa>
But armhf runs are.... getting more problematic by the day
<alyssa>
tomeu: Umm we sometimes shove blend shaders in transient memory
<alyssa>
which might be dumb but oops
<tomeu>
because of how much we allocate (for now) in each context, and because of how often deqp allocates contexts, I think it's the BO cache that sped it up
<tomeu>
alyssa: aha :)
<tomeu>
definitely looks like that
<alyssa>
tomeu: Yeah, it's a Bug (TM)
<alyssa>
But I was lazy and it didn't break anything at the time even though I felt bad about it and still kinda do
<tomeu>
alyssa: it's just a one-liner :p
belgin has joined #panfrost
<belgin>
hello
<belgin>
so you don't like x11?
<belgin>
i have found out something fun about it today
<belgin>
if you strip your window of wm decorations and set it to the same size as the "root" window, it just ignores all resize attempts afterwards
<belgin>
however, if you set one of the dimensions off by 1, it works normally
<belgin>
take that x11
jcureton has joined #panfrost
guillaume_g has quit [Quit: Konversation terminated!]
<jcureton>
the good news is my two T720 platforms are consistent on the dEQP blend tests, the bad news is they both pass 0% :)
<tomeu>
jcureton: do you have an idea already of what's wrong?
<alyssa>
tomeu: Unfortunately, I don't think that's the right approach
<alyssa>
It *will* work, but it'll also leak memory terribly
<jcureton>
tomeu: not quite yet. pass rate for all of dEQP-GLES2.* on Allwinner H6 T720 is quite low - 7.1%
<alyssa>
tomeu: Anything uploaded to the shaders BO will stay alive forever.
<alyssa>
jcureton: Ouch D:
<tomeu>
alyssa: ah, the transient pool has magic to know what can be released?
<alyssa>
tomeu: Transient memory, by definition, is freed automatically every frame
<tomeu>
ah, all of it?
<alyssa>
All of it!
<alyssa>
Think of `transient` like the stack, and main BOs like the heap.
<tomeu>
gotcha
<alyssa>
Small allocations within the function itself you put on the stack
<alyssa>
But something big, or something you need to persist after we return, goes on the heap since the stack will be destroyed
<alyssa>
But allocation/freeing on the heap can be expensive (mitigated somewhat with the BO cache), whereas stack allocations/frees are essentially free
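The stack analogy above can be sketched as a bump allocator that is reset once per frame. This is an illustration under stated assumptions, not Panfrost's actual code: the struct, the pool size, and the function names are all hypothetical, and a CPU byte array stands in for GPU memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRANSIENT_POOL_SIZE 4096

/* Hypothetical per-frame pool; not the actual Panfrost structure. */
struct transient_pool {
    uint8_t storage[TRANSIENT_POOL_SIZE];
    size_t offset; /* next free byte, relative to the start of storage */
};

/* Allocate by bumping the offset, like pushing onto a stack.
 * `align` must be a power of two; alignment is relative to the start
 * of the pool. Returns NULL if the pool is exhausted. */
void *
transient_alloc(struct transient_pool *pool, size_t size, size_t align)
{
    size_t start = (pool->offset + align - 1) & ~(align - 1);

    if (start > TRANSIENT_POOL_SIZE || size > TRANSIENT_POOL_SIZE - start)
        return NULL;

    pool->offset = start + size;
    return pool->storage + start;
}

/* End of frame: everything is released at once by resetting the offset.
 * There is no per-allocation free. */
void
transient_reset(struct transient_pool *pool)
{
    pool->offset = 0;
}
```

Allocation is a few arithmetic operations and freeing is a single store, which is why it is "essentially free". The cost is that nothing placed in the pool may be referenced after transient_reset().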
<tomeu>
hmm, wonder how bad it could be to mark the transient pools executable
<tomeu>
quite bad, I think
<alyssa>
tomeu: Pretty bad.
<tomeu>
and have an executable transient pool?
<alyssa>
That's an option
belgin has quit [Quit: Leaving]
<alyssa>
Another option is just doing proper memory management on ctx->shaders
<alyssa>
But so far that's been quite low-prio since shaders are small.
<cwabbott>
can't you just never free blend shaders except on context destruction? afaik that's what the blob does
<cwabbott>
after all, you'll never know when you'll need it again
<alyssa>
cwabbott: Mostly we do that
<alyssa>
cwabbott: The exception is glBlendColor().. afaik, there's no provision for passing it (via a uniform or whatever)
Prf_Jakob has joined #panfrost
<alyssa>
So we just patch it into the binary directly
<alyssa>
But BlendColor is not tied to the CSO in Gallium
<alyssa>
So the easiest approach is, for shaders with a BlendColor, to reupload once a frame. It's an edge case that doesn't come up much outside dEQP anyway, so I'm not too worried.
<cwabbott>
oh, that sounds like fun indeed
<alyssa>
cwabbott: Blending is by far the most complex part of the driver :(
<cwabbott>
it sounds like you need another hash table with (blend CSO, blend color) -> patched shader
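A rough sketch of the (blend CSO, blend color) -> patched shader mapping suggested above. Everything here is hypothetical: the type and function names are invented for illustration, a 64-bit integer stands in for the uploaded shader, and a fixed-size array with a linear scan stands in for a real hash table to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLEND_CACHE_SLOTS 16

/* Hypothetical key: which blend CSO, and which constant colour. */
struct blend_key {
    const void *cso;
    float color[4];
};

struct blend_entry {
    int used;
    struct blend_key key;
    uint64_t shader; /* stand-in for the uploaded, patched shader */
};

struct blend_cache {
    struct blend_entry slots[BLEND_CACHE_SLOTS];
};

/* Colours are compared bitwise, so 0.0f and -0.0f count as different
 * keys; that only costs an extra cache entry. */
static int
blend_key_equal(const struct blend_key *a, const struct blend_key *b)
{
    return a->cso == b->cso &&
           memcmp(a->color, b->color, sizeof(a->color)) == 0;
}

/* Returns 1 and writes *shader on a hit, 0 on a miss. */
int
blend_cache_lookup(const struct blend_cache *cache,
                   const struct blend_key *key, uint64_t *shader)
{
    for (int i = 0; i < BLEND_CACHE_SLOTS; i++) {
        const struct blend_entry *e = &cache->slots[i];
        if (e->used && blend_key_equal(&e->key, key)) {
            *shader = e->shader;
            return 1;
        }
    }
    return 0;
}

/* Stores a patched shader in the first free slot; returns 0 if full. */
int
blend_cache_insert(struct blend_cache *cache,
                   const struct blend_key *key, uint64_t shader)
{
    for (int i = 0; i < BLEND_CACHE_SLOTS; i++) {
        struct blend_entry *e = &cache->slots[i];
        if (!e->used) {
            e->used = 1;
            e->key = *key;
            e->shader = shader;
            return 1;
        }
    }
    return 0;
}
```

On a miss the caller would patch the colour into the binary, upload it to persistent memory, and insert the result, so a given colour is only patched once per CSO instead of once per frame.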
<cwabbott>
alyssa: you should have a look at the shaders the blob uses for indirect draws / geometry shaders / tess shaders :)
<alyssa>
cwabbott: I saw the indirect draw mess yesterday
<alyssa>
Needless to say I aim for ES3.0 only ;)
<cwabbott>
I think it's just doing all the magic instance division stuff in the shader
<cwabbott>
instead of in the driver
<alyssa>
Well, the instancing magic is `only` 350 lines in Panfrost so I guess 200 lines of Midgard asm is reasonable t_t
<tomeu>
jcureton: I'm asking because I used to test on a H6 and things didn't seem that bad
<jcureton>
tomeu: i guess anecdotally most stuff seems to run without many issues. most of the ones i've run into have been blending related
<tomeu>
ah, I see
<tomeu>
jcureton: do you get anything in dmesg?
<jcureton>
not when running dEQP, although with most applications the first job submission always results in a DATA_INVALID_FAULT before continuing
<jcureton>
occasionally I'll see a job timeout, particularly running glmark. there are also some graphical glitches in glmark, although i don't remember which tests off the top of my head
<tomeu>
ok, so rendering with blending is just wrong
<jcureton>
^ yes
<tomeu>
ah, and some other issue I guess
<alyssa>
Blending on SFBD is pretty different from MFBD
<alyssa>
So I'm willing to believe we (...I) butchered it
<tomeu>
I could dust off my H6, but not having the board in CI means it will break again
<alyssa>
jcureton: The first job submit thing is probably why dEQP results are so terrible
<jcureton>
hmm, maybe. dEQP doesn't trigger any faults in the kernel