<alyssa>
Very often tests that fail are tests that fault
<alyssa>
e.g. the FBO tests
chewitt has quit [Quit: Zzz..]
chewitt has joined #panfrost
<tomeu>
alyssa: when I enabled kernel messages the other day, no faults were visible
<tomeu>
and I think an earlier run today showed just one gpu timeout
<alyssa>
tomeu: We're running new tests because of the skip list, no..?
<tomeu>
but I should have been running with the same skip list already
<alyssa>
:|
<bbrezillon>
robher: I think we have a problem with the gem_close() logic
<bbrezillon>
userspace might close GEMs that are still referenced by the GPU
<bbrezillon>
(because the job has been queued but not yet executed or is being executed)
<bbrezillon>
in the panfrost_gem_close() function, we tear down the MMU mapping and release the mm_node
<bbrezillon>
which leads to pagefaults
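One possible direction for the gem_close() problem above, sketched in plain userspace C: let each queued job hold a reference on the BO's mapping, so closing the handle only drops one reference and teardown waits for the last user. Every name here (`bo_mapping`, `mapping_get`, `job_submit`, and so on) is hypothetical and does not correspond to the real panfrost or DRM structures; it only illustrates the refcounting idea, not an actual fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a BO's GPU mapping (not a real panfrost struct). */
struct bo_mapping {
    int refcount; /* one ref for the userspace handle, one per in-flight job */
    bool mapped;  /* true while the MMU mapping and mm_node are still alive */
};

/* A freshly created BO is mapped and owned by its userspace handle. */
static void mapping_init(struct bo_mapping *m)
{
    m->refcount = 1;
    m->mapped = true;
}

static void mapping_get(struct bo_mapping *m)
{
    assert(m->refcount > 0);
    m->refcount++;
}

/* Tear the mapping down only when the last reference goes away. */
static void mapping_put(struct bo_mapping *m)
{
    assert(m->refcount > 0);
    if (--m->refcount == 0)
        m->mapped = false;
}

/* A queued job pins the mapping until it has finished executing. */
static void job_submit(struct bo_mapping *m) { mapping_get(m); }
static void job_done(struct bo_mapping *m)   { mapping_put(m); }

/* Closing the handle drops only the handle's own reference. */
static void handle_close(struct bo_mapping *m) { mapping_put(m); }
```

With this shape, calling `handle_close()` while a job is still queued leaves `mapped` true, and the mapping is released only once `job_done()` runs.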
<bbrezillon>
2nd problem we have is with the GEM shrinker
<alyssa>
tomeu: the good news is I have some code for you to debug ;)
<alyssa>
lava-ci-small-tiling
<alyssa>
It's rather crude but it should implement all the cases I know of.
<alyssa>
No heuristic, but it should bring us to parity with the T860 tiler code.
<bbrezillon>
AFAICT, drm_gem_shmem_is_purgeable() does not take into account the fact that the BO might still be used by the GPU, even though userspace marked it as purgeable
<tomeu>
alyssa: nice! will integrate
<tomeu>
alyssa: if we run tests that cause faults, then we should rerun failed tests individually at the end, otherwise random tests will randomly fail at random
<tomeu>
because in the skips file we don't only have flip-flops, but more importantly tests that cause otherwise stable tests to flip-flop
<tomeu>
with the number of failures that we have atm, rerunning tests should cost us only a couple of seconds
<tomeu>
when we start running on gles3 it will be a different matter, but maybe we'll want to start with a massive skips file there
<alyssa>
Meep.
<alyssa>
tomeu: lmk if that branch, like, breaks everything
<alyssa>
it is 100% untested as far as t720 goes ;p
<tomeu>
alyssa: regarding "...We really need a quirks framework...", I thought we would be going with something à la MIDGARD_ADVANCED_TILING_UNIT for 720, 820 and 830
<alyssa>
but I trust you'll figure out how to debug it :)
<tomeu>
alyssa: want me to take the skips and tiling branches into my next MR?
<tomeu>
as it's all interdependent
NeuroScr has quit [Ping timeout: 276 seconds]
fysa has joined #panfrost
fysa has quit [Remote host closed the connection]
<robmur01>
bbrezillon: yeah, raster and I have been noticing that too - one thought was that the job might need to hold a reference on the AS, to prevent that being pulled out from underneath still-referenced BOs
<bbrezillon>
robmur01: AFAICT you'd need more than that
<alyssa>
tomeu: Sure
<alyssa>
tomeu: Besides those, is there any upstreaming left?
<alyssa>
(and CI?)
<tomeu>
alyssa: I think that's all
<tomeu>
alyssa: want me to add a MIDGARD_ADVANCED_TILING_UNIT quirk?
<tomeu>
alyssa: looks like I should start debugging the tiling patch :)
<alyssa>
tomeu: :)
<alyssa>
If you want to add a screen->quirks field more broadly, one that covers both tiling and SFBD and eventually errata, that could be done
<alyssa>
Or if we want features/issue testing separately like the kernel
* alyssa
shrugs
<tomeu>
alyssa: I thought our quirks thing would cover all of that
<alyssa>
tomeu: Sounds good
<tomeu>
ack!
<alyssa>
quack!
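A minimal sketch of what a screen-level quirks field like the one discussed above could look like, assuming a simple bitmask looked up from the GPU ID at screen creation. The flag names, the `screen` struct and the per-GPU assignments are all hypothetical; only the idea of covering tiling and SFBD with one field, and the 720/820/830 grouping for the tiling quirk, come from the conversation.

```c
#include <stdint.h>

/* Hypothetical quirk bits; names and values are illustrative only. */
#define QUIRK_ALT_TILING (1u << 0) /* GPU needs the alternate tiler path */
#define QUIRK_SFBD       (1u << 1) /* GPU uses single framebuffer descriptors */

/* Hypothetical stand-in for the driver's screen object. */
struct screen {
    uint32_t gpu_id;
    uint32_t quirks;
};

/* Map a GPU ID to its quirk mask. The SFBD assignment is an assumption. */
static uint32_t quirks_for_gpu(uint32_t gpu_id)
{
    switch (gpu_id) {
    case 0x720:
        return QUIRK_ALT_TILING | QUIRK_SFBD;
    case 0x820:
    case 0x830:
        return QUIRK_ALT_TILING;
    default:
        return 0; /* no known quirks */
    }
}

/* Compute the quirks once, when the screen is created. */
static void screen_init(struct screen *s, uint32_t gpu_id)
{
    s->gpu_id = gpu_id;
    s->quirks = quirks_for_gpu(gpu_id);
}

/* Returns nonzero if the screen has the given quirk bit set. */
static int screen_has_quirk(const struct screen *s, uint32_t quirk)
{
    return (s->quirks & quirk) != 0;
}
```

Call sites would then test `screen_has_quirk(screen, QUIRK_ALT_TILING)` instead of comparing GPU IDs directly, and errata bits could later be added to the same mask.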
guillaume_g has quit [Quit: Konversation terminated!]
guillaume_g has joined #panfrost
chewitt has quit [Quit: Zzz..]
<tomeu>
alyssa: is it expected that the polygon list size for 0x41 is just 0x200? the blob uses 0xff200