alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
tomeu has quit [Quit: Ping timeout (120 seconds)]
tomeu has joined #panfrost
urjaman has quit [Ping timeout: 240 seconds]
robertfoss has quit [Ping timeout: 240 seconds]
urjaman has joined #panfrost
robertfoss has joined #panfrost
nerdboy has quit [Ping timeout: 276 seconds]
stikonas has quit [Remote host closed the connection]
rcf has quit [Quit: WeeChat 2.5]
rcf has joined #panfrost
Stenzek has quit [Ping timeout: 276 seconds]
megi has quit [Ping timeout: 268 seconds]
Stenzek has joined #panfrost
vstehle has quit [Ping timeout: 250 seconds]
nerdboy has joined #panfrost
davidlt has joined #panfrost
camus has joined #panfrost
kaspter has quit [Quit: kaspter]
kaspter has joined #panfrost
camus has quit [Ping timeout: 265 seconds]
icecream95 has quit [Ping timeout: 265 seconds]
icecream95 has joined #panfrost
vstehle has joined #panfrost
Elpaulo has joined #panfrost
jschwart has joined #panfrost
guillaume_g has joined #panfrost
davidlt has quit [Ping timeout: 240 seconds]
jschwart has quit [Ping timeout: 245 seconds]
_whitelogger has joined #panfrost
chewitt has joined #panfrost
raster has joined #panfrost
icecream95 has quit [Ping timeout: 240 seconds]
xdarklight has quit [Ping timeout: 240 seconds]
xdarklight has joined #panfrost
xdarklight has quit [Quit: ZNC - http://znc.in]
xdarklight has joined #panfrost
megi has joined #panfrost
davidlt has joined #panfrost
<tomeu> alyssa: I'm back to looking at why some tests fail when run in different order, and was wondering if you had thoughts on what gets rendered for dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgb in https://people.collabora.com/~tomeu/TestResults-bad.xml
<tomeu> looks like a bit of a weird artifact
<tomeu> alyssa: another issue when running tests in a different order, but on the t760: http://paste.debian.net/1120899/
<tomeu> (the one before is on the t720)
<tomeu> alyssa: I know you don't trust fault_pointer, but for me it has always told me where the problem was
<tomeu> wonder if there could be a dangling reference to the mali_sampler_descriptor
<tomeu> oh, back to the sampler_count issue :p
<tomeu> well, guess it's cool that the same difference appears in both t720 and t760 even if different stuff breaks
robmur01_ is now known as robmur01
megi has quit [Ping timeout: 276 seconds]
<tomeu> alyssa: you may be able to reproduce those problems on t860 if you run these tests in this order: http://paste.debian.net/1120908/
<alyssa> tomeu: For the dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgb .xml you linked, the first thing I noticed is the issue is entirely along a triangle edge
<alyssa> tomeu: That pointer to sampler_descriptor would be telling, yes.
<alyssa> IME fault_pointer goes broadly to the right place, it just doesn't have the precision to make it overly useful.
<alyssa> tomeu: Running those 6 tests in that order, they all pass here.
<tomeu> damn
<tomeu> alyssa: I'm looking at adding a --no-shuffle flag to the runner :/
<alyssa> I mean
<alyssa> Shuffle is a good thing if we don't break when we do it ....
<tomeu> totally
<tomeu> quite important for product-readiness, IMO
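A minimal sketch of how a runner could keep shuffling while still making a failing order reproducible, assuming a hypothetical `order_tests` helper (not the actual dEQP runner code). The `shuffle=False` path stands in for the `--no-shuffle` flag tomeu mentions; the `seed` parameter is an assumption added here for illustration.

```python
import random


def order_tests(tests, seed=None, shuffle=True):
    """Return the order in which to run a list of test names.

    shuffle=False keeps the input order (the hypothetical --no-shuffle case).
    With shuffle=True and a fixed seed, the shuffled order is reproducible,
    so an order-dependent failure can be rerun exactly.
    """
    ordered = list(tests)  # copy so the caller's list is not modified
    if shuffle:
        # A private Random instance avoids touching global random state.
        random.Random(seed).shuffle(ordered)
    return ordered
```

Logging the seed with each run would let a red run be replayed in the same order later.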
<tomeu> alyssa: btw, wonder if we shouldn't be printing the whole cmdstream, without hiding stuff
<tomeu> and then have tools that one can optionally run to hide stuff or detect weird situations, etc
<tomeu> because I'm always afraid when hunting differences in the cmdstream of some being hidden
<alyssa> tomeu: I mean ... the goal of the pandecode stuff is that it doesn't hide anything that's not already as expected
<alyssa> If you want the verbosity, I mean, by all means we can add a verbose mode but :shrug:
<alyssa> I suspect that has a subtle bug but I don't know what.
<tomeu> alyssa: well, but our expectations may have bugs
<tomeu> and then we don't have tools to debug those
<alyssa> tomeu: If the code doing the expectation has a bug, then yes, that's a problem.
<tomeu> that's what I'm afraid of right now
<alyssa> If the expectation is just flat-out wrong, we do have a tool -- trace the blob; there should be no warnings/errors since that's ground-truth for us.
<tomeu> but what if a difference between the blob and panfrost is hidden because there's more than one possible value that matches the expectations?
<alyssa> tomeu: That's a bug in pandecode, then - expectations must be unambiguous.
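The "unambiguous expectation" idea can be sketched as follows. This is an illustration only, not pandecode's real code (pandecode is written in C); `visible_fields` and the field names are hypothetical. A field is hidden only when it equals its single expected value, so any difference between two traces still shows up.

```python
def visible_fields(fields, expected):
    """Return 'name = value' lines for fields that do not match expectations.

    fields:   dict of field name -> integer value read from the cmdstream
    expected: dict of field name -> the one value considered unremarkable
    A field with no entry in `expected` is always printed.
    """
    lines = []
    for name, value in fields.items():
        if name in expected and value == expected[name]:
            continue  # exactly as expected, so it is safe to hide
        lines.append(f"{name} = {value:#x}")
    return lines
```

If an expectation instead accepted a set or range of values, two traces could differ inside that range and both be hidden, which is the case tomeu is worried about.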
<tomeu> alyssa: why do you think that that patch has a bug?
<alyssa> tomeu: It's right near when the flake starts coming, and everything else around it is unambiguously correct; that's the only one that's complex enough to cause issues I think
<alyssa> But not sure.
<alyssa> since that's coming up green in CI somehow.
<tomeu> grr, when I was young, tests either passed or failed
<tomeu> this flakes thing is pure nonsense
<alyssa> tomeu: FWIW, even without expectation checking / hiding, pandecode is necessarily lossy.
<alyssa> There's always the possibility that a struct is larger than we expect and there are fields after it that we just don't notice.
<alyssa> The only way around that is to do a complete hexdump of the entirety of mapped memory.
<alyssa> And -- to be clear -- that is *exactly* what we did in the early days of panfrost when we didn't know the sizes of anything.
guillaume_g has quit [Quit: Konversation terminated!]
<alyssa> Nowadays that is ... not helpful.
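For reference, the "complete hexdump" approach amounts to something like the sketch below. The function name and the `base` parameter (standing in for the address of a mapped region) are assumptions for illustration, not the early panfrost tooling.

```python
def hexdump(data, base=0, width=16):
    """Format a bytes-like object as hex lines prefixed with an address.

    Nothing is interpreted or hidden: every byte of the region is printed,
    which is lossless but quickly becomes too much output to read.
    """
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{base + off:08x}  {hex_part}")
    return "\n".join(lines)
```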
robertfoss has quit [Ping timeout: 240 seconds]
robertfoss has joined #panfrost
<alyssa> Oh come on.
<alyssa> If I have just the FBD change, green.
<alyssa> If I have just the stack size change, green.
<alyssa> If I have both, flake.
<alyssa> This is disturbing :|
enunes has quit [Read error: Connection reset by peer]
nerdboy has quit [Ping timeout: 268 seconds]
enunes has joined #panfrost
<alyssa> but now
<alyssa> uck.
enunes has quit [Read error: Connection reset by peer]
megi has joined #panfrost
enunes has joined #panfrost
<alyssa> Now the flake seems to have gone away
<alyssa> tomeu: I think one of your ubsan changes might've done it.
<alyssa> Or dumb luck.
<alyssa> ....But then I had messed with the skips file, so that might be spurious. Uhm, rerunning more CI.
cowsay_ has joined #panfrost
cowsay has quit [Ping timeout: 276 seconds]
chewitt has quit [Read error: Connection reset by peer]
chewitt has joined #panfrost
<alyssa> Meh.
<alyssa> Code is landed, so current panfrost master should work with your favourite apps/games/whatever
<alyssa> glamor, neverball, webgl stuff, etc. seem to work okay now.
<sravn> alyssa: how far is panfrost from supporting Chromium? Or maybe it is Chromium that does not yet support panfrost?
<alyssa> sravn: Not sure how much we're missing, iirc it was just buggier than I was comfortable with
<alyssa> Firefox works wonderfully though!
<anarsoul> alyssa: great!
<alyssa> anarsoul: Don't say great yet, I broke CI thanks to a flake :V
<anarsoul> what's with CI?
<alyssa> test is failing in master but only sometimes
<anarsoul> ah, the nastiest kind of failures
<sravn> alyssa: thanks for the quick Chromium update. We have at work some HTML app thingy that today runs only on Chromium - on top of i.MX with etnaviv.
<sravn> With all the wonderful work you and others do we have a much wider choice in the future.
<alyssa> Oh!
<sravn> That future may also bring another browser :-)
<anarsoul> oh nooo
raster has quit [Quit: Gettin' stinky!]
<daniels> alyssa: if you've found the flake, please push a commit to add it to the flake list with my A-b
stikonas has joined #panfrost
nerdboy has joined #panfrost
Lyude has quit [Quit: WeeChat 2.4]
Lyude has joined #panfrost
<alyssa> daniels: It's.. not one flake..
<alyssa> daniels: If I ban the flaking test, 1 other test flakes, etc
<alyssa> if I ban the whole section, one test from the next section flakes
<daniels> eugh ... state leaks then?
<alyssa> Probably.
<daniels> but if you write it off as a flake, it'll still be executed, right? so you eat the flakiness in one spot, and have time to hunt the leak later
<alyssa> Flake list is stuff that's not executed
<alyssa> Which is why it just moves the flake elsewhere
<alyssa> So yeah, a state leak seems likely, but I'm not seeing anything
<alyssa> Oh, here's a theory ... Maybe a BO is being used uninitialized
<alyssa> so state is leaked that way
jschwart has joined #panfrost
<daniels> ahh, I thought that flake still ran but ignored the result :\
<alyssa> Nope
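The two flake-list semantics discussed here differ as sketched below. This is a hypothetical runner, not the real CI code: `run_suite` and the `run_flakes` switch are assumptions. Skipping a flake removes its side effects on later tests, while running it and ignoring the result keeps the execution order intact.

```python
def run_suite(tests, flakes, run_flakes=False):
    """Run (name, callable) pairs and return (executed names, failed names).

    run_flakes=False: listed flakes are not executed at all (the behaviour
    alyssa describes), so any state they would leak never happens.
    run_flakes=True: flakes still run but their result is ignored (what
    daniels expected), so later tests see the same state as before.
    """
    executed, failures = [], []
    for name, fn in tests:
        if name in flakes and not run_flakes:
            continue
        executed.append(name)
        ok = fn()
        if not ok and name not in flakes:
            failures.append(name)
    return executed, failures
```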
<alyssa> Also I tried a whole series of fixes
<alyssa> Still the one flake.
<alyssa> And I can confirm this is nondeterministic
<alyssa> (I have a green CI run)
<alyssa> (Buried between red runs)
<daniels> woop.
<alyssa> :|
* alyssa doesn't know what to do about the flake
abordado has joined #panfrost
<urjaman> can you skip the one before the flaking one?
<urjaman> or does that have no effect?
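urjaman's suggestion generalizes to dropping predecessors one at a time and checking whether the flake still reproduces. A rough sketch, assuming a hypothetical `fails(order)` callback that runs the given tests in order and returns True on failure; since the failure is nondeterministic, each candidate order is tried several times.

```python
def minimize_order(tests, target, fails, attempts=5):
    """Greedily drop tests that run before `target` while it still fails.

    tests:    full list of test names in the failing order
    target:   name of the flaking test
    fails:    hypothetical callback, fails(order) -> bool
    attempts: reruns per candidate, because the failure is intermittent
    """
    prefix = list(tests[:tests.index(target)])
    i = 0
    while i < len(prefix):
        candidate = prefix[:i] + prefix[i + 1:]
        if any(fails(candidate + [target]) for _ in range(attempts)):
            prefix = candidate  # this predecessor was not needed
        else:
            i += 1  # removing it hid the failure, so keep it
    return prefix + [target]
```

With a flake this rare, a small `attempts` value can wrongly keep an unneeded predecessor, so the result is a smaller reproducer rather than a guaranteed minimal one.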
davidlt has quit [Ping timeout: 240 seconds]
jschwart has quit [Ping timeout: 245 seconds]
jschwart has joined #panfrost
warpme_ has quit [Quit: Connection closed for inactivity]
jschwart has quit [Ping timeout: 268 seconds]
<abordado> alyssa: if you're happy with it, can you merge https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036 ?
tlwoerner has quit [Quit: Leaving]
vstehle has quit [Ping timeout: 252 seconds]
vstehle has joined #panfrost