alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
nerdboy has quit [Ping timeout: 260 seconds]
stikonas has quit [Remote host closed the connection]
<icecream95> With 5.5-rc6, even just resizing windows causes a bunch of page faults
<icecream95> *X11 windows - Wayland still works fine.
<icecream95> Xalius: So it looks like it is a kernel regression
<icecream95> I compiled the 5.4.0 panfrost onto 5.5, and that has problems too...
<icecream95> But it's fine after applying https://github.com/bbrezillon/linux/commits/panfrost/5.4-fixes
<icecream95> So it looks like some of the fixes from that branch didn't make it into 5.5
<icecream95> I don't think that will apply onto 5.5, so I'll have to try and manually patch the kernel
warpme_ has quit [Quit: Connection closed for inactivity]
_whitelogger has joined #panfrost
megi has quit [Ping timeout: 265 seconds]
davidlt has joined #panfrost
<icecream95> Here is the patch rebased onto 5.5: https://gitlab.freedesktop.org/snippets/812
<icecream95> Xalius: Does the patch fix your problems?
suihkulokki has quit [Ping timeout: 260 seconds]
jailbox has quit [Ping timeout: 268 seconds]
nerdboy has joined #panfrost
<tomeu> bbrezillon: any ideas what we are missing in master? ^
<icecream95> The texture glmark2 test spends about 20% of the time iterating past null (!) fences in panfrost_flush_all_batches
<rellla> alyssa: do dEQP-GLES2.functional.fragment_ops.depth_stencil.* pass for panfrost if some stencil op is DECR/INCR/DECR_WRAP/INCR_WRAP and the write mask is one with the lower bits not set?
<rellla> these fail on lima, so I wonder if others solve/work around this
<tomeu> icecream95: :D
<tomeu> icecream95: any idea of why?
<icecream95> I didn't follow the asm properly and it's actually null readers in panfrost_bo_access_gc_fences
<icecream95> Is there a NEON instruction to check if all the bits in a vector are 0?
<icecream95> It looks like the array is only cleared if all the readers are null
<icecream95> It would probably be faster to just copy the still-alive readers to the start of the array then resize at the end
<tomeu> icecream95: I don't think that function should be called that often though
<tomeu> I'm also seeing lots of time spent there in STK
<tomeu> nothing that function does should be time-critical, I think there must be some logic problem that manifests like that
<bbrezillon> tomeu: the last 2 patches are missing
<bbrezillon> one of them has been respun by robher
<bbrezillon> icecream95: ack on getting rid of NULL readers and resizing the access->readers array
<bbrezillon> but I'm not sure it will drastically improve things
<bbrezillon> maybe we're just calling panfrost_bo_access_gc_fences() too aggressively
<bbrezillon> tomeu, robmur01, robher: what's the plan regarding those 5.4 fixes, should I send a v3?
<icecream95> 2848edc0eff5570abaac0a4017a9c96ebabbd728 is the first bad commit
<icecream95> panfrost: Fix panfrost_bo_access memory leak
<icecream95> I messed up with perf usage, that wasn't the commit introducing the problem...
yann has quit [Ping timeout: 258 seconds]
<tomeu> icecream95: bbrezillon: just to be sure: you know that the readers arrays just keep growing unbounded?
<icecream95> That's what I was just about to say...
<icecream95> Most of the time nreaders < 3
<tomeu> bbrezillon: do you know what may be going on?
<bbrezillon> icecream95: you can try with http://code.bulix.org/wz85pf-1089443
<bbrezillon> adjust the garbage collection frequency if it's still called too often
pH5 has joined #panfrost
<bbrezillon> tomeu: I don't know yet
<icecream95> The array is only cleared if there are no active readers
<bbrezillon> tomeu: duh, 3862 readers!
<tomeu> bbrezillon: I think the gc frequency should be fine, if readers didn't accumulate
<tomeu> better to stop doing insane things, rather than doing them less often :p
<bbrezillon> I didn't know there were so many readers
<tomeu> access*readers gets quite big
<bbrezillon> I suspect one of the readers is never signaled
<bbrezillon> tomeu: do you know how many of them are active (not signaled)?
<icecream95> I'll repeat if no-one read: *No elements are removed* except when there are *no* active readers
<icecream95> There are usually less than 3 active
<tomeu> icecream95: yeah, that's what bbrezillon is thinking of
<bbrezillon> icecream95: yes, but I wonder why we have unsignaled readers left
<tomeu> guess 3 active in the just submitted batch is fine, but wonder about previous batches
<icecream95> All the elements in the array except the 1-2 active ones are just NULL pointers
<bbrezillon> icecream95: shrinking the array is indeed a good idea
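The compaction discussed above (copy the still-alive readers to the start of the array, then shrink it) might look roughly like the sketch below. This is only an illustration: `struct fence`, its `signaled` field and `compact_readers()` are hypothetical stand-ins, not the actual Mesa types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a batch fence; not the real Mesa structure. */
struct fence {
    int signaled;
};

/*
 * Remove NULL and already-signaled readers in place, packing the
 * survivors at the front of the array. Returns the new reader count,
 * which the caller would use to resize the array.
 */
static size_t compact_readers(struct fence **readers, size_t count)
{
    size_t kept = 0;

    assert(readers != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        struct fence *f = readers[i];

        if (f == NULL || f->signaled)
            continue;            /* dead entry: drop it */

        readers[kept++] = f;     /* live entry: move it toward the front */
    }

    /* Clear the tail so no stale pointers remain past the new count. */
    for (size_t i = kept; i < count; i++)
        readers[i] = NULL;

    return kept;
}
```

With this shape the array length tracks the number of active readers (usually under 3, per the numbers above) rather than growing until every reader happens to be signaled at once.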
Elpaulo has joined #panfrost
warpme_ has joined #panfrost
<icecream95> I have a working patch; I'll make an MR tomorrow
icecream95 has quit [Ping timeout: 240 seconds]
guillaume_g has joined #panfrost
yann has joined #panfrost
<tomeu> bbrezillon: ok, I will move to see why our gles3 results are unstable
Xalius has joined #panfrost
<Xalius> moin
<Xalius> icecream95, I'll try https://gitlab.freedesktop.org/snippets/812 thanks
<Xalius> I think I saw some of those on patchwork, did not all of them make 5.5-rc?
karolherbst has quit [Ping timeout: 272 seconds]
karolherbst has joined #panfrost
raster has joined #panfrost
Xalius has quit [Remote host closed the connection]
guillaume_g has quit [Quit: Konversation terminated!]
guillaume_g has joined #panfrost
indy has quit [Ping timeout: 272 seconds]
Elpaulo has quit [Read error: Connection reset by peer]
Elpaulo has joined #panfrost
megi has joined #panfrost
<tomeu> narmstrong: should I expect the wifi on the nexbox to work?
<narmstrong> tomeu: not sure
<narmstrong> nope, definitely not
<narmstrong> the sdio stuff hasn't been pushed
<tomeu> hmm, just found a mention of it in https://patchwork.kernel.org/patch/10483835/
davidlt has quit [Ping timeout: 258 seconds]
<narmstrong> at the time, the qca9377 ath10k was out of tree, now it should work
buzzmarshall has joined #panfrost
guillaume_g has quit [Quit: Konversation terminated!]
guillaume_g has joined #panfrost
<tomeu> I see
<alyssa> ^ Let's play spot the bug :V
alpernebbi has joined #panfrost
davidlt has joined #panfrost
jstultz has quit []
jstultz has joined #panfrost
<alyssa> I found it. It's something silly
<alyssa> tomeu: ^
<tomeu> ah, cool
<robmur01> bbrezillon: I guess - I don't have much time myself to review/test things just now, but IIRC they all looked pretty reasonable at a glance.
<robmur01> alyssa: Hopefully the compiler sees it anyway, but "if (b != a) { c = a - b; ...}" -> "c = a - b; if(c) {...}" ;)
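The rewrite robmur01 describes can be spelled out as two equivalent functions. This is a sketch with hypothetical names, using unsigned arithmetic so that `a - b` is zero exactly when `a == b` and cannot overflow.

```c
#include <assert.h>

/* Original shape: compare first, subtract inside the branch. */
static unsigned diff_compare_first(unsigned a, unsigned b, int *hits)
{
    unsigned c = 0;

    if (b != a) {
        c = a - b;
        (*hits)++;
    }
    return c;
}

/* Rewritten shape: subtract once, then branch on the result. */
static unsigned diff_subtract_first(unsigned a, unsigned b, int *hits)
{
    unsigned c = a - b;

    if (c)
        (*hits)++;
    return c;
}

/* Self-check over a small range: both shapes must agree everywhere. */
static void check_equivalent(void)
{
    int hits_a = 0, hits_b = 0;

    for (unsigned a = 0; a < 8; a++)
        for (unsigned b = 0; b < 8; b++)
            assert(diff_compare_first(a, b, &hits_a) ==
                   diff_subtract_first(a, b, &hits_b));

    assert(hits_a == hits_b);
}
```

As robmur01 notes, an optimising compiler will likely produce the same code for both; the second form just makes the single subtraction explicit.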
<alyssa> robmur01: Cute ;)
* robmur01 is currently elbow-deep in optimisation...
<robmur01> (of something completely unrelated, unfortunately)
<alyssa> Yeah?
<alyssa> Okay, got it working!!
<alyssa> Never mind that it totally breaks GNOME, *cough*
<daniels> alyssa: SubTexImage into AFBC?
Elpaulo has quit [Ping timeout: 240 seconds]
karolherbst has quit [Ping timeout: 260 seconds]
karolherbst has joined #panfrost
yann has quit [Ping timeout: 265 seconds]
pH5 has quit [Quit: bye]
krh has joined #panfrost
guillaume_g has quit [Quit: Konversation terminated!]
yann has joined #panfrost
<alyssa> daniels: Hmm?
<alyssa> AFBC isn't supported right now (well, it is, but broken and hidden behind a debug flag)
<daniels> alyssa: was trying to figure out the purpose of foo.c
<alyssa> daniels: oh, the issue is that we have a fast path for tiling tile-aligned regions
<alyssa> but if the update isn't tile-aligned, there's a lot of redundancy required in the slow path
<daniels> ah ok
<alyssa> ..so the trick is to splice it up into a tile aligned centre and borders
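A rough sketch of the splitting trick alyssa describes: carve the update rectangle into a tile-aligned centre (eligible for the fast path) and up to four border strips (left to the slow path). The 16-pixel tile edge, `struct rect` and `split_region()` are assumptions made for illustration, not names from the Mesa tree.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 16u  /* assumed tile edge in pixels */

struct rect { unsigned x, y, w, h; };

static unsigned align_down(unsigned v) { return v & ~(TILE - 1u); }
static unsigned align_up(unsigned v)   { return (v + TILE - 1u) & ~(TILE - 1u); }

/*
 * Split `in` into a tile-aligned centre plus border strips.
 * Returns the number of border rectangles written (0 to 4).
 * If no whole tile fits, the centre is empty and the whole
 * region is returned as a single border.
 */
static unsigned split_region(struct rect in, struct rect *centre,
                             struct rect borders[4])
{
    unsigned x0 = in.x, x1 = in.x + in.w;
    unsigned y0 = in.y, y1 = in.y + in.h;
    unsigned cx0 = align_up(x0), cx1 = align_down(x1);
    unsigned cy0 = align_up(y0), cy1 = align_down(y1);
    unsigned n = 0;

    assert(centre != NULL && borders != NULL);

    if (cx0 >= cx1 || cy0 >= cy1) {
        *centre = (struct rect){ 0, 0, 0, 0 };
        if (in.w && in.h)
            borders[n++] = in;
        return n;
    }

    *centre = (struct rect){ cx0, cy0, cx1 - cx0, cy1 - cy0 };

    if (y0 < cy0)   /* top strip, full width */
        borders[n++] = (struct rect){ x0, y0, in.w, cy0 - y0 };
    if (cy1 < y1)   /* bottom strip, full width */
        borders[n++] = (struct rect){ x0, cy1, in.w, y1 - cy1 };
    if (x0 < cx0)   /* left strip, centre rows only */
        borders[n++] = (struct rect){ x0, cy0, cx0 - x0, cy1 - cy0 };
    if (cx1 < x1)   /* right strip, centre rows only */
        borders[n++] = (struct rect){ cx1, cy0, x1 - cx1, cy1 - cy0 };

    return n;
}
```

The centre and borders never overlap and together cover the input exactly, so each pixel is written once.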
<daniels> indeed
<alyssa> Tiling patches seem to help performance substantially
<alyssa> For non-game workloads (desktop environments, GIMP, etc.)
<alyssa> should probably do more rigorous testing but still
bbrezillon has quit [Ping timeout: 260 seconds]
<anarsoul> alyssa: don't forget to add lima label to your MR
Xalius has joined #panfrost
<Xalius> icecream95, your patchset worked
<Xalius> writing this from hexchat in Xwayland ;)
<anarsoul> cool
<Xalius> Gimp works too without crashes, but it's slower than with glamor disabled
<Xalius> font rendering looks different
<Xalius> I applied the set on top of 5.5-rc5 that seems to work
<alyssa> Xalius: working on it :)
<Xalius> my mesa is a couple days old, do I need to upgrade? :P
<alyssa> Xalius: not yet, still WIP
bbrezillon has joined #panfrost
<alyssa> Actually should put WIP: on there
<alyssa> not ready for review, still hacking at stuff
<alyssa> probably ready for testing on lima
<alyssa> anarsoul: ^^
<Xalius> I haven't tried sway on lima, should do that
<anarsoul> alyssa: I'll test it later today
<alyssa> anarsoul: Cool
<anarsoul> texture tests in deqp spent 30% of time in tiling routines
<alyssa> Next up will be templating the optimized routine (with macros) so we get a fast path for bpp1/2/4/8
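One way the macro templating could be done, shown on a plain strided copy rather than the real tiling routine. `DEFINE_COPY_RECT`, `copy_rect_*` and `copy_rect()` are hypothetical names; "bpp" is taken to mean bytes per pixel, as in the bpp1/2/4/8 discussion, and buffers and strides are assumed to be suitably aligned for the pixel type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Generate copy_rect_<name>(): copy a w x h block of `type` pixels
 * between two strided buffers. Strides are in bytes.
 */
#define DEFINE_COPY_RECT(name, type)                                      \
static void copy_rect_##name(void *dst, const void *src,                  \
                             unsigned w, unsigned h,                      \
                             unsigned dst_stride, unsigned src_stride)    \
{                                                                         \
    for (unsigned y = 0; y < h; y++) {                                    \
        type *d = (type *)((uint8_t *)dst + (size_t)y * dst_stride);      \
        const type *s =                                                   \
            (const type *)((const uint8_t *)src + (size_t)y * src_stride);\
        for (unsigned x = 0; x < w; x++)                                  \
            d[x] = s[x];                                                  \
    }                                                                     \
}

DEFINE_COPY_RECT(bpp1, uint8_t)
DEFINE_COPY_RECT(bpp2, uint16_t)
DEFINE_COPY_RECT(bpp4, uint32_t)
DEFINE_COPY_RECT(bpp8, uint64_t)

/* Dispatch on bytes per pixel; returns 0 if there is no fast path. */
static int copy_rect(unsigned bpp, void *dst, const void *src,
                     unsigned w, unsigned h,
                     unsigned dst_stride, unsigned src_stride)
{
    assert(dst != NULL && src != NULL);

    switch (bpp) {
    case 1: copy_rect_bpp1(dst, src, w, h, dst_stride, src_stride); return 1;
    case 2: copy_rect_bpp2(dst, src, w, h, dst_stride, src_stride); return 1;
    case 4: copy_rect_bpp4(dst, src, w, h, dst_stride, src_stride); return 1;
    case 8: copy_rect_bpp8(dst, src, w, h, dst_stride, src_stride); return 1;
    default: return 0;   /* e.g. bpp3/6/12/16: caller uses the generic path */
    }
}
```

Each instantiation gets a fixed-size element type, so the compiler can unroll and vectorise the inner loop per format instead of handling a runtime `bpp`.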
<anarsoul> sounds good
<anarsoul> Xalius: you want mesa from git master for lima
<anarsoul> don't try 19.3
<alyssa> robher: warned me about that :)
<Xalius> yeah I'm only a couple days behind master
<anarsoul> couple days behind is fine
<anarsoul> although we merged some fixes
<anarsoul> and some are pending
<anarsoul> e.g. polygon offset fixes were merged recently - you'll get flickering shadows in q3a without it
steev has quit [Ping timeout: 252 seconds]
ezequielg has quit [Read error: Connection reset by peer]
marex-cloud has quit [Ping timeout: 252 seconds]
robher has quit [Ping timeout: 245 seconds]
jstultz has quit [Ping timeout: 245 seconds]
warpme_ has quit [Ping timeout: 272 seconds]
anarsoul|c has quit [Ping timeout: 268 seconds]
anarsoul|c has joined #panfrost
warpme_ has joined #panfrost
jstultz has joined #panfrost
robher has joined #panfrost
<Xalius> I tried openarena quickly and that looked ok
ezequielg has joined #panfrost
<alyssa> is there any reason to tile formats that aren't bpp1/2/4/8
<alyssa> currently we also support bpp3/6/12/16 but maybe those should just be linear.
<anarsoul> alyssa: what about etc?
ezequielg has quit [Ping timeout: 260 seconds]
Xalius has quit [Ping timeout: 265 seconds]
bbrezillon has quit [Ping timeout: 272 seconds]
bbrezillon has joined #panfrost
Xalius has joined #panfrost
stikonas has joined #panfrost
indy has joined #panfrost
raster has quit [Quit: Gettin' stinky!]
<anarsoul> alyssa: it crashes right away on lima
<anarsoul> double free or corruption (out)
<anarsoul> but you can see it in CI
<anarsoul> even for panfrost
davidlt has quit [Ping timeout: 272 seconds]
buzzmarshall has quit [Remote host closed the connection]
Xalius has quit [Quit: Leaving]
raster has joined #panfrost
icecream95 has joined #panfrost
raster has quit [Client Quit]
raster has joined #panfrost
<alyssa> anarsoul: Hngh
<alyssa> anarsoul: Can't repro locally :V
<anarsoul> alyssa: try deqp?
Elpaulo has joined #panfrost
<alyssa> I am...
<alyssa> seems fine here ..
<anarsoul> but it fails in CI
<alyssa> i see that
<alyssa> what test cases are failing
stikonas has quit [Remote host closed the connection]
<anarsoul> alyssa: most of them? :)
<alyssa> I'm not seeing any issues here.
stikonas has joined #panfrost
<alyssa> and I can't debug what I can't reproduce.
<anarsoul> I'll look into it later
<anarsoul> alyssa: you also may want to run it through valgrind
alpernebbi has quit [Quit: alpernebbi]
raster has quit [Quit: Gettin' stinky!]
warpme_ has quit [Quit: Connection closed for inactivity]
<anarsoul> btw -O3 alone cuts 15 seconds (out of 3min 35sec) of functional.texture.* tests
<anarsoul> alyssa: why did you change src_x to x in panfrost_access_tiled_image_generic()?
NeuroScr has joined #panfrost
<anarsoul> alyssa: so in lima we're using a staging buffer for transfer, and its dimensions are the box that's passed to lima_transfer_map()
<anarsoul> alyssa: you basically forgot that src dimensions are not the same as dst
<anarsoul> however it's buggy (doesn't do tiling/untiling correctly)
anarsoul|c has quit [Quit: Connection closed for inactivity]
<anarsoul> also "fast path" doesn't seem to be so fast, there's negligible difference in functional.texture.* run time (3m20s vs 3m19s)
<alyssa> anarsoul: the fast path is specifically for bpp4
<alyssa> texture.* tests a lot of everything
<anarsoul> fair enough
<alyssa> anarsoul: "you basically forgot that src dimensions ... " this was intentionally changed.
<alyssa> the attached gist is what I had before but that was obviously buggy
<anarsoul> then I guess you have to fix users? :)
<alyssa> I did?
<alyssa> The users are internal to that file
<alyssa> you changed the users
<alyssa> (in the gist)
<alyssa> - void *dst_origin = (void *) ((uint8_t *) (dst) - y * src_stride - x*bpp);
<anarsoul> yet you get invalid pointer with this ptr arithmetic
<alyssa> ^ ...that should be dst_stride
<alyssa> anarsoul: Correct.
<alyssa> That's okay -- we're changing the semantics of these (completely file-internal) routines
<alyssa> and then x,y,w,h is interpreted as a valid region, anything outside that is out-of-bounds
<alyssa> Your static analyzer won't pick up on that but it doesn't make the logic wrong, and it simplifies stuff
<alyssa> Perhaps that's an *ugly* solution but. in theory it is fine
<alyssa> I think the dst_stride thing is the problem, running through CI
<anarsoul> yeah, fixing dst_origin fixes the crash
<anarsoul> let me try it on weston
<alyssa> The alternative btw is to add crazy offsets to the _generic calls for the borders. Which it sounds like you would be more comfortable with (since it gets rid of the invalid pointers, even though they won't actually be accessed)
NeuroScr has quit [Quit: NeuroScr]
<alyssa> I'm fine with either solution
NeuroScr has joined #panfrost
<anarsoul> alyssa: I think the issue is that you're trying to combine src_x/src_y and dst_x and dst_y
<anarsoul> I think it'll improve readability
<anarsoul> pointer arithmetic is error-prone
<anarsoul> uh, I guess I missed one sentence :)
raster has joined #panfrost
<anarsoul> just use src_x/src_y/dst_x/dst_y args
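A sketch of what anarsoul's suggestion could look like: keep separate source and destination origins and strides, so no pointer is ever formed outside either buffer. It copies linearly (no tiling) to keep the addressing visible; `copy_region()` and its parameters are hypothetical and not the signature of `panfrost_access_tiled_image_generic()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a w x h pixel region from (src_x, src_y) in `src` to
 * (dst_x, dst_y) in `dst`. Strides are in bytes, bpp is bytes
 * per pixel. Source and destination may have different sizes,
 * e.g. a small staging buffer and a full image.
 */
static void copy_region(uint8_t *dst, unsigned dst_stride,
                        unsigned dst_x, unsigned dst_y,
                        const uint8_t *src, unsigned src_stride,
                        unsigned src_x, unsigned src_y,
                        unsigned w, unsigned h, unsigned bpp)
{
    /* Each row must fit inside its own buffer's stride. */
    assert((size_t)(dst_x + w) * bpp <= dst_stride);
    assert((size_t)(src_x + w) * bpp <= src_stride);

    for (unsigned row = 0; row < h; row++) {
        uint8_t *d = dst + (size_t)(dst_y + row) * dst_stride
                         + (size_t)dst_x * bpp;
        const uint8_t *s = src + (size_t)(src_y + row) * src_stride
                               + (size_t)src_x * bpp;

        memcpy(d, s, (size_t)w * bpp);
    }
}
```

For the lima case above, the staging buffer would be passed with `src_x = src_y = 0` and its own stride, while `dst_x`/`dst_y` carry the box origin inside the image.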
raster has quit [Quit: Gettin' stinky!]
<icecream95> GALLIUM_HUD works fine with SFBD...
Depau has quit [Ping timeout: 265 seconds]
Depau has joined #panfrost
<alyssa> icecream95: Probably because sRGB is totally nop'd out for SFBD so there's nothing to expose ---> nothing to break? :P
<icecream95> MALI_MFBD_FORMAT_SRGB doesn't seem to do anything
<icecream95> The only values of mali_rt_format.flags that seem to have any effect are those with (bit1 ^ bit2) set, giving a black window
tgall_foo has quit [Ping timeout: 268 seconds]
Depau has quit [Ping timeout: 258 seconds]