#panfrost on 2020-08-12 — irc logs at freenode.irclog.whitequark.org

2019-09-06 11:20 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:01 unoccupied has quit [Ping timeout: 240 seconds]

00:18 unoccupied has joined #panfrost

00:43 daniels has quit [Excess Flood]

00:43 daniels has joined #panfrost

00:45 raster has quit [Quit: Gettin' stinky!]

00:58 stikonas has quit [Remote host closed the connection]

01:03 vstehle has quit [Ping timeout: 256 seconds]

01:36 kaspter has joined #panfrost

01:36 kaspter has quit [Excess Flood]

01:37 kaspter has joined #panfrost

03:18 davidlt has joined #panfrost

03:35 rhyskidd has joined #panfrost

04:04 camus1 has joined #panfrost

04:05 kaspter has quit [Ping timeout: 264 seconds]

04:05 camus1 is now known as kaspter

04:15 davidlt has quit [Ping timeout: 240 seconds]

04:55 kaspter has quit [Ping timeout: 260 seconds]

04:55 camus1 has joined #panfrost

04:57 camus1 is now known as kaspter

05:00 vstehle has joined #panfrost

05:06 awordnot has quit [Read error: Connection reset by peer]

05:07 awordnot has joined #panfrost

05:20 kaspter has quit [Ping timeout: 264 seconds]

05:20 kaspter has joined #panfrost

05:21 davidlt has joined #panfrost

05:32 _whitelogger has joined #panfrost

05:40 <tomeu> alyssa: the problem is that, at the highest frequency in rk3399 op1, the initial voltage isn't enough

05:41 <tomeu> as an alternative to switching to a kernel that can change the voltage, you can use this patch: https://gitlab.freedesktop.org/tomeu/linux/-/commit/854a9ee2a93fd716e57eaf98f3b9daae2100565e

05:41 <tomeu> which basically removes the 800 MHz level

06:34 <bbrezillon> robmur01: Hi! I've been chasing an issue we have on s922 (amlogic) (tomeu, narmstrong and/or chewitt probably reported it here a while ago) where the first few jobs we start on a new GL context fail with faults (DATA_RANGE_FAULT, TILE_RANGE_FAULT, ...)

06:39 <bbrezillon> things stabilize after a while, but I found out that disabling the BO cache in panfrost makes things worse (it basically faults on every BO we pass, unless it's already been passed to a previous job)

06:41 <bbrezillon> after further investigation it seems to be caused by the shareability attribute when adding pages to the page table

06:42 <bbrezillon> when I force it to non-shareable (instead of inner-shareable), the faults disappear (and that's also what mali_kbase/libmali seem to use), but I'm not sure I understand what happens here

07:58 raster has joined #panfrost

08:24 davidlt has quit [Ping timeout: 260 seconds]

08:34 nlhowell has joined #panfrost

09:05 nhp[m] has quit [Quit: killed]

09:05 clementp[m] has quit [Quit: killed]

09:05 Ke has quit [Quit: killed]

09:05 l-as has quit [Quit: killed]

09:11 stikonas has joined #panfrost

09:12 l-as has joined #panfrost

09:25 icecream95 has joined #panfrost

09:30 clementp[m] has joined #panfrost

09:30 Ke has joined #panfrost

09:30 nhp[m] has joined #panfrost

09:46 paulk-leonov has quit [Ping timeout: 240 seconds]

09:48 paulk-leonov has joined #panfrost

09:53 <icecream95> raster: It was probably 40b99bb79e1 ("panfrost: Revert "Disable frame throttling"") in Mesa that improved things for you

09:54 <raster> icecream95: not a kernel change?

09:54 <raster> this was on my list of annoyances to look into...

09:57 <icecream95> There's still no real scheduling, but at least GPU-heavy applications don't fill the job queue with too many jobs anymore

10:13 <icecream95> alyssa: With AFBC, glmark2-es2 -b texture is probably still slower than before 528e132d4f7

10:13 <robmur01> bbrezillon: that all seems to chime with the working theory of (at least some part of) the cache being f'ed

10:14 <robmur01> shareability may well affect how things allocate into the caches in the first place

10:26 raster has quit [Ping timeout: 246 seconds]

10:30 <bbrezillon> robmur01: I tried invalidating/flushing the MMU and L2 caches aggressively, but still had the issue

10:31 <bbrezillon> so I'm wondering what in this inner shareability domain could influence the cache entries

10:32 <robmur01> bbrezillon: my gut feeling is that allocating into L2 is most likely the problem, so invalidating is liable to make it worse ;)

10:32 <bbrezillon> (again, not sure what is in the inner domain in that case, and I won't pretend I get all the subtleties of the shareability concept)

10:32 jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

10:35 <robmur01> it seems plausible that NS might mean bypassing (shared) L2 and allocating directly into L1 (and possibly subsequent evictions from L1 out to L2 might not be broken)

10:36 jernej has joined #panfrost

10:36 <bbrezillon> hm, ok, so you think it's an issue that only impacts amlogic

10:37 <robmur01> yup, I'm fairly convinced it's some issue in their integration

10:38 <robmur01> if I could un-break my Juno I'd flash a G52 onto it ;)

10:38 * robmur01 might have to brave the faff of trying to book a trip into the office

10:43 <bbrezillon> robmur01: hm, my bad, mainline sets the outer-shareable attribute, not inner-shareable

10:45 <robmur01> note that there's more awkwardness with shareability - for Midgard "LPAE" the inner domain is essentially "everything in the GPU" while the outer domain is "the rest of the system"

10:45 <bbrezillon> which is called "SHARED_BOTH" in mali_kbase BTW

10:46 <bbrezillon> hm, ok

10:46 <robmur01> however with AArch64 format the meanings are changed to work more like VMSA

10:47 <robmur01> I *think* it becomes more like inner = shader core and outer = whole GPU

10:47 <bbrezillon> how do you think we should fix that?

10:47 <bbrezillon> 1. add an IO_PGTABLE_QUIRK_ARM_SH_NS flag

10:47 <bbrezillon> 2. use AArch64 tables

10:48 <bbrezillon> actually, I didn't check what's used when running in AArch64 mode
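
[Editor's note] The first option above can be pictured with a small model. This is a Python sketch, not kernel code. It assumes the generic Arm LPAE descriptor layout, where the SH field sits in bits [9:8] (0b00 non-shareable, 0b10 outer-shareable, 0b11 inner-shareable). As robmur01 points out in the log, Mali may interpret these domains differently. QUIRK_ARM_SH_NS is a hypothetical stand-in for the proposed IO_PGTABLE_QUIRK_ARM_SH_NS flag, and its bit value is invented for illustration. The outer-shareable default follows what the log says mainline sets.

```python
# Generic Arm LPAE page descriptor: the SH[1:0] field lives in bits [9:8].
SH_SHIFT = 8
SH_MASK = 0x3 << SH_SHIFT

SH_NS = 0x0 << SH_SHIFT  # non-shareable
SH_OS = 0x2 << SH_SHIFT  # outer-shareable
SH_IS = 0x3 << SH_SHIFT  # inner-shareable

# Hypothetical quirk bit modelled on the flag proposed in the log.
# The value is an assumption made up for this sketch.
QUIRK_ARM_SH_NS = 1 << 0


def set_shareability(pte: int, quirks: int = 0) -> int:
    """Return pte with its SH field rewritten according to quirks.

    Without the quirk the entry is marked outer-shareable; with the
    quirk it is forced to non-shareable.
    """
    sh = SH_NS if quirks & QUIRK_ARM_SH_NS else SH_OS
    return (pte & ~SH_MASK) | sh


def shareability_name(pte: int) -> str:
    """Decode the SH field of pte into a readable name."""
    names = {
        SH_NS: "non-shareable",
        SH_OS: "outer-shareable",
        SH_IS: "inner-shareable",
    }
    return names.get(pte & SH_MASK, "reserved")
```

Calling set_shareability(pte, QUIRK_ARM_SH_NS) changes only bits [9:8], so the rest of the descriptor (address, permissions, memory attributes) is left alone.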

10:49 kaspter has quit [Ping timeout: 256 seconds]

10:50 kaspter has joined #panfrost

10:50 raster has joined #panfrost

10:54 <robmur01> it probably makes sense to hook up AArch64 properly for Bifrost, which might mean some fiddling with io-pgtable attributes to match kbase

11:05 <tomeu> when I tried AArch64, I saw the same issues

11:05 <tomeu> fwiw :)

11:15 <bbrezillon> tomeu: do you have a branch to share?

11:15 <tomeu> don't think so, let me check in reflog

11:16 <tomeu> ah, found it :)

11:17 <tomeu> bbrezillon: snippets

11:17 <tomeu> ahem

11:17 <tomeu> https://gitlab.freedesktop.org/snippets/1138

11:18 <robmur01> BTW, is the GPU revision in S922 r0px or r1px?

11:20 <bbrezillon> robmur01: "GPU identified as 0x2 arch 7.2.1 r0p0 status 0"

11:21 <robmur01> cool, thanks

11:22 <robmur01> (apparently that generation of GPUs played the nasty Cortex-A9 trick of having significant functional differences between revisions)

11:35 <icecream95> robmur01: rvgl is working great for me - the only problem I have is stuttering during shader compilation, but after the first lap everything is smooth

11:36 <tomeu> hmm, wonder how hard it would be to add disk cache support to Panfrost

11:42 <robmur01> icecream95: on RK3399 (using the binary slurped out of the odroid arm64 .deb) occasionally black cones appear on the faces of all the wheels, and the sky on the stunt arena glitches between textured and black

11:45 <robmur01> (also it segfaults a fair bit, and my Xbox 360 controller has an annoying tendency to pull to the left... might be time to dig out the Dreamcast for some 'proper' Re-Volt again :D)

11:50 davidlt has joined #panfrost

11:57 <icecream95> robmur01: What Mesa version are you using? The "black cones" on wheels and the stunt arena sky have been fixed for months...

12:02 <robmur01> hmm, it *should* be git master, but I suppose it's possible that SDL somehow gets around my ld.so.conf trick and gets the distro mesa instead - I'll double-check

12:16 raster has quit [Remote host closed the connection]

12:17 raster has joined #panfrost

12:21 <robmur01> nope, just built master as of right now, hacked panfrost_model_name() to verify in-game that it's picking up the right mesa, and the glitches are very much still there

12:30 <robmur01> (if it matters, I'm using ES1.1 mode without shaders)

12:35 <robmur01> bbrezillon: actually, there is another possible reason for wonky cache behaviour...

12:37 <robmur01> can you try adding "dma-coherent;" to the DT node and implementing the equivalent of this old hack: http://www.linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=e6d10fe5f8ab24144cc6efa20aba2c40eb8b8928

12:38 <robmur01> downstream implies that this guy is actually I/O-coherent: https://github.com/khadas/linux/blob/khadas-vim3-pie/arch/arm64/boot/dts/amlogic/mesong12a-bifrost.dtsi

12:39 <icecream95> robmur01: I'm using GL3 mode (Profile=0 in rvgl.ini, and PAN_MESA_DEBUG=gl3 mesa_glthread=true)

12:39 <robmur01> the pgprot_writecombine() mapping means who knows what stale crap is sat (clean) in the CPU cache for snoops to hit

12:40 <robmur01> non-shareable should happen to have the side effect of making accesses non-snooping, and thus having no possibility of inadvertently hitting stale CPU cache lines

12:42 <bbrezillon> robmur01: the GPU reports ACE-Lite support

12:42 <bbrezillon> but mali_kbase ignores it and sets the coherency reg to non-coherent

12:42 <bbrezillon> I'm trying your suggestion

12:45 <robmur01> another trick is to run some kind of memory benchmark/test in the background to thrash the CPU cache and make sure BOs don't get a chance to hang around in there

12:46 <robmur01> if that visibly reduces the appearance of faults it would point strongly to this

12:46 <robmur01> (this is pretty much what I was doing with Juno last year)
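
[Editor's note] A minimal sketch of the background cache-thrashing idea, assuming a 64-byte cache line and a buffer much larger than the last-level cache. Both numbers are guesses, not values from the log. The function names are invented for illustration. Python's interpreter overhead makes this a weak thrasher, so a native memory benchmark would likely evict lines much faster. The point here is only the access pattern: one write per cache line, swept repeatedly.

```python
import time

CACHE_LINE = 64                # bytes; assumed cache line size
BUF_SIZE = 64 * 1024 * 1024    # assumed to be well beyond the last-level cache


def thrash(buf: bytearray, stride: int = CACHE_LINE) -> int:
    """Write one byte per cache line across buf; return the lines touched."""
    touched = 0
    for off in range(0, len(buf), stride):
        buf[off] = (buf[off] + 1) & 0xFF
        touched += 1
    return touched


def run(seconds: float, size: int = BUF_SIZE) -> int:
    """Sweep a size-byte buffer for roughly `seconds`.

    Always performs at least one full sweep, then keeps sweeping until
    the deadline passes. Returns the total number of lines touched.
    """
    buf = bytearray(size)
    deadline = time.monotonic() + seconds
    total = 0
    while True:
        total += thrash(buf)
        if time.monotonic() >= deadline:
            break
    return total
```

To try it, call run(60.0) from a separate process while the GL workload starts up, then compare how often the faults appear with and without it.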

12:47 icecream95 has quit [Quit: leaving]

12:52 <bbrezillon> robmur01: hm, so you force ARM_LPAE_PTE_SH_IS ?

12:54 <robmur01> Midgard needs OS to emit snoops properly (which is what the patch does), but Bifrost may be different and do so anyway

12:54 <robmur01> (I'm trying to look that up ATM)

12:57 <bbrezillon> nope, it doesn't help

12:58 <bbrezillon> uh, wait

12:58 kaspter has quit [Quit: kaspter]

12:59 <bbrezillon> forgot to update the dtb

13:02 <bbrezillon> robmur01: nope, that's even worse, now I have translation faults

13:04 <robmur01> oh, you'd probably need the earlier patch to allow coherent table walks too - http://www.linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=92ef9018fbe1b412041af759a330122058da983e

13:04 <robmur01> then make sure cfg->coherent_walk gets set

13:05 <robmur01> Maybe I should try to update that branch properly...

13:12 raster has quit [Quit: Gettin' stinky!]

13:14 <bbrezillon> robmur01: nope, still failing

13:15 <bbrezillon> with INSTR_INVALID_ENC faults now

13:17 raster has joined #panfrost

13:17 <bbrezillon> I guess there's a good reason libmali disables coherency and sets the shareability attrib to NS

13:17 <bbrezillon> well, libmali+kbase

13:44 <alyssa> [/scroll goto -10

13:46 <alyssa> icecream95: reality check, thanks, still need to figure out what to do about that

13:59 kaspter has joined #panfrost

16:52 <alyssa> /me thinks her ducks are in a row to refactor formats

16:54 <alyssa> I refactored everything so the { swizzle, format, sRGB, zero } 22-bits are all together in textures/attributes

16:54 <alyssa> so now need to redo the table mapping PIPE to MALI to actually map PIPE to { swizzle, MALI, sRGB, zero }, all packed at compile-time

16:55 <alyssa> then the explicit format swizzle handling goes away on <= v6 (Midgard, G71, G72)

16:56 <alyssa> when we start paying attention to v7 (later Bifrost, without HAS_SWIZZLES quirk), we can do the same trick, just without the full swizzle (only "swap r/b" etc bits)

16:56 <alyssa> So then we'll end up with two lookup tables depending on version, things are a lot cleaner, less runtime work too :)

16:58 <alyssa> table itself can be done compactly with some macros, and also some python slop to ingest our current source code + Gallium format list and help do the generation
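
[Editor's note] The table alyssa describes could be generated along these lines. This is a sketch under stated assumptions, not the Mesa implementation. The field widths (four 3-bit swizzle channels, an 8-bit hardware format, one sRGB bit, one zero bit, 22 bits in total) are a guess that happens to match the 22-bit figure in the log. The channel selector values and the hardware format number 0x01 are placeholders. Only the PIPE_FORMAT_* names are real Gallium identifiers.

```python
# Assumed layout of the 22-bit word, from bit 0 upwards:
#   [11:0]  swizzle, four channels of 3 bits each
#   [19:12] hardware format
#   [20]    sRGB flag
#   [21]    must-be-zero bit
SWIZZLE_BITS = 12
FORMAT_BITS = 8

# Placeholder 3-bit channel selectors (assumed values).
CHANNELS = {"R": 0, "G": 1, "B": 2, "A": 3, "0": 4, "1": 5}


def pack_swizzle(swz: str) -> int:
    """Pack a four-character swizzle such as 'BGRA' into 12 bits."""
    if len(swz) != 4:
        raise ValueError("swizzle needs exactly four channels")
    value = 0
    for i, ch in enumerate(swz):
        value |= CHANNELS[ch] << (3 * i)
    return value


def pack_format(swz: str, hw_format: int, srgb: bool = False) -> int:
    """Pack { swizzle, format, sRGB, zero } into one 22-bit integer."""
    if not 0 <= hw_format < (1 << FORMAT_BITS):
        raise ValueError("hardware format does not fit in 8 bits")
    return (
        pack_swizzle(swz)
        | (hw_format << SWIZZLE_BITS)
        | (int(srgb) << (SWIZZLE_BITS + FORMAT_BITS))
    )  # bit 21 (zero) is left clear


# Hypothetical table entries; 0x01 is a placeholder format number.
FORMAT_TABLE = {
    "PIPE_FORMAT_R8G8B8A8_UNORM": pack_format("RGBA", 0x01),
    "PIPE_FORMAT_B8G8R8A8_UNORM": pack_format("BGRA", 0x01),
    "PIPE_FORMAT_R8G8B8A8_SRGB": pack_format("RGBA", 0x01, srgb=True),
}
```

With every entry precomputed like this, the driver would only need a table lookup at runtime. A v7 variant could reuse the same shape with a narrower swizzle field.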

17:32 <robmur01> BTW kmscube no longer works on RK3399 at the moment (mesa master, kernel 5.8) - "failed to set mode: Invalid argument" - will that be the AFBC modifier thing?

17:34 <robmur01> (plus a big old pile'o'warnings from mesa about "failed to remap gl<blah>NV")

17:40 <alyssa> robmur01: current master only uses AFBC for internal textures/fbos, anything shared will still be linear (or u-interleaved tiled)

17:42 <alyssa> kmscube wfm

17:42 <alyssa> maybe need to force one of `-D /dev/dri/card{0, 1}`?

17:46 <robmur01> nope, definitely the right device - it goes through all the normal blurb up to "using modifier fff...f" before failing

17:46 <alyssa> :|

17:46 <robmur01> I wonder what else could be different on my system (Arch)... libdrm perhaps?

17:55 kaspter has quit [Quit: kaspter]

18:13 narmstrong_ has joined #panfrost

18:13 nhp_ has joined #panfrost

18:20 Ke has quit [*.net *.split]

18:20 narmstrong has quit [*.net *.split]

18:20 cyrozap has quit [*.net *.split]

18:20 nhp has quit [*.net *.split]

18:20 narmstrong_ is now known as narmstrong

18:21 clementp[m] has quit [Remote host closed the connection]

18:21 l-as has quit [Read error: Connection reset by peer]

18:21 nhp[m] has quit [Write error: Connection reset by peer]

18:29 nhp[m] has joined #panfrost

18:45 l-as has joined #panfrost

18:45 Ke has joined #panfrost

18:45 clementp[m] has joined #panfrost

19:50 davidlt has quit [Ping timeout: 240 seconds]

20:08 raster has quit [Quit: Gettin' stinky!]

20:11 buzzmarshall has joined #panfrost

21:43 unoccupied has quit [Quit: WeeChat 2.8]

22:01 raster has joined #panfrost

23:05 nlhowell has quit [Ping timeout: 265 seconds]

23:39 * alyssa poking at invalidate_resource

23:39 <alyssa> breaks webgl somehow

23:42 <alyssa> invalidate_resource getting called but we're having to wallpaper anyway

23:42 <alyssa> Probably some race between our batch tracking and global gallium, um

23:53 raster has quit [Quit: Gettin' stinky!]