alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
raster has quit [Quit: Gettin' stinky!]
stikonas has quit [Remote host closed the connection]
kaspter has joined #panfrost
tgall_fo_ is now known as tgall-foo
tgall-foo is now known as tgall_foo
kaspter has quit [Ping timeout: 246 seconds]
kaspter has joined #panfrost
icecream95 has joined #panfrost
davidlt has quit [Ping timeout: 264 seconds]
guillaume_g has joined #panfrost
raster has joined #panfrost
karolherbst has quit [Quit: duh 🐧]
karolherbst has joined #panfrost
karolherbst has quit [Client Quit]
karolherbst has joined #panfrost
Elpaulo has quit [Read error: Connection reset by peer]
Elpaulo has joined #panfrost
kaspter has quit [Ping timeout: 265 seconds]
kaspter has joined #panfrost
andrey-konovalov has joined #panfrost
<warpme_> guys - just a quick Q regarding current mesa master on g31: it has been non-functional for some time. xorg says: (EE) modeset(0): Failed to initialize glamor at ScreenInit() time. Is this because it's WIP, or rather a regression?
<tomeu> warpme_: I think it should work with PAN_MESA_DEBUG=bifrost?
stikonas has joined #panfrost
ezequielg has quit [Ping timeout: 260 seconds]
ezequielg has joined #panfrost
kaspter has quit [Ping timeout: 240 seconds]
kaspter has joined #panfrost
davidlt has joined #panfrost
davidlt has quit [Read error: Connection reset by peer]
davidlt has joined #panfrost
<warpme_> tomeu: of course I have PAN_MESA_DEBUG=bifrost in the env. bifrost has been non-working for a few weeks now. The test I do is simply recompiling different versions of the mesa sources; tested today with current master....
<tomeu> ah, ok, maybe you could bisect it?
<tomeu> hope we'll have bifrost in CI soon
<warpme_> tomeu: i can play with this, but probably in the next few days, as I'm now trying to nail a 5.8 kernel regression causing non-booting amlogic sm1 :-(
<macc24> warpme_: does glxgears work on g31?
<warpme_> nope. I can't get Xorg working. glamor can't initialise....
<macc24> hmm
<macc24> i have no idea
<warpme_> issue started around 3 weeks ago...
<macc24> have you tried with wayland compositor?
<warpme_> no as I don't use wayland (yet)
yann has joined #panfrost
Lyude has quit [Ping timeout: 246 seconds]
ente has quit [Remote host closed the connection]
nlhowell has quit [Ping timeout: 240 seconds]
ente has joined #panfrost
raster has quit [Remote host closed the connection]
icecream95 has quit [Quit: leaving]
<alyssa> So I spent 3am on Saturday thinking about cache maintenance
<alyssa> Maybe my subconscious is telling me something about Panfrost performance...
<alyssa> ("Did you have any eureka moments?" "Uhhhh... it was 3am.")
Lyude has joined #panfrost
robmur01_ is now known as robmur01
<alyssa> Basically trying to figure out how we manage to have incredible CPU overhead from memory management, and still OOM on the regular ;-;
indy has quit [Read error: Connection reset by peer]
<robmur01> we shouldn't be doing much cache maintenance CPU-side, since we remap everything as non-cacheable anyway
<alyssa> robmur01: buffer object cache, I mean
<alyssa> We've evolved some... interesting mechanisms here
<robmur01> well that's just unfairly misleading :P
<alyssa> First we created BOs on demand and freed them, and everything was simple, and super slow since BO creation is so slow.
<alyssa> So we added a BO cache to keep freed objects around in userspace.
<alyssa> Except then that ate exorbitant amounts of memory and led to frequent OOMs.
<alyssa> So we added madvise() to the kernel.
<alyssa> So now instead of a quick OOM freeze, the machine just locks up and gets insanely slow as there's a back-and-forth between userspace repopulating the cache and the kernel freeing it.
<alyssa> Meanwhile this means for each per-frame BO, we have 5 ioctls
<alyssa> wait_bo, madvise, mmap, munmap, madvise
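(A minimal sketch of that five-call round-trip against the panfrost UAPI, for reference. The ioctls, structs, and madvise modes are from panfrost_drm.h; the frame_bo_fetch/frame_bo_release helper names, the header path, and the absence of error handling are illustrative only, not Mesa's actual BO-cache code.)

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <drm/panfrost_drm.h>

    /* Fetch a cached BO for this frame: wait for the GPU to be done with
     * it, madvise it back to WILLNEED so the kernel stops treating it as
     * reclaimable, then map it for CPU access. */
    static void *
    frame_bo_fetch(int fd, uint32_t handle, size_t size)
    {
        struct drm_panfrost_wait_bo wait = {
            .handle = handle,
            .timeout_ns = INT64_MAX,
        };
        ioctl(fd, DRM_IOCTL_PANFROST_WAIT_BO, &wait);      /* 1: wait_bo */

        struct drm_panfrost_madvise madv = {
            .handle = handle,
            .madv = PANFROST_MADV_WILLNEED,
        };
        ioctl(fd, DRM_IOCTL_PANFROST_MADVISE, &madv);      /* 2: madvise */
        if (!madv.retained)
            return NULL;  /* kernel already purged the pages */

        struct drm_panfrost_mmap_bo mb = { .handle = handle };
        ioctl(fd, DRM_IOCTL_PANFROST_MMAP_BO, &mb);
        return mmap(NULL, size, PROT_READ | PROT_WRITE,    /* 3: mmap */
                    MAP_SHARED, fd, mb.offset);
    }

    /* Return the BO to the userspace cache: unmap, then mark DONTNEED so
     * the kernel may reclaim the pages under memory pressure. */
    static void
    frame_bo_release(int fd, uint32_t handle, void *cpu, size_t size)
    {
        munmap(cpu, size);                                 /* 4: munmap */

        struct drm_panfrost_madvise madv = {
            .handle = handle,
            .madv = PANFROST_MADV_DONTNEED,
        };
        ioctl(fd, DRM_IOCTL_PANFROST_MADVISE, &madv);      /* 5: madvise */
    }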
indy has joined #panfrost
<alyssa> So allocating a BO is slow again, which matters when we need to allocate memory every frame for e.g. job structures
<alyssa> (Or possibly worse - main memory backing varyings, which is GPU R/W but CPU invisible but still needs us to manage.)
<alyssa> IIRC kbase played some "interesting" games with the cache to optimize that. Minimally we should skip the mmap/munmap there.
<robmur01> is part of the problem that BOs are not necessarily consistent, such that you can have loads sat in the cache, none of which are quite the right size for the thing you need right now?
<warpme_> alyssa: yeah. such eureka moments are usually a subconscious thing. I've had them many times. usually: I have a heavy problem to solve. days/weeks without a solution. so I quit for some time. and usually at a very unexpected moment I have a flash of thought. bingo. it's solved! at the brain level it is well explained: even when I quit thinking on the subject, the subconscious is not suspended and still "works" on the problem.
<alyssa> robmur01: Possibly? It's just a lot of memory in either case. All the job structures are small but add up quickly with thousands of draws per second.
<alyssa> Varying memory is large since that's proportional to vertex count (*after* instancing).
<alyssa> warpme_: Indeed.
raster has joined #panfrost
<robmur01> I wonder if it's worth being more selective about what gets cached, i.e. favour fixed-size descriptors that will definitely be reused quickly, but maybe don't bother hanging on to flexible-sized things
<alyssa> robmur01: oh and those heinous CoW faults -.-
<alyssa> endrift: "mmapped but not read in from disk" not sure how that fits in a gfx stack, not sure I want to know
<alyssa> robmur01: Perhaps.
<alyssa> following the sysprof further, it's routing to drm_gem_shmem_fault/vm_insert_page
<alyssa> Oh, does that mean it's mmaped() but not actually mapped yet?
<robmur01> probably - that's likely to be the "read in from disk" part for things that aren't 'real' files
<robmur01> see MAP_POPULATE (I think)
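(For context: MAP_POPULATE asks the kernel to prefault the whole mapping at mmap() time instead of taking a per-page fault on first touch. A one-line sketch, with the flag being the only change from the mapping in the earlier sketch:)

    /* Prefault up front so first-touch faults
     * (drm_gem_shmem_fault -> vm_insert_page) don't land mid-frame. */
    void *cpu = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_POPULATE, fd, mb.offset);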
<alyssa> robmur01: gets back to my point that munmapping/madvising things in cache has higher CPU overhead than we thought
<alyssa> and doesn't actually solve OOMs in practice, only slows them down
<alyssa> better question perhaps is why we're prone to OOMs in the first place, 4GB of memory should be plenty
<robmur01> are we talking kernel task-killing OOMs, or just failure to allocate new buffers?
<robmur01> the latter is more likely CMA exhaustion than 'real' OOM
<alyssa> all of the above
<alyssa> system becoming super slow as madvise reclaims memory and userspace fights for it back, userspace winning out and freezing the system requiring a hard reboot
<alyssa> I guess what I don't understand is where all this memory usage is coming from. Are we leaking terribly?
<alyssa> That's probably the first question.
<warpme_> dears - regarding the non-working current mesa master on g31, bisecting:
kaspter has quit [Quit: kaspter]
<alyssa> warpme_: Ah :|
<alyssa> chrisf: ^ saw something like that IIRC
<warpme_> alyssa: :-p
<alyssa> warpme_: hm?
<robmur01> I guess the upshot of 96fa8d70 is that we end up exposing fewer capabilities for Bifrost regardless of the debug flags
<robmur01> so I suppose either the unsupported features were "working" in the sense of not blowing up horribly, or the client doesn't actually use them but still throws a tantrum about them not being present.
kaspter has joined #panfrost
<alyssa> ^^ that
<bbrezillon> alyssa: regarding your OOM issue, did you try disabling the BO cache?
<bbrezillon> just to see if the memory consumption keeps growing without it
<alyssa> bbrezillon: it's not one OOM issue, it's just something that happens a lot while dogfooding panfrost
<alyssa> no easy repro other than "use the machine for 8 hours"
<alyssa> or in the case of the 3gb veyron, "open GNOME, Zoom, Chromium, and Firefox at the same time" :p
<urjaman> 3GB?
<alyssa> i thought so?
<urjaman> there are 2GB and 4GB models
<alyssa> somewhere in between then ;p
<bbrezillon> alyssa: yes, but I'm wondering if disabling the cache wouldn't help reproducing the problem more quickly
<alyssa> urjaman: /proc/meminfo says 2GB
<alyssa> so that explains that issue :p
kaspter has quit [Quit: kaspter]
<urjaman> "MemTotal: 4032536 kB" for the 4GB model (i mean yeah it isnt quite the full 4GB which isnt surprising on a 32bit SoC, needs to have the registers somewhere)
<alyssa> nods
<alyssa> The allocation strategy for varyings is really not clear to me either.
<alyssa> Maybe having a separate invisible pool would do it.
<tomeu> alyssa: I thought that the slow part of BO creation was the mmap
<alyssa> tomeu: it's both mmap and create
<alyssa> bo cache only helps with the latter, and adds overhead in the form of madvise and wait
<tomeu> hrm, what could be slow in create besides the mmap?
<alyssa> er, by mmap I mean CPU-side mmap()
<alyssa> I guess by create I mean GPU-side mmap, but that's not mmap() from userspace perspective
nlhowell has joined #panfrost
<tomeu> ah yeah
<tomeu> wonder why the GPU-side mmap needs to be so slow
<tomeu> robmur01: do you know?
<alyssa> regardless it is, hence the BO cache
<tomeu> well, but what if we could make it fast enough to not need a BO cache? :)
<alyssa> we'd still have old kernels to worry about ;)
<robmur01> hmm, mapping stuff into the GPU pagetables shouldn't be slow, there's very little to it
<alyssa> anyways, if I split to two pools, one CPU-accessible, one GPU-only, that helps a bit
<alyssa> since that saves the mmap/munmap on the GPU-only side.
<alyssa> (which varyings are)
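(A sketch of what that GPU-only path could look like: a single CREATE_BO ioctl whose returned offset is the GPU VA, with no MMAP_BO/mmap/munmap at all. Struct and flag names are from panfrost_drm.h; the helper name is made up for illustration.)

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <drm/panfrost_drm.h>

    /* Allocate a CPU-invisible BO (e.g. varying storage): the CPU never
     * maps it, so the whole mmap()/munmap() pair disappears. */
    static uint64_t
    alloc_invisible_bo(int fd, uint32_t size, uint32_t *handle)
    {
        struct drm_panfrost_create_bo create = {
            .size = size,
            .flags = PANFROST_BO_NOEXEC,
        };
        if (ioctl(fd, DRM_IOCTL_PANFROST_CREATE_BO, &create))
            return 0;
        *handle = create.handle;
        return create.offset;  /* GPU VA of the new BO */
    }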
<tomeu> ah, cool
<alyssa> anyways, if the BO cache is disabled, we spend a huge amount of time in panfrost_mmu_map
<alyssa> a big chunk of which is in shmem_getpage_gfp
marcodiego has joined #panfrost
<alyssa> drm_gem_get_pages is just expensive ig
<tomeu> maybe we need a new data structure for the pages
<robmur01> shmem_getpage_gfp()... there's rather a lot of that function. I hope we're not making a separate call for each individual 4KB page :(
<alyssa> Interesting, bo_cache_fetch itself has some weirdly high overhead
<alyssa> perf blames a load instruction in the linked list iteration
<alyssa> guess we're thrashing the cache
<alyssa> Might be worth trying out util_sparse_array_free_list instead
<alyssa> (I know I went thru this exercise a few months ago..)
raster has quit [Quit: Gettin' stinky!]
nlhowell has quit [Ping timeout: 264 seconds]
<nhp[m]> So, not a strictly panfrost related question, but is it possible to have mesa compile a shader for you ahead of time for an arbitrary target?
davidlt has quit [Ping timeout: 256 seconds]
raster has joined #panfrost
ente has quit [Remote host closed the connection]
ente has joined #panfrost
<tomeu> nhp[m]: guess you want to use glShaderBinary ?
<tomeu> I expect at least all drivers that support OpenGL 4.1 to also support it
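(For context, glShaderBinary consumes a driver-specific precompiled blob. A minimal sketch, assuming the driver advertises a matching format under GL_SHADER_BINARY_FORMATS and the blob came from an offline compile; it presupposes a driver on the target that can consume the blob:)

    #include <GLES2/gl2.h>

    /* Create a shader object from a precompiled binary instead of
     * compiling GLSL at runtime. */
    static GLuint
    load_prebuilt_shader(GLenum stage, GLenum binary_format,
                         const void *blob, GLsizei blob_len)
    {
        GLuint shader = glCreateShader(stage);
        glShaderBinary(1, &shader, binary_format, blob, blob_len);
        return shader;
    }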
<nhp[m]> issue is that I don't have a PC with the actual hardware, what I'm trying to do is build shaders for an old AMD GPU used in the Wii U
BenG83 has joined #panfrost
guillaume_g has quit [Quit: Konversation terminated!]
<chrisf> nhp[m]: are you running mesa on that device?
<nhp[m]> No, it has its own graphics API and no ability to compile shaders at runtime, so they must be compiled ahead of time
<nhp[m]> there's a very barebones shader compiler for it that uses AMD's shaderanalyzer, but it chokes on anything nontrivial, so I was wondering if it might be possible to have mesa do it
<chrisf> theoretically, sure; but it's going to take some hacking on your part, and you shouldn't assume that their graphics API has made the same decisions as mesa's r600 driver where there are multiple ways to do things
davidlt has joined #panfrost
kherbst has joined #panfrost
<alyssa> nhp[m]: ^^ that
karolherbst has quit [Disconnected by services]
kherbst is now known as karolherbst
<alyssa> there's an offline shader compiler API ("standalone") in mesa; panfrost implements it
<alyssa> so you could implement that for AMD, but a lot of shader compiler decisions are arbitrary and depend on decisions on the command stream side
<alyssa> given the drivers on the other side are proprietary.... :(
gcl_ has joined #panfrost
<nhp[m]> Yeah...it's probably not worth the effort :(
gcl is now known as Guest66166
gcl_ is now known as gcl
<alyssa> if you could boot linux and use mesa otoh... :~)
<nhp[m]> There's a Linux port for it actually, no GPU driver yet though :(
<alyssa> Now that sounds like a fun project ;)
<alyssa> #dri-devel is that way, enjoy
Guest66166 has quit [Ping timeout: 240 seconds]
<nhp[m]> heh, sorry for being totally offtopic here, just figured someone here might be able to point me in the right direction :)
<alyssa> and we did ;)
<alyssa> c'mon, it'll be fun
<alyssa> :P
<nhp[m]> Heh, even rendering things with GX2 is enough suffering for me already lol
<nhp[m]> But still thanks for the help :)
<alyssa> ;)
tomboy65 has quit [Remote host closed the connection]
remexre has joined #panfrost
raster has quit [Quit: Gettin' stinky!]
raster has joined #panfrost
tomboy65 has joined #panfrost
raster has quit [Quit: Gettin' stinky!]
BenG83 has quit [Ping timeout: 240 seconds]
raster has joined #panfrost
davidlt has quit [Ping timeout: 256 seconds]
nlhowell has joined #panfrost
nlhowell has quit [Ping timeout: 256 seconds]
<endrift> alyssa: point taken
<alyssa> line taken
anarsoul has quit [Read error: Connection reset by peer]
anarsoul has joined #panfrost
raster has quit [Quit: Gettin' stinky!]
megi has joined #panfrost
buzzmarshall has joined #panfrost
raster has joined #panfrost
nlhowell has joined #panfrost
unoccupied has quit [Ping timeout: 256 seconds]
TheMojoMan has joined #panfrost
TheMojoMan has quit [Client Quit]
raster has quit [Quit: Gettin' stinky!]