alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
uis has joined #panfrost
raster has joined #panfrost
uis has quit [Quit: ZNC 1.7.4 - https://znc.in]
uis has joined #panfrost
<macc24> i think the real panfrost are the friends we made along the way
stikonas has quit [Remote host closed the connection]
uis has quit [Quit: ZNC 1.7.4 - https://znc.in]
uis has joined #panfrost
<alyssa> macc24: ?
<urjaman> <3
<alyssa> icecream95: You're the real Panfrost, apparently.
raster has quit [Quit: Gettin' stinky!]
jschwart has quit [Ping timeout: 260 seconds]
<icecream95> alyssa: I guess you're the real Panfrost too?
jschwart has joined #panfrost
<HdkR> hmmm
* alyssa wonders why GNOME's "Night Light" is broken on Panfrost+Rockchip
<macc24> alyssa: wayland?
<alyssa> Yeah
<macc24> lemme check on mediatek
<icecream95> gammastep is working here
<icecream95> It also worked on RK3288
<icecream95> (I can't check with Gnome, because I no longer have a working install)
<macc24> after i install gnome...
<icecream95> alyssa: If I grep drivers/gpu/drm/rockchip/rockchip_vop_reg.c for lut_size, only RK3288 has it
<alyssa> Boo :(
<macc24> icecream95: what's lut_size?
<cphealy> I just noticed that Weston recently made support for GL_EXT_unpack_subimage a requirement instead of optional: https://gitlab.freedesktop.org/wayland/weston/-/commit/593d5af43a8e2c2a3371088fa7ae430d0517c82d
<HdkR> Looks like that extension is unconditionally supported in mesa
<cphealy> When I look at the Panfrost driver, I see that this extension is supported. When I look at the ARM provided Mali driver, I've not found any instances where this extension is supported. Can anyone provide any insight into why the ARM provided Mali driver does not support this and why Panfrost/Mesa does?
<HdkR> Mali blob historically avoids adding features
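(For context: GL_EXT_unpack_subimage lifts core ES 2.0's requirement that glTexSubImage2D read tightly-packed rows, enabling the GL_UNPACK_ROW_LENGTH/SKIP_ROWS/SKIP_PIXELS pixel-store parameters; Weston uses it to upload only the damaged sub-rectangle of a wl_shm buffer. A minimal illustrative sketch, not Weston's actual code:)

    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>

    /* Upload a w*h sub-rectangle at (x, y) out of a larger client buffer
     * whose rows are 'stride_px' pixels wide (an assumed parameter name).
     * Core ES 2.0 would require repacking the rows on the CPU first. */
    static void
    upload_subrect(const void *pixels, GLint stride_px,
                   GLint x, GLint y, GLsizei w, GLsizei h)
    {
        glPixelStorei(GL_UNPACK_ROW_LENGTH_EXT, stride_px);
        glPixelStorei(GL_UNPACK_SKIP_PIXELS_EXT, x);
        glPixelStorei(GL_UNPACK_SKIP_ROWS_EXT, y);
        glTexSubImage2D(GL_TEXTURE_2D, 0, x, y, w, h,
                        GL_RGBA, GL_UNSIGNED_BYTE, pixels);
        glPixelStorei(GL_UNPACK_ROW_LENGTH_EXT, 0);
        glPixelStorei(GL_UNPACK_SKIP_PIXELS_EXT, 0);
        glPixelStorei(GL_UNPACK_SKIP_ROWS_EXT, 0);
    }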
<macc24> it's almost like the mali blob avoids being actually useful for anything other than tracing and rendering chromeos
<cphealy> haha, got it.
<alyssa> It can render Android too
<cphealy> HdkR, macc24: are there other notable examples of features that the Mali driver has avoided adding?
<HdkR> dual source blending is my goto :P
<macc24> cphealy: did you just think that i know anything about mali blob beyond "blob bad panfrost good"?
<cphealy> As I found recently, the ARM Mali driver for Linux/Wayland does not support "EGL_ANDROID_native_fence_sync" while the Android version of the ARM Mali driver does.
<alyssa> ANDROID
<HdkR> yea, android
atler has quit [Killed (orwell.freenode.net (Nickname regained by services))]
atler has joined #panfrost
<HdkR> Pretty sure mesa only exposes one ANDROID extension as a quirk
<macc24> GL_ANDROID_extension_pack_es31a
<HdkR> yep
<cphealy> I thought Panfrost exposed "EGL_ANDROID_native_fence_sync". Am I mistaken?
<HdkR> Because it was good to support that extension when there wasn't ES 3.2 available...
<HdkR> Panfrost should end up having it in an Android environment, but not X/Wayland
<alyssa> iirc weston wants it
vstehle has quit [Ping timeout: 256 seconds]
<HdkR> oh neat
<cphealy> Yes, Weston wants "EGL_ANDROID_native_fence_sync".
<macc24> ANDROID yields no hits in kmscube log when running on G72
<macc24> weston works fine though
<cphealy> So, without "EGL_ANDROID_native_fence_sync", you get the following warning and associated change in behaviour: "warning: Disabling explicit synchronization due to missing EGL_KHR_wait_sync extension\n"
<macc24> yep it's there
<macc24> on g72
<HdkR> wants versus has problems I guess
<cphealy> daniels on #wayland had the following to say about it: "yes, EGL_ANDROID_native_fence_sync is required, because without that we can't get a dma-fence FD to poll on, only an opaque EGLSyncKHR object which we have to constantly query EGL about; those aren't portable between contexts, so we can't pass them between client<->compositor (as well as KMS) either"
<cphealy> If it's there with Panfrost, we should be fine though and that warning should not show up when starting weston.
<cphealy> It's just an issue with the blob driver...
<daniels> Panfrost does expose it, as do all vaguely modern Mesa drivers
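(For reference, the dma-fence fd daniels describes comes from pairing EGL_KHR_fence_sync with EGL_ANDROID_native_fence_sync: create a native-fence sync object, flush, then dup out a pollable fd. A hedged sketch; error handling and eglDestroySyncKHR are elided:)

    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <GLES2/gl2.h>

    /* Export the GPU work submitted so far as a pollable dma-fence fd,
     * which, unlike an opaque EGLSyncKHR, can be passed between
     * client <-> compositor (and on to KMS). */
    static int
    export_fence_fd(EGLDisplay dpy)
    {
        PFNEGLCREATESYNCKHRPROC create_sync =
            (PFNEGLCREATESYNCKHRPROC)eglGetProcAddress("eglCreateSyncKHR");
        PFNEGLDUPNATIVEFENCEFDANDROIDPROC dup_fd =
            (PFNEGLDUPNATIVEFENCEFDANDROIDPROC)
                eglGetProcAddress("eglDupNativeFenceFDANDROID");

        EGLSyncKHR sync =
            create_sync(dpy, EGL_SYNC_NATIVE_FENCE_ANDROID, NULL);
        glFlush(); /* the fd only becomes valid once the fence is flushed */
        return dup_fd(dpy, sync); /* a poll()able fd, or -1 on failure */
    }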
* alyssa thinks we're a CAP short
<alyssa> Passed: 34/36 (94.4%)
<alyssa> that's totally good enough right???
<macc24> alyssa: maybe try again and it will fix itself?
<alyssa> :p
<macc24> oh, and gnome is more usable on duet when gpu is set to 800mhz
<macc24> therefore it will fix itself and there will be some actual user interface that is tablet friendly :D
<alyssa> sway is tablet friendly
<alyssa> just picky about its friends
* macc24 notices number of loc changed in panfrost after git pull
<alyssa> ?
<macc24> i sure hope that there are no regressions
<alyssa> thanks for volunteering to debug
<macc24> uhhh
<macc24> did bifrost lose gl3.1?
<alyssa> shouldn't've but maybe accidentally
<macc24> then it did
<macc24> nvm was looking at es
<macc24> woo, chromium works on my chromebook
<alyssa> icecream95: re tiler faults, wonder if sharing the tiler heap across batches is the problem
<alyssa> if the tiler heap needs to be preserved for the fragment job, that's racy
<icecream95> alyssa: The faults happen even with PAN_MESA_DEBUG=sync
<anarsoul> alyssa: again, not sure if it's relevant for midgard/bifrost, but we have to preserve tiler heap for fragment job on utgard
karolherbst has quit [Ping timeout: 272 seconds]
<alyssa> :V
<alyssa> icecream95: bifrost only or also midg?
<alyssa> bifrost tiler heap start/free/end ptr management is hard for me to understand, maybe it's broken
<alyssa> bbrezillon: ^^
<anarsoul> alyssa: do you have debug flag to serialize jobs?
<anarsoul> if yes, it's worth trying to reproduce the bug with this flag set
<anarsoul> icecream95: ^^
<alyssa> =sync
kaspter has joined #panfrost
<macc24> umm
<macc24> my cursor disappeared
<macc24> only in "normal mode"
<macc24> after playing some video games that hide the cursor in chromium
<macc24> icecream95: can you reproduce this? happened in firefox randomly too
<macc24> oh god
<macc24> now i'm seeing something that i thought i'd never see on an arm device
<macc24> this is xonotic on high settings on duet, playable https://i.imgur.com/93kX9Kl.jpg
<macc24> scaling is at 1.3
camus has joined #panfrost
kaspter has quit [Ping timeout: 240 seconds]
camus is now known as kaspter
archetech has quit [Quit: Konversation terminated!]
davidlt_ has joined #panfrost
camus has joined #panfrost
kaspter has quit [Ping timeout: 264 seconds]
camus is now known as kaspter
chewitt has quit [Quit: Adios!]
vstehle has joined #panfrost
davidlt_ has quit [Ping timeout: 264 seconds]
davidlt has joined #panfrost
davidlt_ has joined #panfrost
davidlt has quit [Ping timeout: 246 seconds]
daniels has quit [Ping timeout: 260 seconds]
daniels has joined #panfrost
robher has quit [Ping timeout: 260 seconds]
narmstrong has quit [Ping timeout: 260 seconds]
robher has joined #panfrost
narmstrong has joined #panfrost
<bbrezillon> alyssa: heap.{base,size} are set to the heap BO address and size, and never changed by the GPU; the top/bottom fields are updated by the GPU when it allocates memory from the heap
<bbrezillon> so I don't think we do something wrong here
<bbrezillon> sharing the tiler heap is fine, as long as tiler job N+1 doesn't start before fragment job N, which was enforced by icecream95's patch (adding the BO to the batch should create an implicit dependency between the fragment and vertex/tiler job chains)
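(A rough sketch of the descriptor being described, using the start/free/end naming alyssa used earlier; field names are illustrative, not the exact Mesa layout:)

    #include <stdint.h>

    /* The driver writes the heap bounds once; only the free pointer is
     * advanced by the GPU as the tiler allocates polygon-list memory
     * from the heap BO. */
    struct tiler_heap_desc {
        uint32_t heap_size;        /* size of the heap BO, set by the CPU */
        uint64_t tiler_heap_start; /* heap BO GPU address, never moved    */
        uint64_t tiler_heap_free;  /* bump pointer, updated by the GPU    */
        uint64_t tiler_heap_end;   /* start + size; allocation stops here */
    };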
guillaume_g has joined #panfrost
<bbrezillon> if we want to make that explicit, we can call panfrost_add_bo(heap_bo, RW | VERTEX_TILER | FRAGMENT), but I doubt it will fix icecream95's issue
<bbrezillon> icecream95: can you share an apitrace?
<icecream95> bbrezillon: I've tried making apitraces, but replaying them doesn't reproduce the faults
<bbrezillon> :-(
camus has joined #panfrost
kaspter has quit [Read error: Connection reset by peer]
camus is now known as kaspter
kaspter has quit [Remote host closed the connection]
<bbrezillon> icecream95: can you share the kernel logs?
kaspter has joined #panfrost
_whitelogger has joined #panfrost
camus has joined #panfrost
<bbrezillon> icecream95: and 0x3E008080 is near/in the tiler heap?
kaspter has quit [Ping timeout: 264 seconds]
camus is now known as kaspter
<icecream95> bbrezillon: It's far above any allocated addresses
<bbrezillon> icecream95: which job fails (vertex, tiler or fragment)?
<bbrezillon> js=0, so it's a fragment job
<bbrezillon> icecream95: do you have a pandecode trace of the failing batch?
<icecream95> bbrezillon: plasmashell uses multiple render threads, so moving the tiler heap from being per-device to per-context would probably fix the faults
<bbrezillon> oh
<bbrezillon> that's indeed a problem I didn't think about
<icecream95> That would explain why it doesn't reproduce with apitrace
<bbrezillon> well, in theory that shouldn't be a problem
<bbrezillon> the kernel should do the right thing
<bbrezillon> we only share the tiler heap BO, not the tiler heap descriptor
<bbrezillon> right?
<bbrezillon> I mean, as long as the heap BO is passed to the job submission request, it should force the drm scheduler to serialize jobs using this heap
<bbrezillon> unless there's a race somewhere...
<icecream95> bbrezillon: It could be: context 1 writes to tiler_heap, context 2 writes to tiler_heap, context 1 reads from tiler_heap from three different jobs
nlhowell has joined #panfrost
<icecream95> Making tiler_heap per-context did seem to fix the faults
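(A minimal sketch of that fix, assuming Mesa-style names such as panfrost_bo_create and the PAN_BO_* flags; a real change would also free the heap when the context is destroyed:)

    /* Hang the tiler heap off the context instead of the device, so two
     * contexts (e.g. plasmashell's render threads) can no longer clobber
     * each other's heap between the vertex/tiler and fragment halves of
     * a frame. The size and flags are illustrative. */
    struct panfrost_context {
        /* ... */
        struct panfrost_bo *tiler_heap; /* was on struct panfrost_device */
    };

    static struct panfrost_bo *
    panfrost_ctx_tiler_heap(struct panfrost_context *ctx)
    {
        if (!ctx->tiler_heap)
            ctx->tiler_heap = panfrost_bo_create(ctx->dev, 64 * 1024 * 1024,
                                                 PAN_BO_INVISIBLE |
                                                 PAN_BO_GROWABLE);
        return ctx->tiler_heap;
    }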
<icecream95> bbrezillon: This might be unrelated, but is it supposed to be possible for jobs in different processes to interfere like this? https://gitlab.freedesktop.org/-/snippets/1525
kaspter has quit [Ping timeout: 246 seconds]
<bbrezillon> icecream95: the MMU should prevent that, but I guess a use-after-free bug could cause that
<bbrezillon> so, tiler heap is only accessible from the GPU, and jobs accessing the same BO are supposed to be serialized by the kernel driver
<bbrezillon> but there's indeed a race because vertex/tiler and fragment jobs are issued separately
<bbrezillon> got it
<bbrezillon> so tiler_job 1 from context 2 might be scheduled between tiler_job 1 and fragment job 1 from context 1
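(Spelled out, the interleaving is:)

    ctx1: vertex/tiler job 1  -> GPU writes polygon lists into tiler_heap
    ctx2: vertex/tiler job 1  -> scheduled in between, overwrites tiler_heap
    ctx1: fragment job 1      -> reads a clobbered tiler_heap -> fault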
kaspter has joined #panfrost
<bbrezillon> icecream95: BTW, I think we should have a panfrost_add_bo(tiler_heap, RW | VERTEX_TILER | FRAGMENT), just to make it clear that we really use the BO from both fragment and vertex/tiler job chains
dstzd has quit [Quit: ZNC - https://znc.in]
kaspter has quit [Ping timeout: 258 seconds]
kaspter has joined #panfrost
karolherbst has joined #panfrost
stikonas has joined #panfrost
<robmur01> amonakov: the reason I'm dismissive of "datasheets?" being repeatedly asked every couple of months is that it's already been said that there is no public documentation, there never will be, and the notion of a nice manual that explains everything doesn't even exist
<robmur01> the GPU TRMs in the customer documentation do cover the hardware registers, but there's not much more than you can already infer from kbase anyway
<robmur01> the rest is basically just a mess of internal engineering specs, particularly for Midgard
<robmur01> Regardless of anything else, Arm simply doesn't have the resources to spend on writing up nice docs for 5+ year old products that are effectively obsolete, just to satisfy the curiosity of a few people on the internet
<phh> they can just release the source code if it's obsolete ;-)
<robmur01> if you want to figure things out from source code, you're far better off looking at panfrost ;)
<urjaman> I thought phh meant midgard in whatever HDL it is written in :P
<phh> urjaman: correct
<tomeu> for most humans, guess what robmur01 said still holds true? :p
<macc24> therefore, the only way to get datasheets is to get a job at arm, steal datasheets and get fired (yes this is a joke)
wwilly has joined #panfrost
<wwilly> hi, I would like to know if it's possible to use OpenCL with Panfrost? I'm a bit new to all this driver and Mali stuff, so I may have silly questions... I'm doing research on scheduling around big.LITTLE (odroid-xu3); my stuff is working pretty well against CFS and EAS for scheduling+DVFS of CPU tasks, but I would like to introduce GPU workload, via OpenCL (for the rodinia benchmark) or OpenGL (for video games). My system is currently "stock" Debian 10, with a linux-stable-5.9.y (dirty by my stuff)
<macc24> icecream95: ^ ?
raster has joined #panfrost
<bbrezillon> icecream95, alyssa: ok, so I'll have the same problem with the varying mem pool (needed for indirect draws). The question is: should we allocate one per context (max varying mem pool size == 512MB of growable mem), or should we instead add a lock at the dev level to guarantee that vertex/tiler and fragment jobs are not intermixed?
thecycoone has quit [Ping timeout: 256 seconds]
thecycoone has joined #panfrost
Ashleee has quit [Ping timeout: 240 seconds]
davidlt_ is now known as davidlt
camus has joined #panfrost
kaspter has quit [Ping timeout: 265 seconds]
camus is now known as kaspter
tlwoerner has quit [Ping timeout: 246 seconds]
<alyssa> macc24: fired? try arrested...
<alyssa> bbrezillon: It's well established by now I don't know how to write multithreaded graphics code, don't ask me :|
kaspter has quit [Remote host closed the connection]
kaspter has joined #panfrost
<bbrezillon> alyssa: the question is more: is it okay to have 512M per context, or do we want to share it in the multi-context case
<alyssa> bbrezillon: 512M GROWABLE per context will hit undebuggable OOMs, I suspect
<bbrezillon> yeah, that's also what I think
<bbrezillon> so I guess we can fix both the tiler and varying_mem_pool issue with a lock and keep them per-device
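(A sketch of that device-level lock, with hypothetical helper names; the point is that a context's vertex/tiler + fragment pair is submitted atomically with respect to other contexts sharing the per-device pools:)

    #include <pthread.h>

    struct panfrost_device {
        /* ... */
        pthread_mutex_t submit_lock; /* guards shared tiler heap / varying pool */
    };

    static void
    panfrost_submit_batch(struct panfrost_device *dev,
                          struct panfrost_batch *batch)
    {
        /* No other context's jobs can slip in between the two halves. */
        pthread_mutex_lock(&dev->submit_lock);
        submit_vertex_tiler_jobs(batch); /* hypothetical helpers */
        submit_fragment_job(batch);
        pthread_mutex_unlock(&dev->submit_lock);
    }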
Ashleee has joined #panfrost
zkrx has quit [Ping timeout: 246 seconds]
nlhowell has quit [Ping timeout: 256 seconds]
zkrx has joined #panfrost
nlhowell has joined #panfrost
nlhowell has quit [Remote host closed the connection]
nlhowell has joined #panfrost
nlhowell has quit [Ping timeout: 258 seconds]
guillaume_g has quit [Quit: Konversation terminated!]
uis has quit [Quit: ZNC 1.7.4 - https://znc.in]
uis has joined #panfrost
uis has quit [Quit: ZNC 1.7.4 - https://znc.in]
uis has joined #panfrost
archetech has joined #panfrost
uis has quit [Quit: ZNC 1.7.4 - https://znc.in]
uis has joined #panfrost
* robmur01 ponders why enabling AFBC in Mesa would reduce CPU cache misses/evictions...
<macc24> am i allowed to complain about dsi panels being broken here?
<icecream95> robmur01: When AFBC is enabled, texture uploads are done on the GPU, otherwise they are done on the CPU
<robmur01> icecream95: aha, that will probably do it indeed
<robmur01> S922 coherency is the new RK3399 voltage scaling...
<robmur01> except this time backporting the patches *is* a realistic proposition
<macc24> robmur01: huh?
<alyssa> robmur01: rip
kherbst has joined #panfrost
karolherbst has quit [Disconnected by services]
kherbst is now known as karolherbst
<amonakov> robmur01: if no datasheets, and no ISA manuals, can we at least get text assembly from the Mali shader compiler? (seeing how it's based on LLVM, the necessary internals should already be there)
<HdkR> When writing an LLVM backend you don't actually need to wire up the disassembler bits
<alyssa> Mesa has that wired up for Bifrost at least
<alyssa> (compatible with the Mali shader compiler, with more or less 'official' assembly syntax)
<HdkR> Symbol names from the Mali driver also seem to imply that ARM never hooked up the LLVM disassembly bits
<amonakov> OTOH libraries shipped with malisc do have instruction names showing up in `strings`
<HdkR> tablegen will end up doing that
<HdkR> Gives you all the names for the MIR
<HdkR> Allows you to do an IR dump late in the pipe and get all the MIR names
<robmur01> macc24: as in, a kernel thing that leads people to keep reporting Mesa bugs, and that we keep forgetting about because it's already sorted in mainline and CI
<robmur01> amonakov: no idea, you'd have to ask the people responsible for that (although I think I can guess at the answer...) - I'm just an OSS kernel guy ;)
davidlt has quit [Ping timeout: 256 seconds]
<amonakov> HdkR: thanks for the elaboration; yeah, if not full-fledged asm, access to LLVM IR would be nice
uis has quit [Quit: ZNC 1.7.4 - https://znc.in]
uis has joined #panfrost
zkrx has quit [Ping timeout: 264 seconds]
clementp[m] has quit [*.net *.split]
zkrx has joined #panfrost
clementp[m] has joined #panfrost
<amonakov> bbrezillon: long division for split_div without 64-bit arithmetic: http://sprunge.us/HdJHeK
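(The paste itself isn't reproduced here; a generic sketch of the technique, dividing a 64-bit value given as two 32-bit halves using only 32-bit operations, which may differ from the actual split_div patch:)

    #include <stdint.h>

    /* Bit-at-a-time long division of (hi:lo) by d using only 32-bit
     * arithmetic. Assumes hi < d, so the quotient fits in 32 bits. */
    static uint32_t
    udiv_64_by_32(uint32_t hi, uint32_t lo, uint32_t d, uint32_t *rem)
    {
        uint32_t q = 0, r = hi;

        for (int i = 31; i >= 0; i--) {
            /* A bit shifted off the top of r would make it >= 2^32 > d,
             * so it forces a subtraction even though it isn't stored. */
            uint32_t carry = r >> 31;
            r = (r << 1) | ((lo >> i) & 1);
            if (carry || r >= d) {
                r -= d;
                q |= 1u << i;
            }
        }

        if (rem)
            *rem = r;
        return q;
    }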
archetech has quit [Quit: Konversation terminated!]
uis has quit [Quit: ZNC 1.7.4 - https://znc.in]
uis has joined #panfrost
tlwoerner has joined #panfrost
zkrx has quit [Ping timeout: 264 seconds]
raster has quit [Quit: Gettin' stinky!]
rak-zero has quit [Ping timeout: 240 seconds]
zkrx has joined #panfrost
karolherbst has quit [Quit: duh 🐧]
karolherbst has joined #panfrost
macc24 has quit [Ping timeout: 246 seconds]
karolherbst has quit [Ping timeout: 264 seconds]
rak-zero has joined #panfrost
rak-zero has quit [Quit: ZNC 1.8.2 - https://znc.in]