<alyssa>
italove: looking good on the disasm, I think if you cleanup the series (squashing so everything is bisectable, I mean -- no need to write expansive commit messages or anything) we're pretty close to landable :)
<alyssa>
though now that iris/fd do it, it shouldn't be so bad
<bbrezillon>
stepri01: may I ask you a few questions about the NO_IMPLICIT/NO_FENCE case?
<stepri01>
bbrezillon: sure
<bbrezillon>
I guess the idea is to limit the number of fences to wait on to a single fence, so the scheduler gets to schedule our job as soon as this fence is signaled
<bbrezillon>
is that correct?
<stepri01>
there's no need to limit the number of fences as such. It's more about letting user space control the fences so you don't have unnecessary fences
<stepri01>
Ultimately user space (usually) has a good idea what the actual dependencies are between jobs, so letting it encode that rather than trying to deduce it from the implicit fences can be beneficial
<stepri01>
e.g. you don't need to have implicit fences on things user space knows are effectively immutable - so we can save time by not processing those fences
<stepri01>
equally there are complex situations such as sub-buffer accesses which user space can optimise by fencing appropriately, whereas the kernel doesn't know how they might or might not conflict
<stepri01>
the blob/kbase mostly use the explicit fencing approach for the above reasons, only using implicit fencing when necessary, e.g. because a buffer is imported
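For reference, submitting with explicit syncobj dependencies through the existing Panfrost UAPI looks roughly like the sketch below. This is only a minimal illustration, not the proposed new submit interface; the NO_IMPLICIT/NO_FENCE flag being discussed is deliberately left out since its final name and placement aren't settled.

    /* Minimal sketch: a Panfrost job submission that expresses its
     * dependencies as explicit syncobjs via the existing UAPI.
     * "panfrost_drm.h" is the Panfrost uapi header; the include path
     * depends on your build setup.
     */
    #include <stdint.h>
    #include <xf86drm.h>
    #include "panfrost_drm.h"

    static int submit_with_explicit_fences(int fd, uint64_t jc,
                                           const uint32_t *bo_handles, uint32_t bo_count,
                                           const uint32_t *in_syncobjs, uint32_t in_count,
                                           uint32_t out_syncobj)
    {
            struct drm_panfrost_submit submit = {
                    .jc = jc,                           /* GPU address of the job chain */
                    .bo_handles = (uintptr_t)bo_handles,
                    .bo_handle_count = bo_count,
                    .in_syncs = (uintptr_t)in_syncobjs, /* explicit dependencies to wait on */
                    .in_sync_count = in_count,
                    .out_sync = out_syncobj,            /* signalled when the job completes */
                    .requirements = 0,
            };

            return drmIoctl(fd, DRM_IOCTL_PANFROST_SUBMIT, &submit);
    }

User space then chains the out_sync of one job into the in_syncs of the next, encoding only the dependencies it actually knows about instead of whatever the implicit fences on the BOs would imply.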
<bbrezillon>
ok, I got the sub-buffer case, but I don't see what the front-buffer update case you mentioned in your reply is about
<stepri01>
so in the normal double (or more) buffering case it makes sense for the display driver to hold a (shared) lock on the buffer that's being scanned out. That allows you to schedule a buffer swap and immediately send the kernel GPU work which would render to what was (and for a while still will be) the front buffer
<stepri01>
The GPU work will block until the display driver releases the lock when flipping to a back buffer, unblocking the GPU and allowing the rendering to happen straight away
<stepri01>
Clearly this falls down if for whatever reason you then want to actually render to the displayed buffer. Either you need to have a way of reconfiguring the display driver not to hold the lock (i.e. fence) or you need to convince the GPU driver to ignore the fence
<stepri01>
Usually you can get away with using a shared access on the GPU (even though you are actually writing), but I seem to remember there are corner cases even with that
<bbrezillon>
ok, but how does NO_IMPLICIT simplify/optimize this case? I mean, I'd expect it to work similarly with the implicit fences: the GPU job will be blocked until the display controller signals the front-buffer fence
<bbrezillon>
the only difference being the number of fences to test
<stepri01>
I think there are two things. First there is overhead juggling the unnecessary fences in the kernel - whether that's measurable I don't know.
<stepri01>
Secondly you need to be able to use shared fences (a problem with the current Panfrost kernel) and you need to ensure that any other drivers you are working with also support shared fences
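For the shared-fence point, a rough kernel-side sketch follows, using the dma_resv helpers as they existed around this time (they have since been replaced by dma_resv_add_fence() with a usage argument). The write_as_shared flag is a made-up stand-in for whatever the new UAPI would let user space request.

    /* Rough sketch only: attach a job's done-fence to a BO's reservation
     * object, either as the exclusive fence (normal write) or as a shared
     * fence (reads, or front-buffer-style writes that must not serialise
     * against the display's shared fence). Caller holds the resv lock.
     */
    #include <linux/dma-resv.h>
    #include <drm/drm_gem.h>

    static int attach_job_fence(struct drm_gem_object *obj,
                                struct dma_fence *done_fence,
                                bool writes, bool write_as_shared)
    {
            int ret;

            if (writes && !write_as_shared) {
                    /* Normal write: later users (display, readers) must wait
                     * on this job before touching the buffer. */
                    dma_resv_add_excl_fence(obj->resv, done_fence);
                    return 0;
            }

            /* Shared slot: consumers that only synchronise against the
             * exclusive fence (e.g. scanout with implicit sync) won't be
             * made to wait on this job. */
            ret = dma_resv_reserve_shared(obj->resv, 1);
            if (ret)
                    return ret;

            dma_resv_add_shared_fence(obj->resv, done_fence);
            return 0;
    }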
<bbrezillon>
stepri01: ok, after looking at the KMS API more closely I get why GPU drivers take a sync_file FD and not a syncobj: that's what atomic plane updates return (passing a syncobj would require importing the sync_file first)
<bbrezillon>
and the NO_IMPLICIT mechanism makes more sense now. Thanks for the detailed explanation
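To make the sync_file point concrete: the fence that comes back from an atomic commit is a sync_file fd (via the OUT_FENCE_PTR CRTC property), so a submit interface that only accepted syncobjs would force the extra import step sketched below with standard libdrm calls. Property-ID lookup and most error handling are omitted.

    /* Sketch: get the commit's out-fence as a sync_file fd and wrap it in a
     * syncobj. out_fence_prop_id is assumed to have been looked up with
     * drmModeObjectGetProperties() beforehand.
     */
    #include <stdint.h>
    #include <unistd.h>
    #include <xf86drm.h>
    #include <xf86drmMode.h>

    static int out_fence_to_syncobj(int fd, drmModeAtomicReq *req,
                                    uint32_t crtc_id, uint32_t out_fence_prop_id,
                                    uint32_t *syncobj_out)
    {
            int out_fence_fd = -1;
            int ret;

            /* OUT_FENCE_PTR takes a userspace pointer; the kernel writes a
             * sync_file fd there when the commit is queued. */
            drmModeAtomicAddProperty(req, crtc_id, out_fence_prop_id,
                                     (uint64_t)(uintptr_t)&out_fence_fd);

            ret = drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_NONBLOCK, NULL);
            if (ret)
                    return ret;

            /* Wrap the sync_file in a syncobj so a submit ioctl that only
             * takes syncobj handles can wait on it. */
            ret = drmSyncobjCreate(fd, 0, syncobj_out);
            if (ret)
                    return ret;

            ret = drmSyncobjImportSyncFile(fd, *syncobj_out, out_fence_fd);
            close(out_fence_fd);
            return ret;
    }

A submit interface that takes a sync_file fd directly avoids the create/import pair entirely, which is the point made above.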
<raster>
and i had such good uptime.. :| about 2h! :)
<daniels>
stepri01: KMS synchronises against exclusive fences before making the framebuffer current, but that's the last involvement it has; as soon as it's synchronised against all fences placed before the commit was made, it doesn't do anything else related to fencing, including holding a shared reservation
<stepri01>
daniels: To be fair, I'm more familiar with how Android used to do these things; I'm not so familiar with KMS. There are also cases like a video encoder reading the buffer, e.g. for casting to a remote screen.
<daniels>
yep, the video encoder will take a shared fence, but I'd argue that the number of people doing active frontbuffer rendering (X11, XR) and simultaneous streaming from that frontbuffer is ... like none?
<stepri01>
I think it can be done on Android (cast while using the phone as a VR headset), but it was a while ago when I was involved in such discussions. Like you say it's pretty rare
<daniels>
racing the encoder against scanout seems pretty brave, but what do I know :P
<stepri01>
Yeah - I can't remember the details these days. The main thing is that you need to have a design that can at least support both independently. And it's much better if the GPU doesn't need to change too much based on the exact use case
<macc24_>
HdkR: how would you test if threaded context actually works?