#panfrost on 2020-10-28 — irc logs at freenode.irclog.whitequark.org

2019-09-06 11:20 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:21 alyssa has quit [Remote host closed the connection]

00:39 raster has quit [Quit: Gettin' stinky!]

00:56 stikonas has quit [Remote host closed the connection]

01:43 alyssa has joined #panfrost

01:47 kaspter has joined #panfrost

02:22 camus1 has joined #panfrost

02:24 kaspter has quit [Read error: Connection reset by peer]

02:24 camus1 is now known as kaspter

02:25 vstehle has quit [Ping timeout: 258 seconds]

02:57 archetech has quit [Quit: Leaving]

03:01 kaspter has quit [Ping timeout: 260 seconds]

03:02 kaspter has joined #panfrost

03:15 icecream95 has joined #panfrost

03:22 ezequielg has quit [Read error: Connection reset by peer]

03:22 ezequielg has joined #panfrost

03:30 icecream95 has quit [Quit: leaving]

04:25 davidlt has joined #panfrost

04:34 kaspter has quit [Ping timeout: 265 seconds]

04:34 kaspter has joined #panfrost

04:37 Green has quit [Quit: ...]

04:39 Green has joined #panfrost

04:52 kaspter has quit [Ping timeout: 256 seconds]

04:53 kaspter has joined #panfrost

05:09 kaspter has quit [Ping timeout: 265 seconds]

05:13 kaspter has joined #panfrost

05:15 davidlt has quit [Ping timeout: 272 seconds]

05:20 kaspter has quit [Ping timeout: 258 seconds]

05:20 kaspter has joined #panfrost

06:00 vstehle has joined #panfrost

06:28 archetech has joined #panfrost

07:29 _whitelogger has joined #panfrost

07:48 davidlt has joined #panfrost

08:33 <bbrezillon> warpme_: would you mind adding your Tested-by on danvet's patch?

09:03 raster has joined #panfrost

09:07 <warpme_> bbrezillon: oh yes. please :-)

09:09 stepri01 has joined #panfrost

09:33 <bbrezillon> warpme_: nevermind, it's already there

09:54 stikonas has joined #panfrost

11:35 archetech has quit [Quit: Textual IRC Client: www.textualapp.com]

11:56 yann|work is now known as yann

12:00 stepri01 has quit [Quit: leaving]

13:17 gcl_ has joined #panfrost

13:21 gcl has quit [Ping timeout: 272 seconds]

13:26 gcl has joined #panfrost

13:29 gcl_ has quit [Ping timeout: 260 seconds]

13:31 stepri01 has joined #panfrost

13:56 camus1 has joined #panfrost

13:56 kaspter has quit [Ping timeout: 258 seconds]

13:56 camus1 is now known as kaspter

14:21 <bbrezillon> stepri01: so, on second thought, I'm not sure having the drm_sched_start() call protected by the reset_lock makes things easier. We still have a race if a timeout handler is called before the reset_lock is released, and in that case we never reset the GPU.

14:27 stikonas has quit [Remote host closed the connection]

14:28 stikonas has joined #panfrost

14:30 alpernebbi has joined #panfrost

14:30 stikonas_ has joined #panfrost

14:31 stikonas has quit [Read error: Connection reset by peer]

14:31 archetech has joined #panfrost

14:56 stikonas_ is now known as stikonas

15:28 gcl_ has joined #panfrost

15:31 gcl has quit [Ping timeout: 240 seconds]

15:53 <stepri01> bbrezillon: yeah I think we need a better way of coordinating the two schedulers. Rather than assuming that if the mutex is held then nothing needs to be done, the code should somehow check if the reset is going to happen and either bail out (if the reset is going to happen), or block waiting for the mutex to trigger another reset if necessary

15:58 <stepri01> kbase has a kbase_prepare_to_reset_gpu() function (and friends) which keeps track of whether a reset is in progress or not and ensures the dance happens correctly. But equally kbase doesn't have to deal with two schedulers driving the same GPU which is Panfrost's problem

15:58 <bbrezillon> yep

16:03 <bbrezillon> and drm_sched_job_timedout() seems to expect the timeout handling to happen synchronously (it restarts the timer after calling ->job_timedout()), which is another problem

16:03 <bbrezillon> otherwise we could schedule another work to do the reset

16:04 <stepri01> yeah - kbase mostly makes reset asynchronous - it's natural for the GPU design, but sadly doesn't fit well with the DRM architecture

16:04 <stepri01> the other option is to simply stop using the reset hammer and actually handle job failure properly ;)

16:05 <bbrezillon> so maybe the solution is to allow asynchronous timeout handling

16:05 <stepri01> but there might be some bugs hiding which still require resets to recover from

16:05 <bbrezillon> yep, that's what I was about to ask

16:05 <bbrezillon> I guess we sometimes need a reset

16:05 <stepri01> kbase effectively has a watchdog for if the GPU stops behaving

16:05 <bbrezillon> so the problem remains

16:08 <bbrezillon> this being said, I'd like to find a fix that does not involve invasive changes :)
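
[editor's sketch] The coordination stepri01 describes at 15:53 and 15:58 (track whether a reset is already in progress, and have a second timeout handler bail out instead of assuming a held mutex means nothing needs doing) could look roughly like the following. This is a user-space model only, not kbase or Panfrost code: struct gpu_model, prepare_to_reset(), finish_reset() and handle_timeout() are hypothetical names, and the actual hardware reset and the drm_sched stop/start calls are left as a comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reset states; illustrative only, not taken from any driver. */
enum reset_state { RESET_IDLE = 0, RESET_PENDING = 1 };

/* Minimal stand-in for the per-device state shared by both schedulers. */
struct gpu_model {
    atomic_int reset_state; /* RESET_IDLE or RESET_PENDING */
    int reset_count;        /* only written by the handler that owns the reset */
};

/*
 * Try to claim the reset. Exactly one caller sees IDLE -> PENDING succeed;
 * every other caller learns that a reset is already on its way.
 */
static bool prepare_to_reset(struct gpu_model *gpu)
{
    int expected = RESET_IDLE;
    return atomic_compare_exchange_strong(&gpu->reset_state, &expected,
                                          RESET_PENDING);
}

/* Called by the owner once the reset is done; re-arms the state machine. */
static void finish_reset(struct gpu_model *gpu)
{
    assert(atomic_load(&gpu->reset_state) == RESET_PENDING);
    gpu->reset_count++;
    atomic_store(&gpu->reset_state, RESET_IDLE);
}

/*
 * Timeout handler for one scheduler. Returns true if this call performed
 * the reset, false if it bailed out because another handler owns it.
 */
static bool handle_timeout(struct gpu_model *gpu)
{
    if (!prepare_to_reset(gpu))
        return false; /* a reset is already pending: nothing to do here */

    /* A real driver would stop both schedulers, reset the GPU and
     * restart the schedulers at this point. */
    finish_reset(gpu);
    return true;
}
```

In this model a handler that loses the race simply returns; it does not cover the other case stepri01 mentions (blocking so that another reset can be triggered if still necessary), nor the synchronous-timeout expectation of drm_sched_job_timedout() raised at 16:03.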

16:14 <tomeu> can't we just mainline kbase? :p

16:16 <alyssa> please no

16:16 <alyssa> :p

16:50 <Lyude> no

16:50 <Lyude> my response means nothing but i don't like kbase :P

16:51 <alyssa> tomeu: A channel op said it, how can you disagree ? :p

16:53 <kinkinkijkin> im not entirely sure what kbase is in this context and that frightens me

16:54 <alyssa> kinkinkijkin: Arm's kernel driver for mali midgard+

16:54 <kinkinkijkin> oh THAT

16:54 <macc24> nope nope nope nope

16:54 <alyssa> Notoriously legacy code, that's why we're here ;p

16:54 <kinkinkijkin> might as well merge the libmali blob into mesa while you're at it

16:55 <kinkinkijkin> blobs*

16:57 <HdkR> #include <libmali.so>

17:05 <daniels> i mean that was basically lima

17:05 <macc24> daniels: huh?

17:06 <daniels> the original lima driver used kbase, did extremely primitive command-stream construction in standalone demo programs, and linked to Mali's offline shader compiler DSO to do all the compilation

17:06 <macc24> that's... cursed...

17:18 felipealmeida has quit [Ping timeout: 256 seconds]

17:19 felipealmeida has joined #panfrost

17:39 <narmstrong> Luc Verhaegen is a weird guy, he spent an infinite amount of time r-e, but when asked to make something that could be useful long-term, he says na I prefer hacking my stuff on my side and doing weird q3 demos while adapting the GL api to meet my hacks

17:40 <narmstrong> we lost 5 years until Qiang took over, we could have had lima upstream in linux & mesa in 2012

17:41 <narmstrong> it's insane

17:42 <HdkR> There's no need to bang on history. Everyone has different motivations that may not necessarily align with what people want.

17:45 <narmstrong> yeah no offense, he really changed things by r-e'ing lima

17:46 * alyssa <-- certified weirdo

17:47 <narmstrong> alyssa: all people on this channel could be certified weirdos :-)

17:48 <narmstrong> I mean idling on IRC on a GPU r-e dev channel

17:54 <alyssa> well.. :p

18:42 <kinkinkijkin> don't look at me i failed my weirdo certification failing to meet my exam date

18:43 <alyssa> 16 files changed, 777 insertions(+), 618 deletions(-)

18:43 <HdkR> Lucky 777

18:43 <alyssa> Let's see what CI says..

18:43 <kinkinkijkin> two deletions too far to meet 777 616

19:06 zkrx has quit [Quit: I'm done]

19:24 sphalerite has quit [Quit: nixos 20.09, here I come!]

19:34 raster has quit [Quit: Gettin' stinky!]

19:35 davidlt has quit [Ping timeout: 240 seconds]

19:40 zkrx has joined #panfrost

19:52 zkrx has quit [Quit: I'm done]

19:58 zkrx has joined #panfrost

20:41 raster has joined #panfrost

20:43 sphalerite has joined #panfrost

21:20 archetech has quit [Quit: Leaving]

21:28 Elpaulo has quit [Read error: Connection reset by peer]

21:30 Elpaulo has joined #panfrost

21:59 mmind00 has quit [*.net *.split]

22:01 mmind00 has joined #panfrost

22:20 kaspter has quit [Ping timeout: 260 seconds]

22:22 kaspter has joined #panfrost

22:27 remexre has quit [Read error: Connection reset by peer]

22:31 remexre has joined #panfrost

22:32 raster has quit [Quit: Gettin' stinky!]

22:36 raster has joined #panfrost

22:54 zkrx has quit [Quit: I'm done]

23:10 zkrx has joined #panfrost

23:29 alpernebbi has quit [Quit: alpernebbi]

23:45 karolherbst has quit [Quit: duh 🐧]