#panfrost on 2019-10-23 — irc logs at freenode.irclog.whitequark.org

2019-09-06 11:20 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:02 * urjaman bisected the Xorg/glamor hitting that assert i mentioned before

00:02 <urjaman> first bad commit: [b8c4fb235ef4055a14a9a2aec07f3f906ef8a841] pan/midgard: Implement SIMD-aware dead code elimination

00:03 <urjaman> "Results are meh." indeed xD

00:05 rhyskidd has joined #panfrost

00:23 rhyskidd has quit [Ping timeout: 276 seconds]

00:34 rhyskidd has joined #panfrost

00:45 <alyssa> urjaman: Quiet you ;p

00:50 <alyssa> urjaman: Mayhaps what HdkR said :)

00:51 <HdkR> I'm just crazy old man, I don't know what you're talking about

00:51 * alyssa is one of those three things

01:07 rhyskidd has quit [Ping timeout: 265 seconds]

01:23 _whitelogger has joined #panfrost

01:35 rhyskidd has joined #panfrost

01:40 * urjaman is maybe 2 of those things :P

01:41 <urjaman> anyways what i was trying to say before I updated mesa and crashed Xorg by breathing at it (yeah any interaction with the top bar, and alt-tabbing to another window could cause it)

01:42 <urjaman> was that somehow the scrolling-segfault (basically if you scroll a gtk app, including mousepad, thunar, transmission) has at some point disappeared and instead is an odd graphical glitch on scroll

01:48 <urjaman> https://youtu.be/8uRgc3MTbg4

01:55 rhyskidd has quit [Ping timeout: 265 seconds]

02:18 rhyskidd has joined #panfrost

03:11 rhyskidd has quit [Ping timeout: 265 seconds]

03:21 davidlt has joined #panfrost

03:22 rhyskidd has joined #panfrost

03:38 rhyskidd has quit [Ping timeout: 265 seconds]

04:04 rhyskidd has joined #panfrost

04:25 rhyskidd has quit [Quit: rhyskidd]

04:48 fysa has joined #panfrost

04:48 fysa has quit [Remote host closed the connection]

04:50 rhyskidd has joined #panfrost

04:51 megi has quit [Ping timeout: 265 seconds]

05:10 davidlt_ has joined #panfrost

05:11 rhyskidd has quit [Ping timeout: 265 seconds]

05:13 davidlt has quit [Ping timeout: 268 seconds]

05:16 davidlt_ has quit [Ping timeout: 240 seconds]

05:17 davidlt has joined #panfrost

05:35 rhyskidd has joined #panfrost

05:54 rhyskidd has quit [Ping timeout: 240 seconds]

06:14 fysa has joined #panfrost

06:19 fysa has quit [Ping timeout: 245 seconds]

06:23 rhyskidd has joined #panfrost

06:24 <tomeu> robmur01: for some reason, rc4 isn't working on the veyron: https://gitlab.freedesktop.org/tomeu/mesa/-/jobs/791418

06:24 <tomeu> will try to reproduce locally

06:39 rhyskidd has quit [Ping timeout: 246 seconds]

06:46 fysa has joined #panfrost

06:51 fysa has quit [Ping timeout: 265 seconds]

06:52 rhyskidd has joined #panfrost

06:53 <tomeu> couldn't reproduce with the old mesa I have around here, so it could be related to the parallelization changes

06:53 <tomeu> rebuilding mesa now

06:53 <tomeu> bbrezillon: any ideas?

07:11 rhyskidd has quit [Quit: rhyskidd]

07:21 rhyskidd has joined #panfrost

07:29 paulk-leonov has quit [Ping timeout: 240 seconds]

07:40 <bbrezillon> tomeu: wow

07:42 <bbrezillon> tomeu: 5.4-rc4 + mesa master?

07:42 raster has joined #panfrost

07:46 <tomeu> bbrezillon: yeah, but I cannot reproduce it in this veyron I have here

07:48 <tomeu> have run 7 instances of deqp in parallel and all seems fine

07:49 <bbrezillon> tomeu: what's the old kernel version, 5.2?

07:50 <bbrezillon> it's definitely a kernel bug, but it might be triggered by my batch parallelization changes, indeed

07:50 <tomeu> bbrezillon: master is using 5.3.0-rc8 atm

07:52 <bbrezillon> tomeu: and it works fine, right? I guess it's worth looking at the panfrost changes that were merged between 5.3 and 5.4-rc4

07:52 <bbrezillon> I don't have time to look at it this week, but I can give it a try after ELCE

07:53 paulk-leonov has joined #panfrost

07:53 rhyskidd has quit [Ping timeout: 245 seconds]

07:55 <tomeu> bbrezillon: will do that, but I guess what would be most important is to have a way for me to reproduce

07:55 <tomeu> guess I will try with using the same kernel as gitlab

08:06 RCF is now known as rcf

08:13 <tomeu> bbrezillon: ok, managed to reproduce with the kernel from gitlab

08:14 <tomeu> looks like the GPU isn't powered on

08:14 <tomeu> [ 50.643093] panfrost ffa30000.gpu: gpu sched timeout, js=1, config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=a1cce9cf

08:14 <tomeu> [ 51.163069] panfrost ffa30000.gpu: gpu sched timeout, js=0, config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=930af481

08:14 <tomeu> as we're reading zeroes from the job registers
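
A minimal sketch of the check tomeu describes, assuming the two timeout lines quoted above: it pulls the job-slot fields out of each line and flags the case where every register reads back as zero. The helper names (`parse_timeout`, `looks_unpowered`) are hypothetical, and "all zeroes means powered off" is only the heuristic suggested in the conversation, not a documented kernel guarantee.

```python
import re

# The two timeout lines quoted above, copied verbatim from the log.
LINES = [
    "[ 50.643093] panfrost ffa30000.gpu: gpu sched timeout, js=1, "
    "config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=a1cce9cf",
    "[ 51.163069] panfrost ffa30000.gpu: gpu sched timeout, js=0, "
    "config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=930af481",
]

# Only the numeric fields are matched; sched_job is a bare hex token
# with no 0x prefix, so it is deliberately left out.
FIELD = re.compile(r"\b(js|config|status|head|tail)=(0x[0-9a-fA-F]+|\d+)")
REGISTERS = ("config", "status", "head", "tail")


def parse_timeout(line):
    """Return the js/config/status/head/tail fields of a timeout line as ints."""
    return {key: int(value, 0) for key, value in FIELD.findall(line)}


def looks_unpowered(line):
    """Heuristic (assumption): every job register reading zero hints the GPU is off."""
    fields = parse_timeout(line)
    return all(fields[name] == 0 for name in REGISTERS)


if __name__ == "__main__":
    for line in LINES:
        print(parse_timeout(line)["js"], looks_unpowered(line))
```

Both quoted lines satisfy the heuristic, which is consistent with the "GPU isn't powered on" reading.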

08:26 wens has joined #panfrost

08:35 fysa has joined #panfrost

08:35 rhyskidd has joined #panfrost

08:35 <tomeu> robher: I have a vague recollection of talking about such a regression on veyron at plumbers

08:37 Green has joined #panfrost

08:39 fysa has quit [Ping timeout: 240 seconds]

08:47 <wens> what regression? just did a bisect on veyron-speedy

08:48 <wens> looks like the regression I reported a while back

08:48 <bbrezillon> tomeu: I wonder how it can trigger a kernel oops though

08:49 <wens> got it down to a handful of commits related to runtime PM

08:50 <wens> before commit 635430797d3f drm/panfrost: Rework runtime PM initialization, everything worked fine

08:51 <wens> the subsequent commits made X just crash, and after that bunch, the gpu timeout regressions

08:51 <wens> that was with some month old master branch of mesa. latest HEAD makes X crash.

08:51 <wens> about to drop back to 19.2

08:52 raster has quit [Quit: Gettin' stinky!]

08:52 <tomeu> wens: ah, it indeed seems to be it

08:52 <tomeu> I'm going to dig further on it, thanks for the info

08:52 <tomeu> awesome, managed to reproduce locally

08:53 <tomeu> seems to be dependent on the kconfig

08:54 rhyskidd has quit [Ping timeout: 265 seconds]

08:55 <urjaman> wens: if you want to run latest HEAD (or at least as of yesterday-ish) in Xorg, revert b8c4fb235ef4055a14a9a2aec07f3f906ef8a841

08:56 <urjaman> (of mesa that is)

09:00 <wens> tomeu: you want my kernel config?

09:02 <tomeu> wens: no need, was able to reproduce it

09:02 <tomeu> thanks

09:20 raster has joined #panfrost

09:23 rhyskidd has joined #panfrost

09:29 <tomeu> wens: guess you don't build panfrost as a module?

09:32 megi has joined #panfrost

09:37 <tomeu> looks like, when panfrost is built-in, the pm_runtime callbacks aren't being called

09:39 rhyskidd has quit [Ping timeout: 265 seconds]

09:52 <wens> tomeu: no, it's easier to have most or all drivers built-in, and just sign the zImage

09:52 <wens> I still have chromeos on my device

10:10 warpme_ has joined #panfrost

10:13 <tomeu> ok, that's why

10:16 <urjaman> why built-in or why pm callbacks not being called? :P

10:17 <wens> urjaman: why only some people run into it? :)

10:22 <tomeu> pm runtime callbacks not being called seems to be related to the driver being built-in

10:22 <tomeu> probably because it's probed before something else

10:39 rhyskidd has joined #panfrost

11:15 BenG83 has quit [Ping timeout: 252 seconds]

11:39 BenG83 has joined #panfrost

11:48 BenG83 has quit [Ping timeout: 240 seconds]

12:04 <tomeu> wens: if you give me a name and email address, I will credit you with reporting the bug

12:05 * tomeu has a one-liner to fix the kernel

12:07 BenG83 has joined #panfrost

12:09 <tomeu> wens: np, I have already found you :)

12:23 <tomeu> wens: testing welcome: https://lore.kernel.org/lkml/20191023122157.32067-1-tomeu.vizoso@collabora.com/

12:24 <wens> my name is probably the easiest to find in the kernel. it's in the maintainer's PGP guide :p

12:24 <wens> tomeu: I'll give it a spin tomorrow

12:24 <tomeu> nice!

12:25 <alyssa> urjaman: FWIW I have the wacky scroll glitch reproduced locally

12:28 <wens> tomeu: why is it v2 though?

12:31 <tomeu> wens: because I have been using gitlab MRs for too long a time and I botched git send-email

12:31 <tomeu> and I left your reported-by

12:33 <alyssa> I miss mailing lists.

12:33 <alyssa> Review on MRs is just so uncomfortable

12:33 <alyssa> I did learn about appending .patch to the URL (which is what I do with GitHub) yesterday, so that makes MR review massively less annoying

12:34 <alyssa> Can just open that all up locally and review like email
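
A small sketch of the ".patch" trick alyssa mentions, assuming a merge request URL of the usual GitLab shape. The function name `patch_url` and the example host are hypothetical; the only behaviour taken from the conversation is that appending `.patch` to the MR URL yields something that can be opened locally and reviewed like email.

```python
from urllib.parse import urlsplit, urlunsplit


def patch_url(mr_url):
    """Append .patch to a merge request URL, dropping any query or fragment."""
    parts = urlsplit(mr_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".patch"):
        path += ".patch"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    # Hypothetical URL, used only to show the shape of the result.
    print(patch_url("https://gitlab.example.invalid/group/project/-/merge_requests/1"))
```

The resulting file can then be opened in an editor and annotated inline, as described in the messages that follow.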

12:35 <tomeu> maybe after a while you will miss gitlab when having to send patches :p

12:36 <alyssa> Hmmm?

12:36 <alyssa> ML is more convenient in both directions

12:36 <alyssa> Only thing gitlab wins on is ease of applying other people's patches

12:36 <alyssa> and now the CI stuff

12:36 <tomeu> just mentioning that convenience is related to habit as well

12:37 <alyssa> GitLab is literally slower though

13:10 <TheCycoONE> heh

13:15 <alyssa> There. I downloaded the patch with .patch

13:15 <alyssa> Edited in vim like I was responding to mail

13:15 <alyssa> and pasted my responses in ```tags```

13:15 <alyssa> This Is Fine

13:27 <tomeu> everybody happy!

13:27 <alyssa> \o/

13:27 <alyssa> Compromise! :p

13:43 fysa has joined #panfrost

13:47 fysa has quit [Ping timeout: 276 seconds]

14:25 fysa has joined #panfrost

14:26 <tomeu> narmstrong: btw, are you planning to re-enable lima CI?

14:30 fysa has quit [Ping timeout: 265 seconds]

14:42 <narmstrong> tomeu: I should

14:59 fysa has joined #panfrost

15:04 fysa has quit [Ping timeout: 240 seconds]

15:19 enunes has quit [Read error: Connection reset by peer]

15:20 enunes has joined #panfrost

15:23 vstehle has quit [Quit: WeeChat 2.6]

15:25 vstehle has joined #panfrost

16:00 fysa has joined #panfrost

16:16 fysa has quit [Remote host closed the connection]

16:16 fysa has joined #panfrost

16:17 <narmstrong> tomeu: i will, when more h3 boards are hooked on the lab

17:02 <anarsoul> narmstrong: tomeu: what about these lafrite boards?

17:12 <narmstrong> anarsoul: jobs are sent to potato board, which has the same gpu and twice the memory

17:12 <anarsoul> I thought potato had midgard

17:13 <anarsoul> anyway, lafrite is more than enough to run deqp

17:25 <narmstrong> We have more potato boards now in the lab, but whatever, they share the same die; only the package differs between s905x and s805x

17:30 stikonas has joined #panfrost

17:34 <anarsoul> narmstrong: I see

17:54 raster has quit [Quit: Gettin' stinky!]

18:23 raster has joined #panfrost

18:30 adjtm has quit [Ping timeout: 252 seconds]

18:41 TheKit has quit [Remote host closed the connection]

18:45 enunes has quit [Ping timeout: 268 seconds]

19:42 enunes has joined #panfrost

19:57 adjtm has joined #panfrost

20:23 janrinze has quit [Remote host closed the connection]

20:24 davidlt has quit [Ping timeout: 276 seconds]

21:25 TheKit has joined #panfrost

21:54 raster has quit [Quit: Gettin' stinky!]

22:14 * urjaman has maybe figured out something about the memory corruption on exit of openscad

22:15 <urjaman> the valgrind run effectively said that the block it tries to touch (during context destruction) was freed during a previous context destruction

22:16 <urjaman> so i put a breakpoint on the context destruction and yeah openscad opens at least 3 contexts, and the canary assert happens on the third context destruction...
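
A toy model of one pattern that would produce a report like the valgrind one above: several contexts point at the same block, the first destruction frees it, and a later destruction touches it. This is purely illustrative; the class and function names are hypothetical, and nothing here claims to be what openscad or Mesa actually does. In this toy the failure appears on the second destruction, whereas the log reports the third; the log does not say which earlier destruction freed the block.

```python
class SharedBlock:
    """Stand-in for a heap block that several contexts point at."""

    def __init__(self):
        self.freed = False
        self.refs = 0


class Context:
    def __init__(self, block, refcounted):
        self.block = block
        self.refcounted = refcounted
        block.refs += 1

    def destroy(self):
        # Touching the block after it was freed is the bug being modelled.
        if self.block.freed:
            raise RuntimeError("block was freed by an earlier context destruction")
        self.block.refs -= 1
        # Without refcounting, the first destruction frees the shared block.
        if not self.refcounted or self.block.refs == 0:
            self.block.freed = True


def destroy_all(count, refcounted):
    """Destroy `count` contexts in order; return the 1-based index that fails, or None."""
    block = SharedBlock()
    contexts = [Context(block, refcounted) for _ in range(count)]
    for index, ctx in enumerate(contexts, start=1):
        try:
            ctx.destroy()
        except RuntimeError:
            return index
    return None


if __name__ == "__main__":
    print(destroy_all(3, refcounted=False))
    print(destroy_all(3, refcounted=True))
```

With reference counting the block is only released by the last context, so no destruction touches freed memory in this model.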