#panfrost on 2020-08-18 — irc logs at freenode.irclog.whitequark.org

2019-09-06 11:20 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:34 stikonas has quit [Remote host closed the connection]

00:42 yann has quit [Ping timeout: 246 seconds]

01:35 kaspter has joined #panfrost

02:03 kaspter has quit [Ping timeout: 240 seconds]

02:03 kaspter has joined #panfrost

03:13 buzzmarshall has quit [Remote host closed the connection]

03:27 l-as has quit [Ping timeout: 246 seconds]

03:28 milkii has quit [Ping timeout: 272 seconds]

03:29 clementp[m] has quit [Remote host closed the connection]

03:29 nhp[m] has quit [Remote host closed the connection]

03:29 Ke has quit [Remote host closed the connection]

03:29 milkii has joined #panfrost

03:34 marcodiego has quit [Quit: Leaving]

03:34 clementp[m] has joined #panfrost

03:53 l-as has joined #panfrost

03:53 Ke has joined #panfrost

03:53 nhp[m] has joined #panfrost

04:15 davidlt has joined #panfrost

04:48 tomboy65 has quit [Read error: Connection reset by peer]

05:15 tomboy65 has joined #panfrost

05:59 tomboy65 has quit [Read error: Connection reset by peer]

06:05 tomboy65 has joined #panfrost

06:38 guillaume_g has joined #panfrost

06:42 <tomeu> bbrezillon: indeed ouch :(

06:47 kaspter has quit [Ping timeout: 240 seconds]

06:48 kaspter has joined #panfrost

07:06 warpme_ has quit [Quit: Connection closed for inactivity]

07:07 raster has joined #panfrost

07:08 <tomeu> bbrezillon: alyssa: one more instance: https://gitlab.freedesktop.org/jekstrand/mesa/-/jobs/4117639

07:08 <tomeu> alyssa: could it be related to a recent change? I haven't seen those in a long time

07:14 camus1 has joined #panfrost

07:15 kaspter has quit [Ping timeout: 246 seconds]

07:15 camus1 is now known as kaspter

07:21 camus1 has joined #panfrost

07:22 kaspter has quit [Ping timeout: 256 seconds]

07:22 camus1 is now known as kaspter

07:24 unoccupied has joined #panfrost

07:24 BenG83 has joined #panfrost

07:38 BenG83 has quit [Ping timeout: 246 seconds]

07:49 <bbrezillon> tomeu: had a quick look at the T76X HW issues defined in mali_kbase, and nothing related to AS_ACTIVE stuck popped up

07:49 <bbrezillon> but I wonder if we shouldn't reset the GPU when that happens

07:52 <tomeu> hmm, I thought that precisely happened when resetting the GPU

07:53 <tomeu> maybe we should reset harder when that happens

07:54 <bbrezillon> there doesn't seem to be a timeout prior to those "AS_ACTIVE bit stuck" messages in https://gitlab.freedesktop.org/bbrezillon/mesa/-/jobs/4125895

07:56 <bbrezillon> my guess is that the timeout happens because the MMU is stuck and the job wants to do a flush/invalidate

08:00 <bbrezillon> oh, and we also ignore the return of write_cmd(), meaning that the MMU command might be skipped entirely without ever blocking the rest of the submission

08:11 yann has joined #panfrost

08:12 clementp[m] has quit [Quit: killed]

08:12 Ke has quit [Quit: killed]

08:12 nhp[m] has quit [Quit: killed]

08:13 l-as has quit [Quit: killed]

08:19 clementp[m] has joined #panfrost

08:22 <tomeu> ah, that's bad in itself

08:23 <tomeu> so what we should do: propagate errors and don't try to submit a job if we weren't able to prepare its AS, and reset the whole GPU if the MMU appears stuck?

08:27 <bbrezillon> sounds like a sane approach

08:34 <tomeu> hmm, or maybe propagate errors so submission fails, and reset the GPU whenever that happens?
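
The approach discussed above (propagate the write_cmd() failure, refuse to submit the job, reset the GPU when the MMU looks stuck) can be sketched as a small simulation. This is not the panfrost kernel code: FakeGPU, MMUStuckError and submit_job are hypothetical names, and the assumption that a full reset unsticks the MMU is exactly the open question in the conversation.

```python
class MMUStuckError(Exception):
    """Raised when the (simulated) MMU never clears its busy bit."""


class FakeGPU:
    """Stand-in for the hardware; every name here is hypothetical."""

    def __init__(self, mmu_stuck=False):
        self.mmu_stuck = mmu_stuck
        self.resets = 0
        self.submitted = []

    def write_cmd(self, cmd):
        # Report the failure instead of silently skipping the MMU command.
        if self.mmu_stuck:
            raise MMUStuckError(f"AS_ACTIVE bit stuck before {cmd}")

    def reset(self):
        # Assumption: a full GPU reset is enough to unstick the MMU.
        self.resets += 1
        self.mmu_stuck = False


def submit_job(gpu, job):
    """Submit a job only if its address space was prepared successfully."""
    try:
        gpu.write_cmd("flush/invalidate")
    except MMUStuckError:
        gpu.reset()  # recover the GPU ...
        raise        # ... and still fail this submission
    gpu.submitted.append(job)
```

With this shape the caller sees the failure for the affected job, while the next submission starts from a freshly reset GPU.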

08:39 l-as has joined #panfrost

08:39 Ke has joined #panfrost

08:39 nhp[m] has joined #panfrost

08:45 icecream95 has joined #panfrost

09:00 <icecream95> bbrezillon: tomeu: AS_ACTIVE got stuck three times on rk3288-veyron-jaq-cbg-0 this month but none on -1

09:00 <tomeu> hmm

09:00 <tomeu> and when did it happen for the first time?

09:01 <icecream95> I suspect that -0 needs more voltage when running the GPU at 600MHz than -1

09:03 <icecream95> The kernel used for CI still doesn't have dynamic voltage scaling, right?

09:05 <tomeu> not yet, indeed

09:05 <tomeu> could be that, let me check where the patches are

09:11 warpme_ has joined #panfrost

09:15 <icecream95> To confirm, try adding 'echo 600000000 >/sys/class/devfreq/*.gpu/min_freq' before dEQP runs and see if that causes -0 to fail

09:31 <icecream95> alyssa: I use zram for swap (zstd, currently 3.6G/5G used with 50% compression) and have never had OOM issues like you mention

09:35 <icecream95> I think this memory leak I reported 2 months ago is still at large: https://gitlab.freedesktop.org/snippets/1031

09:43 stikonas has joined #panfrost

09:57 <daniels> icecream95: that's a _really_ good spot, thank you! I know we've had issues with -0 and not -1 in the past; they should have identical firmware but even that shouldn't matter as the kernel sets up the whole clock tree; I wonder if it's having thermal issues, or if it's simply just a bit older and needs to be put out to pasture

10:01 <daniels> robmur01: could you please register an account on https://gitlab.freedesktop.org so I can harass you there? :)

10:13 <icecream95> daniels: Because dynamic voltage scaling isn't being used, the GPU is kept at the same low voltage the firmware sets it to. I think -1 can undervolt better, so still works at the low voltage, but -0 needs a higher voltage to be stable

10:13 <icecream95> Setting a maximum frequency of 400 MHz until voltage scaling arrives should make it more stable: echo 400000000 >/sys/class/devfreq/*.gpu/max_freq
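
For a CI setup script, the echo above could be wrapped in a small helper. This is only a sketch: cap_gpu_freq and its devfreq_root parameter are hypothetical names added so the function can be tried against a scratch directory, and writing to the real sysfs nodes needs root.

```python
import glob
import os


def cap_gpu_freq(max_hz, devfreq_root="/sys/class/devfreq"):
    """Write max_hz to every *.gpu/max_freq node; return the paths written."""
    pattern = os.path.join(devfreq_root, "*.gpu", "max_freq")
    written = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as node:
            node.write(f"{max_hz}\n")
        written.append(path)
    return written
```

Calling cap_gpu_freq(400_000_000) before the dEQP run would mirror the 400 MHz cap suggested here; an empty return value means no matching devfreq node was found.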

10:22 icecream95 has quit [Ping timeout: 240 seconds]

10:40 <robmur01> daniels: you know I'm just the pagetable guy, right? :P

10:41 <daniels> think of it as a personal growth plan?

10:42 <daniels> (more seriously, does this mean I should be tagging stepri01 for non-MMU things?)

10:47 <robmur01> Why yes Office 365, the confirmation email most definitely deserves to be quarantined as a phishing attempt. Sigh...

10:48 <daniels> Office365 is generally pretty skeptical of fd.o due to the volume of spam which comes through Mailman

10:48 <robmur01> daniels: technically Steve and RobH are more officially involved than I am

10:48 <daniels> sure :)

10:49 <robmur01> I'm mostly squeezing it under my general "upstream kernel support" remit because it's more fun and interesting than reviewing SMMU patches ;)

10:53 <robmur01> anyway, I'm in - usual work username because laziness

11:25 davidlt has quit [Ping timeout: 246 seconds]

11:37 nlhowell has quit [Ping timeout: 246 seconds]

11:40 davidlt has joined #panfrost

11:46 <alyssa> tomeu: bbrezillon: I first saw that with the genxml attribute/varying series but I couldn't bisect it since nondeterminism and nothing stood out as wrong so I thought it was a fluke..

11:47 <alyssa> "think of it as a personal growth plan?" lol

11:48 <alyssa> robmur01: the trick is to just quarantine EVERYTHING, as 2020 has taught us :p

11:49 <alyssa> icecream95: hm, interesting. It's certainly a lot better on 4gb than 2gb as mentioned. I also run without swap at all since I'm stubborn, so that isn't helping :)

11:55 nlhowell has joined #panfrost

11:59 <alyssa> icecream95: Oh, TIL about heaptrack, neat!

11:59 <alyssa> seems a lot more pleasant to use for leaks than valgrind :)

12:01 <tomeu> bbrezillon: ouch :( https://gitlab.freedesktop.org/tomeu/mesa/-/jobs/4135752

12:02 <bbrezillon> tomeu: is it caused by my patch?

12:02 <bbrezillon> do you have a branch I can look at?

12:03 <tomeu> bbrezillon: no, yours isn't in https://gitlab.freedesktop.org/tomeu/linux/-/tree/v5.8-for-mesa-ci

12:03 <tomeu> wonder if any of the panfrost patches I backported depend on stuff outside of panfrost

12:06 <tomeu> trying now with drm-misc-next

12:15 nlhowell has quit [Ping timeout: 246 seconds]

13:06 BenG83 has joined #panfrost

13:19 raster has quit [Remote host closed the connection]

13:21 raster has joined #panfrost

13:24 kaspter has quit [Quit: kaspter]

13:53 <macc24> does panfrost support S3TC textures?

14:01 nlhowell has joined #panfrost

14:03 davidlt has quit [Remote host closed the connection]

14:03 davidlt has joined #panfrost

14:16 tgall_foo has quit [Quit: Textual IRC Client: www.textualapp.com]

14:28 <alyssa> macc24: yes, if your device supports it

14:28 <alyssa> rk3399 does, rk3288 does not

14:29 <macc24> :(

14:30 <macc24> alyssa: can those textures be implemented on rk3288 in software?

14:34 <urjaman> i think mesa provides a software implementation ... or was it for some other required format?

14:34 <macc24> when i try to run anything that needs s3tc it complains that it is missing

14:35 <urjaman> okay was something else then (or related to details about when it is mandatory...)

14:51 tgall_foo has joined #panfrost

14:59 <macc24> https://www.reddit.com/r/hmmmgifs/comments/ic011q/hmmm/

14:59 <macc24> oops

15:02 raster has quit [Quit: Gettin' stinky!]

15:02 davidlt has quit [Read error: Connection reset by peer]

15:09 raster has joined #panfrost

15:23 davidlt has joined #panfrost

16:01 guillaume_g has quit [Quit: Konversation terminated!]

16:05 Elpaulo has quit [Read error: Connection reset by peer]

16:07 Elpaulo has joined #panfrost

16:24 BenG83 has quit [Ping timeout: 246 seconds]

16:30 <HdkR> alyssa: Can confirm, heaptrack is great

16:30 <HdkR> Really helped me smash down small allocations

16:56 <alyssa> HdkR: :D

17:22 <HdkR> TFW hunting a SIGBUS in an application that catches SIGBUS

17:35 raster has quit [Remote host closed the connection]

17:42 <alyssa> ;-;

17:46 <HdkR> What's even more fun is that it seems to be a SIGBUS that my SIGBUS handler just doesn't catch ¯\_(ツ)_/¯

17:48 <HdkR> Oh frick frack, I missed a commit, so sigprocmask was killing it :|
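
The interaction between a signal mask and a handler can be shown with a minimal sketch (Python 3.8+ on a POSIX system, run from the main thread). It is not HdkR's code. It sends SIGBUS with raise(), so the blocked signal merely stays pending; a SIGBUS generated by a real fault while blocked is a different case, where POSIX leaves the result undefined and Linux is understood to terminate the process, which would match the behaviour described above.

```python
import signal

caught = []


def on_sigbus(signum, frame):
    # Record that the handler actually ran.
    caught.append(signum)


signal.signal(signal.SIGBUS, on_sigbus)

# Block SIGBUS, as a stray sigprocmask() call might.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGBUS})
signal.raise_signal(signal.SIGBUS)

# The signal is pending, and the handler has not run yet.
blocked_pending = signal.SIGBUS in signal.sigpending()
caught_while_blocked = list(caught)

# Unblocking delivers the pending signal to the handler.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGBUS})
```

After the last line, caught holds one SIGBUS entry, while caught_while_blocked is still empty.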

18:02 <urjaman> catching SIGBUS sounds oddly like you're talking public transit :P

18:04 <HdkR> I tried to catch the SIGBUS but it turns out it was SIGILL

18:05 gcl_ has joined #panfrost

18:06 gcl has quit [Ping timeout: 240 seconds]

18:18 gcl_ has quit [Ping timeout: 246 seconds]

18:19 <HdkR> Oh, Valhal device is arriving today

18:19 <HdkR> Valhall even

18:25 gcl has joined #panfrost

18:27 <Lyude> HdkR: see you soon in Valhall[a]

18:27 <HdkR> :P

18:28 davidlt has quit [Ping timeout: 240 seconds]

18:33 stikonas has quit [Remote host closed the connection]

18:33 davidlt has joined #panfrost

18:34 jgmdev has joined #panfrost

18:38 jgmdev has quit [Client Quit]

18:56 AreaScout_ has quit [Ping timeout: 240 seconds]

19:02 ezequielg has quit [Read error: Connection reset by peer]

19:03 enunes has quit [Read error: Connection reset by peer]

19:04 enunes has joined #panfrost

19:09 ezequielg has joined #panfrost

19:20 enunes has quit [Ping timeout: 240 seconds]

19:46 davidlt has quit [Ping timeout: 256 seconds]

19:49 stikonas has joined #panfrost

19:49 enunes has joined #panfrost

20:04 buzzmarshall has joined #panfrost

20:35 <Lyude> tomeu: do you have any idea how the panfrost tests in IGT get built for autotools? I thought this would be more obvious but I don't see anything listed in tests/Makefile.sources

20:35 <alyssa> Lyude: *distant voice* they don't

20:36 <Lyude> alyssa: figured it might be something like that, I'm just a little surprised because it seems like something we test in CI according to the gitlab pipeline from here: https://patchwork.freedesktop.org/series/74811/

20:36 <Lyude> oh wait

20:36 <Lyude> duh, it says right there | grep -v vc4\|v3d\|panfrost

20:36 * alyssa shrugs

20:37 * Lyude has answered her question :), will just make nouveau exempt from that check as well

20:41 <HdkR> Valhall is here

20:42 <alyssa> HdkR: Woo!

20:42 <alyssa> what about godot

20:43 <HdkR> pfft

20:43 <HdkR> We seem to have a fun person over here

20:48 <alyssa> Who, Godot?

20:53 enunes has quit [Quit: ZNC - https://znc.in]

20:54 <HdkR> :>

20:55 <HdkR> Oh wow, it didn't even have the typical setup process on it

21:02 <HdkR> Now let's see if I can remember how to build something using the Android NDK

21:05 <alyssa> :d

21:09 <HdkR> There we go, did it

21:13 * HdkR generates new es2_info

21:14 BenG83 has joined #panfrost

21:21 BenG83 has quit [Quit: Leaving]

21:28 <HdkR> https://pastebin.com/LcAJV06p

21:28 <HdkR> There we go. G77 es2_info

21:29 <HdkR> and of course my desktop would lock up right when I send that

21:29 <alyssa> and it's a he! hi! coming down the plains!

21:30 <HdkR> lol

21:30 <HdkR> TFW a USB hub causes a kernel panic

21:31 <alyssa> Android called?

21:31 <alyssa> (TFW video conferencing causes a kernel panic, my laptop called)

21:31 <HdkR> There's a couple of devices in a chain, hard to know exactly which one caused it

21:32 <HdkR> I blame the final hub, it has had issues in the past

21:34 * HdkR orders a new less derpy hub

21:34 <alyssa> muffin?

21:34 <HdkR> zucchini cake

21:36 <HdkR> Now where did I put that triangle drawing code

21:41 <HdkR> Alright, now where is pantrace

21:52 <alyssa> I ate it

21:52 <alyssa> ("the muffin?" "no, pantrace.")

22:17 <HdkR> Hm, looks like panwrap is looping on its injection points?

22:22 <HdkR> Hm, what is /vendor/etc/meow.cfg

22:22 <urjaman> lmao

22:25 <HdkR> Hmmmm, why is it opening `/proc/getppid()/cmdline` a dozen times and pushing ioctls through that

22:25 <HdkR> Things are happening here

22:42 austriancoder has quit [Ping timeout: 244 seconds]

22:43 lvrp16 has quit [Read error: Connection reset by peer]

22:51 austriancoder has joined #panfrost

22:52 lvrp16 has joined #panfrost

23:12 <HdkR> Yea, definitely missing files being opened

23:13 <HdkR> FDs jump from 6 to 12 without catching what is opened in between

23:14 <HdkR> If this was Linux I could just strace

23:23 <alyssa> HdkR: meow.cfg, beautiful

23:24 <HdkR> :D

23:28 <HdkR> LD_PRELOAD is behaving a bit weirdly as well. constructor is running after a few logs are being pushed through...?

23:35 raster has joined #panfrost

23:36 macc24_ has joined #panfrost

23:36 macc24_ has quit [Client Quit]

23:38 yann has quit [Ping timeout: 256 seconds]

23:50 yann has joined #panfrost