#panfrost on 2020-12-17 — irc logs at freenode.irclog.whitequark.org

2019-09-06 11:20 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:09 tgall_foo has joined #panfrost

00:30 raster has quit [Quit: Gettin' stinky!]

00:30 archetech has quit [Quit: Konversation terminated!]

00:36 stikonas has quit [Remote host closed the connection]

00:54 archetech has joined #panfrost

01:24 warpme_ has quit [*.net *.split]

01:24 leper` has quit [*.net *.split]

01:24 cphealy has quit [*.net *.split]

01:24 jschwart has quit [*.net *.split]

01:24 milkii has quit [*.net *.split]

01:24 Stary has quit [*.net *.split]

01:24 lvrp16 has quit [*.net *.split]

01:24 dschuermann has quit [*.net *.split]

01:24 Stenzek has quit [*.net *.split]

01:24 mifritscher has quit [*.net *.split]

01:24 chrisf has quit [*.net *.split]

01:26 enty has joined #panfrost

01:27 ente has quit [Read error: Connection reset by peer]

01:28 popolon has quit [Quit: WeeChat 2.9]

01:30 warpme_ has joined #panfrost

01:30 milkii has joined #panfrost

01:30 jschwart has joined #panfrost

01:30 dschuermann has joined #panfrost

01:30 Stary has joined #panfrost

01:30 chrisf has joined #panfrost

01:30 Stenzek has joined #panfrost

01:30 lvrp16 has joined #panfrost

01:30 leper` has joined #panfrost

01:30 cphealy has joined #panfrost

01:30 mifritscher has joined #panfrost

01:51 <alyssa> https://docs.mesa3d.org/drivers/panfrost.html

01:52 <HdkR> woo

02:03 SolidHal has quit [Quit: Ping timeout (120 seconds)]

02:05 <anarsoul> alyssa: lima doesn't support mali200

02:05 <anarsoul> it supports mali450 though

02:05 <anarsoul> we didn't port any quirks for it from libv's driver

02:05 <anarsoul> thanks for mentioning lima anyway :)

02:05 <alyssa> anarsoul: We accept patches :p

02:11 kaspter has joined #panfrost

02:15 vstehle has quit [Ping timeout: 264 seconds]

03:44 kaspter has quit [Ping timeout: 260 seconds]

03:44 kaspter has joined #panfrost

04:17 hl has quit [Ping timeout: 256 seconds]

04:20 hl has joined #panfrost

05:17 archetech has quit [Quit: Konversation terminated!]

05:56 nlhowell has joined #panfrost

06:00 tgall_foo has quit [Quit: My iMac has gone to sleep. ZZZzzz…]

06:00 vstehle has joined #panfrost

06:49 davidlt has joined #panfrost

07:15 tomboy64 has quit [Ping timeout: 240 seconds]

07:17 tomboy64 has joined #panfrost

07:49 karolherbst has joined #panfrost

09:09 raster has joined #panfrost

09:46 <tomeu> alyssa: regarding the INSTR_ENC flakiness, maybe we could catch such exceptions and dump the shader to a file?

09:47 <tomeu> then diff it with the same shader when it succeeds

09:56 tgall_foo has joined #panfrost

10:00 camus has joined #panfrost

10:01 stikonas has joined #panfrost

10:01 kaspter has quit [Ping timeout: 240 seconds]

10:01 camus is now known as kaspter

10:48 jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

10:48 jernej has joined #panfrost

10:49 dstzd has quit [Quit: ZNC - https://znc.in]

10:51 jernej has quit [Client Quit]

10:53 jernej has joined #panfrost

10:59 kaspter has quit [Ping timeout: 268 seconds]

10:59 kaspter has joined #panfrost

11:07 dstzd has joined #panfrost

11:40 stikonas has quit [Remote host closed the connection]

11:40 stikonas has joined #panfrost

11:51 <robmur01> unless of course it's a GPU-side caching issue - in that case it might possibly be worth trying to have a GPU job memcpy the shader somewhere else and flush that out properly, so you can then maybe get some idea of what it saw vs. what was supposed to be there

11:54 <macc24> alyssa: wooo, documentation

11:54 karolherbst has quit [Remote host closed the connection]

12:11 Venemo has quit [Quit: ZNC 1.7.3 - https://znc.in]

12:11 Venemo has joined #panfrost

12:17 Venemo has quit [Quit: ZNC 1.8.2 - https://znc.in]

12:18 Venemo has joined #panfrost

12:25 kaspter has quit [Quit: kaspter]

12:34 <alyssa> robmur01: Purely compiler side changes shouldn't affect that.. I hope

12:40 <robmur01> sure, Occam's razor says the first and only thing to do at this point is confirm whether we really are always generating what we think we're generating

12:40 <robmur01> I just like to think ahead :)

12:41 <alyssa> :)

12:55 <alyssa> robmur01: Botched packing.

12:55 <alyssa> Slept on it and it was obvious :)

12:56 <robmur01> well yeah, it's probably all squashed if you slept on it :P

12:56 <alyssa> ?

12:56 <urjaman> too early for alyssa? :P

12:57 <alyssa> urjaman: little bit :p

12:57 <urjaman> (the packing, if you slept on it ... )

12:58 <alyssa> ah

13:05 <alyssa> Hmmm

13:11 <alyssa> Goes away if I force constants with every clause but that's.. wasteful..

13:17 <alyssa> oh, that's embarrassing

13:23 * alyssa was missing message types on a few instructions

13:23 <alyssa> though... still having issues...

13:31 kaspter has joined #panfrost

13:36 dstzd has quit [Quit: ZNC - https://znc.in]

13:37 jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

13:37 dstzd has joined #panfrost

13:38 dstzd has quit [Client Quit]

13:39 dstzd has joined #panfrost

13:40 dstzd has quit [Client Quit]

13:41 dstzd has joined #panfrost

13:41 jernej has joined #panfrost

13:45 karolherbst has joined #panfrost

13:57 stikonas has quit [Remote host closed the connection]

13:58 stikonas has joined #panfrost

13:59 <kinkinkijkin> i now have access to a fire tablet 7

13:59 <kinkinkijkin> 2019 i believe

14:13 <alyssa> 2019.. those were the days

14:13 popolon has joined #panfrost

14:13 <alyssa> bbrezillon: Can you confirm that dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darrayshadow_vertex is failing even with your fixes?

14:14 <alyssa> Trying to make sure this isn't a regression

14:15 <macc24> kinkinkijkin: cool

14:17 <macc24> kinkinkijkin: maybe-useful link: https://developer.amazon.com/docs/fire-tablets/ft-device-specifications-fire-models.html

14:36 stikonas has quit [Remote host closed the connection]

14:37 stikonas has joined #panfrost

14:40 <alyssa> daniels: InfrastructureError? https://gitlab.freedesktop.org/alyssa/mesa/-/jobs/6231655

14:40 <bbrezillon> alyssa: will do, I just want to finish something on the AFBC series first

14:40 <alyssa> 👍

14:41 <alyssa> bbrezillon: no rush.. probably will run out of time for the week shortly though.. not sure how we want to handle that with christmas coming up

14:42 <daniels> alyssa: line 608 -> 'network gremlins, couldn't pull my kernel'

14:45 <alyssa> I just write compilers 🤷

14:47 <daniels> -> whack retry (I did)

14:50 <macc24> does ping reach the machine consistently?

14:56 alyssa has quit [Remote host closed the connection]

14:57 alyssa has joined #panfrost

14:57 <alyssa> Pass: 15241, Fail: 1069, Crash: 4, UnexpectedPass: 1,

14:57 <alyssa> 93%, that's still an A! ;p
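The 93% figure can be checked from the counts alyssa pasted. This is only a sketch of the arithmetic: it assumes the four categories shown are the whole run (the pasted line ends in a trailing comma, so more may have been cut off) and that only "Pass" counts as passing.

```python
# Counts copied from the pasted dEQP summary line.
results = {"Pass": 15241, "Fail": 1069, "Crash": 4, "UnexpectedPass": 1}

# Assumption: these four categories make up the entire run.
total = sum(results.values())
pass_rate = results["Pass"] / total

print(f"{total} tests, {pass_rate:.1%} passing")
```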

14:57 <macc24> only 4 crashes!

14:58 <alyssa> dEQP-GLES2.functional.uniform_api.random.79 is the unexpected pass

14:58 <macc24> what's dEQP?

15:08 <HdkR> Flakey network? Nobody has time for that.

15:08 <HdkR> Also me, my wifi access point restarts every month automatically

15:13 <HdkR> At least that is only two minutes of down time and a few hours of inconsistent performance for non essential devices...

15:47 zkrx has quit [Quit: I'm done]

16:09 nlhowell has quit [Ping timeout: 246 seconds]

16:14 <bbrezillon> alyssa: I'll try to review/test your MR tomorrow (I was planning to do it today, but ended up fixing other AFBC issues)

16:22 <alyssa> bbrezillon: definitely not ready for reviewing let alone testing

16:23 <alyssa> Just wanted to let people know it's there

16:23 <alyssa> handling swizzles well will involve rewriting a lot of the history again

16:25 <macc24> who said testing?

16:26 <alyssa> bbrezillon: did

16:26 <alyssa> macc24: test his AFBC code o:)

16:37 <macc24> where

16:37 <macc24> bifrost?

17:07 <daniels> macc24: ping doesn't reach the machine consistently because it's mostly powered off

17:09 kaspter has quit [Quit: kaspter]

17:12 nlhowell has joined #panfrost

17:14 nlhowell has quit [Client Quit]

17:14 <alyssa> how does that work? :o

17:15 <alyssa> oh, I guess LAVA controls power

17:16 <daniels> indeed

17:16 <daniels> because no use booting on a device where someone has terminally wedged the GPU

17:16 <daniels> or put it into swap death

17:16 <daniels> or anything

17:16 nlhowell has joined #panfrost

17:17 <alyssa> oh hey I do both of those things on the daily ;p

17:18 <macc24> someone (probably) broke kexec on arm64 :(

17:19 <daniels> shrug, kexec just adds complexity rather than improves the situation

17:19 <macc24> kexec is really useful for me

17:19 <macc24> i don't have to dd kernel onto kernel partition and hope that it boots up

17:27 <macc24> oh, and kinkinkijkin, don't get attached to your current cadmium installation ;), it's likely that cadmium will get support for updating kernel with multiple partitions to have theoretically unbrickable device

17:27 rando25892 has quit [Read error: Connection reset by peer]

17:40 <tlwoerner> is there a trick to getting kmscube running on a rock-pi-4?

17:40 <macc24> kmscube -d /dev/dri/card1?

17:41 <tlwoerner> https://pastebin.com/yJgLh56u

17:42 <tlwoerner> macc24: device is -D, but same

17:42 <macc24> uhh

17:42 <macc24> i think i left stove on at home

17:42 macc24 has left #panfrost ["WeeChat 2.9"]

17:43 <tlwoerner> lol :-)

18:02 macc24 has joined #panfrost

18:03 karolherbst has quit [Quit: duh 🐧]

18:05 <robmur01> tlwoerner: IIRC at some point not too long ago there was some impedance mismatch between Mesa and the Rockchip DRM driver around modifiers and AFBC

18:06 <robmur01> it's working fine on my RK3399 with Mesa 20.3 and kernel 5.8

18:10 <tlwoerner> robmur01: i'm using mesa 20.2.4 and kernel 5.8.18

18:10 <tlwoerner> robmur01: any pointers where i can start looking?

18:12 <tlwoerner> robmur01: ahh, sorry, i'll try a more recent mesa (duh!)

18:14 <robmur01> although my kmscube no longer prints the "Using modifier ..." line at all :/

18:16 <tlwoerner> i'm using the very tip of master of kmscube (e6386d1b99366ea7559438c0d3abd2ae2d6d61ac)

18:19 br_ has joined #panfrost

18:20 zkrx has joined #panfrost

18:23 <br_> robmur01: I think our pmap is mostly compatible with mali MMU, all I need is to change l3 pte a bit (revert AF bit and set permissions in Mali way).

18:24 <br_> It looks like mali is reading instructions from the jc provided by the first job

18:24 <br_> but then it seems it tries to access VA from another GEM buffer, and I get page fault at level 2

18:25 <br_> which is strange because it is mapped

18:26 <robmur01> br_: are you starting at level 0?

18:26 <br_> yes

18:26 <br_> so jc is at 0x3010740

18:26 <br_> l0 97b0000 val 934d003

18:26 <br_> l1 934d000 val 9553003

18:26 <br_> l2 95530c0 val d386d003

18:27 <br_> l3 d386d080 val de2300c1

18:27 <br_> this works.

18:27 <br_> But another address (0x4000000) does not work -> page fault at l2

18:27 <br_> l0 97b0000 val 934d003

18:27 <br_> l1 934d000 val 9553003

18:27 <br_> l2 9553100 val d385e003

18:27 <br_> l3 d385e000 val e68000c1
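The two dumps above are consistent with a plain 4-level walk. The sketch below recomputes the entry addresses br_ printed, assuming 4 KiB pages, 9 index bits per level and 8-byte entries (an assumption that happens to reproduce every address in the dump). The function name and the table-base lists are illustrative; the bases are the dumped values with the low attribute bits masked off.

```python
# Sketch: locate the page-table entry consulted at each level for a VA,
# assuming 4 KiB pages, 512 entries per table and 8-byte entries.
ENTRY_SIZE = 8
PAGE_SHIFT = 12
BITS_PER_LEVEL = 9

def pte_addresses(va, table_bases):
    """Return the entry address used at levels 0..3.

    table_bases[i] is the physical base of the level-i table for this VA.
    """
    addrs = []
    for level, base in enumerate(table_bases):
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (3 - level)
        index = (va >> shift) & ((1 << BITS_PER_LEVEL) - 1)
        addrs.append(base + index * ENTRY_SIZE)
    return addrs

# Job chain address that works, and the address that faults.
working = pte_addresses(0x3010740, [0x97b0000, 0x934d000, 0x9553000, 0xd386d000])
faulting = pte_addresses(0x4000000, [0x97b0000, 0x934d000, 0x9553000, 0xd385e000])

print([hex(a) for a in working])
print([hex(a) for a in faulting])
```

Both walks share the same level-0 and level-1 entries and only diverge at level 2, which is worth keeping in mind when reading the fault level reported by the hardware.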

18:29 tlwoerner has quit [Remote host closed the connection]

18:29 tlwoerner has joined #panfrost

18:31 <alyssa> br_: FWIW, Mali doesn't have a command stream, it has a "job chain", i.e. a linked list of "job descriptor" data structures in GPU memory

18:31 <alyssa> jc = job chain

18:31 <alyssa> Kernel space doesn't really care, but it's good to keep in mind

18:32 <alyssa> In particular the execute bit on the MMU is only supposed to be set for shaders, not job descriptors.

18:32 <br_> I have page fault at l2 on write

18:35 <robmur01> which fault status code exactly?

18:38 <br_> panfrost_mmu_intr: fault status 7c2003c2

18:45 <robmur01> ah, so it's Mali "level 2" which actually means level 1, i.e. the exact same PTE at 934d000 :/
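A rough sketch of how the status word 7c2003c2 might be pulled apart. The field layout (exception type in the low byte, access type in bits 8 and 9, source ID in the upper half) is written from memory of the Linux panfrost driver and should be treated as an assumption, as should reading the low bits of the exception type as the level. The "minus one" step is just robmur01's remark that Mali "level 2" means level 1.

```python
# Assumed layout of the MMU fault status word; not verified against hardware docs.
ACCESS_NAMES = {0: "ATOMIC", 1: "EXECUTE", 2: "READ", 3: "WRITE"}

def decode_fault_status(status):
    exception_type = status & 0xFF
    access_type = (status >> 8) & 0x3
    source_id = status >> 16
    # Assumption: low bits of the exception type carry Mali's level number.
    mali_level = exception_type & 0x7
    return {
        "exception_type": exception_type,
        "access": ACCESS_NAMES[access_type],
        "source_id": source_id,
        "mali_level": mali_level,
        # Per robmur01: Mali "level 2" is level 1 in the numbering br_ dumped.
        "walk_level": mali_level - 1,
    }

fault = decode_fault_status(0x7C2003C2)
print(fault)
```

Under these assumptions the access type decodes as a write, which matches br_'s "page fault at l2 on write".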

18:50 <br_> note that I dumped ptes after page fault received for both addresses

18:52 <anarsoul> alyssa: starting with Midgard

18:52 <anarsoul> Utgard does have a command stream

18:54 <alyssa> anarsoul: I know, br_ isn't working on Utgard though :)

18:54 <alyssa> as a general rule any time I say Mali unqualified I mean Midgard or newer..

18:54 <anarsoul> right :)

18:55 <anarsoul> that's alyssa's Mali :)

18:56 rando25892 has joined #panfrost

18:57 <alyssa> that's me! :p

18:57 <alyssa> it's less confusing than the convention of calling Midgard+ just Midgard

18:58 <alyssa> as in "the Midgard driver for G76"

19:03 <robmur01> don't forget that unqualified "Mali" also includes display, video and camera :P

19:03 <anarsoul> hehe

19:04 <anarsoul> well, at least model numbering is not confusing

19:04 <anarsoul> for now

19:05 * robmur01 wonders if not having a command stream might actually start at the very beginning with Mali-55

19:06 <anarsoul> robmur01: well, it didn't have vertex shaders (so no cmd stream there) and I'm not sure about tielr

19:06 <anarsoul> *tiler

19:06 <robmur01> but I don't feel like going and looking up how fixed-function pipelines work when it's dinnertime... :P

19:11 <robmur01> anarsoul: bah, you win! The phrase "command stream" does appear exactly once in the Mali 55 TRM :)

19:11 <alyssa> Falanx days?

19:12 <anarsoul> it was fixed-function?

19:12 <anarsoul> I wonder how it evolved into mali200...

19:12 <alyssa> anarsoul: ES1 hardware!

19:12 <anarsoul> alyssa: it didn't need to be fixed-function :)

19:12 <alyssa> robmur01: Does Arm still sell the DDK for Mali 55? ;P

19:13 <robmur01> I'm gonna guess... noooooooo.

19:15 <alyssa> Planned obsolescence smh

19:15 <alyssa> ;p

19:15 <anarsoul> what about utgard?

19:15 <anarsoul> is it even supported? :)

19:16 <alyssa> anarsoul: in mesa! :p

19:16 <anarsoul> it's still REd driver :P

19:17 <robmur01> still an active product! https://www.arm.com/products/silicon-ip-multimedia/gpu/mali-470

19:17 <anarsoul> courtesy of libv and cwabbott :)

19:17 <anarsoul> btw, mali470 is not supported by lima

19:17 <anarsoul> I grepped the blob and it appears that 470 has some ISA changes

19:18 <anarsoul> but who knows, maybe collabora will pick up lima dev if there's enough interest from customers

19:18 <anarsoul> s/dev/development

19:29 zkrx has quit [Quit: I'm done]

19:33 zkrx has joined #panfrost

19:57 vstehle has quit [Quit: WeeChat 2.9]

20:26 zkrx has quit [Quit: I'm done]

20:30 Lyude has quit [Ping timeout: 240 seconds]

20:30 zkrx has joined #panfrost

20:31 davidlt has quit [Ping timeout: 240 seconds]

20:32 Lyude has joined #panfrost

20:45 raster has quit [Quit: Gettin' stinky!]

20:46 zkrx has quit [Quit: I'm done]

20:53 zkrx has joined #panfrost

21:02 zkrx has quit [Quit: I'm done]

21:07 zkrx has joined #panfrost

21:08 raster has joined #panfrost

21:13 vstehle has joined #panfrost

21:49 icecream95 has joined #panfrost

21:53 <tlwoerner> robmur01: whoa, "drm, surfaceless" are no longer available in mesa-20.3.1?

21:53 <macc24> icecream95: correction: there are 5 debian-specific lines in cadmium :D

21:53 <icecream95> tlwoerner: IIRC it's always enabled now

21:53 <tlwoerner> okay, so maybe "auto"?

21:54 <tlwoerner> phew

21:55 <icecream95> macc24: And as of three hours ago, one line that is liable to cause filesystem corruption…

21:55 <macc24> ???

21:55 <icecream95> mkfs.f2fs

21:55 <macc24> oh yeah this

21:55 <macc24> mkfs.f2fs doesn't like to re-mkfs.f2fs existing f2fs fs

21:56 <macc24> if cadmium build stuff gets to this point, the whole drive is already fucked so ¯\_(ツ)_/¯

21:58 <icecream95> I've been using f2fs root on my RK3288 system for a while, and it isn't *too* unstable, except that compression doesn't handle OOM well

21:59 <macc24> i (hopefully) disabled compression in cadmium kernel for both devices

22:10 <tlwoerner> icecream95: robmur01: thanks, upgrading to mesa-20.3.1 lets kmscube work again

22:11 alyssa has quit [Remote host closed the connection]

22:12 alyssa has joined #panfrost

22:14 <alyssa> icecream95: I take it you've procured a duet? :p

22:16 <tlwoerner> wow, i've never seen glmark2's [terrain] run so well. but there seems to be a problem with the last 3 tests [conditionals], [loop], and [function]

22:17 <alyssa> all four are broken upstream

22:19 <alyssa> ( https://github.com/glmark2/glmark2/pull/132 )

22:19 <icecream95> alyssa: Why would you think that?

22:27 tgall_foo has quit [Quit: My iMac has gone to sleep. ZZZzzz…]

22:33 <alyssa> icecream95: https://gitlab.freedesktop.org/icecream95/panfrost-perfcnt/-/commits/master

22:34 <alyssa> ---Also in the last seven minutes, I guess https://gitlab.freedesktop.org/icecream95/mesa/-/commits/aniso is a tell :p

22:34 <icecream95> alyssa: Okay, you win: https://gitlab.freedesktop.org/-/snippets/1383

22:34 <macc24> icecream95: have you ever heard about that linux distro for duet?

22:34 <macc24> it's called "cadmium"

22:35 <alyssa> icecream95: snazzy

22:35 * macc24 notices 5.10.0-rc4cadmium

22:38 tchebb has joined #panfrost

22:40 <alyssa> icecream95: also, if you want to move panfrost-perfcnt (etc) to gitlab.fd.o/panfrost, I can give you permissions

22:40 tgall_foo has joined #panfrost

22:40 <alyssa> (if you'd prefer to keep on your userspace, that's totally cool as well, just thought I'd mention it)

22:40 <alyssa> shorter slugs :p

22:41 * macc24 has a monopoly on linux for duet >:D

22:42 <alyssa> I still haven't got one, but with the number of times my life has been upended this year because of COVID, maybe that's for the better..

22:42 <icecream95> macc24: Especially as both times I tried building a kernel myself, it didn't boot

22:42 <macc24> icecream95: mainline kernel doesn't have dsi support for mt8183

22:43 <icecream95> macc24: That was compiling Cadmium with no changes

22:43 <macc24> ._.

22:43 <icecream95> Maybe it's because Void is still on GCC 9.3 - I

22:44 <macc24> i test on debian sid, notabug wontfix :>

22:44 <tchebb> Hi all! I'm running Panfrost on an RK3399 with Linux 5.10, and I've been seeing the following crashes, which result in all display output freezing, when under memory pressure: https://gist.github.com/tchebb/20932f46eacbef7e5847969a538ca1d8. Is this a known issue? The clients are Sway with Firefox and Alacritty, if that matters.

22:45 <icecream95> tchebb: What Mesa version are you using?

22:45 <macc24> icecream95: do you want option for void linux rootfs in cadmium?

22:45 <icecream95> macc24: I

22:45 <tchebb> icecream95: 20.3.0. The Mesa is an unmodified Arch Linux ARM build, the kernel I've built myself

22:49 raster has quit [Quit: Gettin' stinky!]

22:51 <icecream95> tchebb: Are you sure you aren't running any other versions of Mesa?

22:51 <macc24> icecream95: ill take that as a 'yes'

22:53 <tchebb> icecream95: Unless my package manager is lying to me, yes I'm sure. Checking pacman.log on this device, Mesa 20.3.0 was initially installed on the 12th when I first brought up the system and has not been upgraded or downgraded since. I don't even have the Mesa source tree cloned, so there's no chance I'm accidentally using a version other than the system one.

22:54 <tchebb> I have reproduced this issue 4 separate times, all while using Firefox, all with basically identical backtraces.

22:57 <icecream95> tchebb: Weird... Those messages seem very similar to https://gitlab.freedesktop.org/mesa/mesa/-/issues/3038, which was fixed months ago, long before 20.3

22:57 <macc24> tchebb: how to reproduce?

23:09 <alyssa> Maybe I should upgrade Mesa for myself finally.. :P

23:10 <tchebb> icecream95: yeah, I found https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aed44cbeae2b7674cd155ba5cc6506aafe46a94e and https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e0cf7e9936c4358b0863357b90aa12afe6489da, which both appear to fix similar intermittent issues, but unfortunately I have both those commits in my kernel already

23:11 <tchebb> macc24: just browsing in Firefox usually triggers it within 30 minutes or so. Once the "Purging ..." messages start showing up in dmesg, it's just a matter of time.

23:12 <tchebb> I'm not familiar enough with the driver to know what starts those, but I assume it's related to overall system memory pressure

23:14 <alyssa> That is correct

23:16 <tchebb> Do you think there could be a race with the gpu_usecount check introduced by https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e0cf7e9936c4358b0863357b90aa12afe6489da (meaning it doesn't fix the issue completely)? The atomic_read() doesn't happen under any kind of mutex, which smells a bit fishy since it seems like there's nothing stopping it from passing when the purge starts but then failing by

23:16 <tchebb> the time the page actually gets freed. But maybe the overall state machine of the BO doesn't allow that to happen.
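The interleaving tchebb is worried about can be written out as a toy model. This is not the kernel code: the class and function names are hypothetical, and the steps are run in a fixed order rather than on real threads so the outcome is deterministic. Whether the real BO state machine rules this ordering out is exactly the open question.

```python
# Toy model of a check-then-act race; all names here are hypothetical.
class BufferObject:
    def __init__(self):
        self.gpu_usecount = 0      # stands in for the atomic use counter
        self.pages_present = True  # stands in for the backing pages

def purge_check(bo):
    # Step 1 of a purge: proceed only if no GPU job is using the buffer.
    return bo.gpu_usecount == 0

def purge_free(bo):
    # Step 2 of a purge: drop the backing pages.
    bo.pages_present = False

def job_start(bo):
    # A GPU job takes a reference on the buffer.
    bo.gpu_usecount += 1

bo = BufferObject()
ok_to_purge = purge_check(bo)  # check passes, use count is still 0
job_start(bo)                  # a job grabs the buffer between check and free
if ok_to_purge:
    purge_free(bo)             # pages disappear under the running job

freed_while_in_use = bo.gpu_usecount > 0 and not bo.pages_present
print(freed_while_in_use)
```

If the check and the free happened under one lock that job submission also took, the middle step could not slip in between them.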

23:24 <alyssa> tchebb: Now you know why I stick to userspace :'(

23:30 <macc24> tchebb: ill try to reproduce tomorrow

23:30 <tchebb> macc24: Thanks! I'll also keep reading the code myself and see if I can figure out what's going on.

23:38 <alyssa> Arguably Mesa is badly behaved (my fault) but that shouldn't be able to take down the machine (kernel's fault)

23:43 <tchebb> Yeah, that was my thinking too. The failure seems pretty clear (page getting freed while something still has it mapped), so a fix should just be a matter of tracing the kernel code to figure out where that mapping gets created.

23:43 <tchebb> I don't understand Mesa at all, but I am pretty good at tracing kernel code :P

23:43 * macc24 noted

23:44 <alyssa> synergy.jpg

23:51 archetech has joined #panfrost