alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
tgall_foo has joined #panfrost
raster has quit [Quit: Gettin' stinky!]
archetech has quit [Quit: Konversation terminated!]
stikonas has quit [Remote host closed the connection]
archetech has joined #panfrost
warpme_ has quit [*.net *.split]
leper` has quit [*.net *.split]
cphealy has quit [*.net *.split]
jschwart has quit [*.net *.split]
milkii has quit [*.net *.split]
Stary has quit [*.net *.split]
lvrp16 has quit [*.net *.split]
dschuermann has quit [*.net *.split]
Stenzek has quit [*.net *.split]
mifritscher has quit [*.net *.split]
chrisf has quit [*.net *.split]
enty has joined #panfrost
ente has quit [Read error: Connection reset by peer]
popolon has quit [Quit: WeeChat 2.9]
warpme_ has joined #panfrost
milkii has joined #panfrost
jschwart has joined #panfrost
dschuermann has joined #panfrost
Stary has joined #panfrost
chrisf has joined #panfrost
Stenzek has joined #panfrost
lvrp16 has joined #panfrost
leper` has joined #panfrost
cphealy has joined #panfrost
mifritscher has joined #panfrost
<HdkR> woo
SolidHal has quit [Quit: Ping timeout (120 seconds)]
<anarsoul> alyssa: lima doesn't support mali200
<anarsoul> it supports mali450 though
<anarsoul> we didn't port any quirks for it from libv's driver
<anarsoul> thanks for mentioning lima anyway :)
<alyssa> anarsoul: We accept patches :p
kaspter has joined #panfrost
vstehle has quit [Ping timeout: 264 seconds]
kaspter has quit [Ping timeout: 260 seconds]
kaspter has joined #panfrost
hl has quit [Ping timeout: 256 seconds]
hl has joined #panfrost
archetech has quit [Quit: Konversation terminated!]
nlhowell has joined #panfrost
tgall_foo has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
vstehle has joined #panfrost
davidlt has joined #panfrost
tomboy64 has quit [Ping timeout: 240 seconds]
tomboy64 has joined #panfrost
karolherbst has joined #panfrost
raster has joined #panfrost
<tomeu> alyssa: regarding the INSTR_ENC flakiness, maybe we could catch such exceptions and dump the shader to a file?
<tomeu> then diff it with the same shader when it succeeds
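A minimal sketch of the dump-and-diff idea tomeu describes, not code from Mesa. It assumes the compiled shader is available as a byte string at the point where the fault is caught; the helper names (`dump_shader`, `diff_dumps`) and the `shader-<tag>.bin` file naming are hypothetical.

```python
import difflib
import os
from typing import List


def hexdump_lines(blob: bytes, width: int = 16) -> List[str]:
    """Render bytes as one hex line per `width` bytes, prefixed with the offset."""
    return [
        f"{off:08x}: {blob[off:off + width].hex(' ')}"
        for off in range(0, len(blob), width)
    ]


def dump_shader(blob: bytes, directory: str, tag: str) -> str:
    """Write a shader binary to <directory>/shader-<tag>.bin and return the path."""
    path = os.path.join(directory, f"shader-{tag}.bin")
    with open(path, "wb") as f:
        f.write(blob)
    return path


def diff_dumps(good_path: str, bad_path: str) -> List[str]:
    """Return a unified diff of the hexdumps of two dumped shaders."""
    with open(good_path, "rb") as f:
        good = f.read()
    with open(bad_path, "rb") as f:
        bad = f.read()
    return list(difflib.unified_diff(
        hexdump_lines(good), hexdump_lines(bad),
        fromfile=good_path, tofile=bad_path, lineterm=""))
```

An empty diff between a passing and a failing run would suggest the compiler output is identical and the flakiness comes from somewhere else.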
tgall_foo has joined #panfrost
camus has joined #panfrost
stikonas has joined #panfrost
kaspter has quit [Ping timeout: 240 seconds]
camus is now known as kaspter
jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
jernej has joined #panfrost
dstzd has quit [Quit: ZNC - https://znc.in]
jernej has quit [Client Quit]
jernej has joined #panfrost
kaspter has quit [Ping timeout: 268 seconds]
kaspter has joined #panfrost
dstzd has joined #panfrost
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
<robmur01> unless of course it's a GPU-side caching issue - in that case it might possibly be worth trying to have a GPU job memcpy the shader somewhere else and flush that out properly, so you can then maybe get some idea of what it saw vs. what was supposed to be there
<macc24> alyssa: wooo, documentation
karolherbst has quit [Remote host closed the connection]
Venemo has quit [Quit: ZNC 1.7.3 - https://znc.in]
Venemo has joined #panfrost
Venemo has quit [Quit: ZNC 1.8.2 - https://znc.in]
Venemo has joined #panfrost
kaspter has quit [Quit: kaspter]
<alyssa> robmur01: Purely compiler side changes shouldn't affect that.. I hope
<robmur01> sure, Occam's razor says the first and only thing to do at this point is confirm whether we really are always generating what we think we're generating
<robmur01> I just like to think ahead :)
<alyssa> :)
<alyssa> robmur01: Botched packing.
<alyssa> Slept on it and it was obvious :)
<robmur01> well yeah, it's probably all squashed if you slept on it :P
<alyssa> ?
<urjaman> too early for alyssa? :P
<alyssa> urjaman: little bit :p
<urjaman> (the packing, if you slept on it ... )
<alyssa> ah
<alyssa> Hmmm
<alyssa> Goes away if I force constants with every clause but that's.. wasteful..
<alyssa> oh, that's embarrassing
* alyssa was missing message types on a few instructions
<alyssa> though... still having issues...
kaspter has joined #panfrost
dstzd has quit [Quit: ZNC - https://znc.in]
jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
dstzd has joined #panfrost
dstzd has quit [Client Quit]
dstzd has joined #panfrost
dstzd has quit [Client Quit]
dstzd has joined #panfrost
jernej has joined #panfrost
karolherbst has joined #panfrost
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
<kinkinkijkin> i now have access to a fire tablet 7
<kinkinkijkin> 2019 i believe
<alyssa> 2019.. those were the days
popolon has joined #panfrost
<alyssa> bbrezillon: Can you confirm that dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darrayshadow_vertex is failing even with your fixes?
<alyssa> Trying to make sure this isn't a regression
<macc24> kinkinkijkin: cool
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
<alyssa> daniels: InfrastructureError? https://gitlab.freedesktop.org/alyssa/mesa/-/jobs/6231655
<bbrezillon> alyssa: will do, I just want to finish something on the AFBC series first
<alyssa> 👍
<alyssa> bbrezillon: no rush.. probably will run out of time for the week shortly though.. not sure how we want to handle that with christmas coming up
<daniels> alyssa: line 608 -> 'network gremlins, couldn't pull my kernel'
<alyssa> I just write compilers 🤷
<daniels> -> whack retry (I did)
<macc24> does ping reach the machine consistently?
alyssa has quit [Remote host closed the connection]
alyssa has joined #panfrost
<alyssa> Pass: 15241, Fail: 1069, Crash: 4, UnexpectedPass: 1,
<alyssa> 93%, that's still an A! ;p
<macc24> only 4 crashes!
<alyssa> dEQP-GLES2.functional.uniform_api.random.79 is the unexpected pass
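The 93% figure can be checked from the counts alyssa pasted. This sketch assumes the four categories shown are the whole run (the pasted summary ends in a trailing comma, so further categories may have been cut off) and counts only "Pass" as passing.

```python
# Counts copied from the summary line above.
results = {"Pass": 15241, "Fail": 1069, "Crash": 4, "UnexpectedPass": 1}

total = sum(results.values())
pass_rate = results["Pass"] / total

print(f"{total} tests, {pass_rate:.1%} pass")  # prints "16315 tests, 93.4% pass"
```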
<macc24> what's dEQP?
<HdkR> Flakey network? Nobody has time for that.
<HdkR> Also me, my wifi access point restarts every month automatically
<HdkR> At least that is only two minutes of down time and a few hours of inconsistent performance for non essential devices...
zkrx has quit [Quit: I'm done]
nlhowell has quit [Ping timeout: 246 seconds]
<bbrezillon> alyssa: I'll try to review/test your MR tomorrow (I was planning to do it today, but ended up fixing other AFBC issues)
<alyssa> bbrezillon: definitely not ready for reviewing let alone testing
<alyssa> Just wanted to let people know it's there
<alyssa> handling swizzles well will involve rewriting a lot of the history again
<macc24> who said testing?
<alyssa> bbrezillon: did
<alyssa> macc24: test his AFBC code o:)
<macc24> where
<macc24> bifrost?
<daniels> macc24: ping doesn't reach the machine consistently because it's mostly powered off
kaspter has quit [Quit: kaspter]
nlhowell has joined #panfrost
nlhowell has quit [Client Quit]
<alyssa> how does that work? :o
<alyssa> oh, I guess LAVA controls power
<daniels> indeed
<daniels> because no use booting on a device where someone has terminally wedged the GPU
<daniels> or put it into swap death
<daniels> or anything
nlhowell has joined #panfrost
<alyssa> oh hey I do both of those things on the daily ;p
<macc24> someone (probably) broke kexec on arm64 :(
<daniels> shrug, kexec just adds complexity rather than improves the situation
<macc24> kexec is really useful for me
<macc24> i don't have to dd kernel onto kernel partition and hope that it boots up
<macc24> oh, and kinkinkijkin, don't get attached to your current cadmium installation ;), it's likely that cadmium will get support for updating kernel with multiple partitions to have theoretically unbrickable device
rando25892 has quit [Read error: Connection reset by peer]
<tlwoerner> is there a trick to getting kmscube running on a rock-pi-4?
<macc24> kmscube -d /dev/dri/card1?
<tlwoerner> macc24: device is -D, but same
<macc24> uhh
<macc24> i think i left stove on at home
macc24 has left #panfrost ["WeeChat 2.9"]
<tlwoerner> lol :-)
macc24 has joined #panfrost
karolherbst has quit [Quit: duh 🐧]
<robmur01> tlwoerner: IIRC at some point not too long ago there was some impedance mismatch between Mesa and the Rockchip DRM driver around modifiers and AFBC
<robmur01> it's working fine on my RK3399 with Mesa 20.3 and kernel 5.8
<tlwoerner> robmur01: i'm using mesa 20.2.4 and kernel 5.8.18
<tlwoerner> robmur01: any pointers where i can start looking?
<tlwoerner> robmur01: ahh, sorry, i'll try a more recent mesa (duh!)
<robmur01> although my kmscube no longer prints the "Using modifier ..." line at all :/
<tlwoerner> i'm using the very tip of master of kmscube (e6386d1b99366ea7559438c0d3abd2ae2d6d61ac)
br_ has joined #panfrost
zkrx has joined #panfrost
<br_> robmur01: I think our pmap is mostly compatible with mali MMU, all I need is to change l3 pte a bit (revert AF bit and set permissions in Mali way).
<br_> It looks like mali is reading instructions from the jc provided by the first job
<br_> but then it seems it tries to access VA from another GEM buffer, and I get page fault at level 2
<br_> which is strange because it is mapped
<robmur01> br_: are you starting at level 0?
<br_> yes
<br_> so jc is at 0x3010740
<br_> l0 97b0000 val 934d003
<br_> l1 934d000 val 9553003
<br_> l2 95530c0 val d386d003
<br_> l3 d386d080 val de2300c1
<br_> this works.
<br_> But another address (0x4000000) does not work -> page fault at l2
<br_> l0 97b0000 val 934d003
<br_> l1 934d000 val 9553003
<br_> l2 9553100 val d385e003
<br_> l3 d385e000 val e68000c1
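The two dumps above can be sanity-checked by recomputing the table indices from the virtual addresses. This is a sketch assuming a 4-level layout with 4 KiB pages, 9 bits per level and 8-byte entries (the usual AArch64-style format, which br_ says the pmap mostly shares with the Mali MMU); the function names are made up for illustration.

```python
PAGE_SHIFT = 12       # 4 KiB pages (assumed)
BITS_PER_LEVEL = 9    # 512 entries per table (assumed)
PTE_SIZE = 8          # 64-bit entries (assumed)


def table_indices(va):
    """Return the [l0, l1, l2, l3] table indices for a virtual address."""
    mask = (1 << BITS_PER_LEVEL) - 1
    return [
        (va >> (PAGE_SHIFT + BITS_PER_LEVEL * (3 - level))) & mask
        for level in range(4)
    ]


def entry_offsets(va):
    """Byte offset of the entry inside each level's table."""
    return [idx * PTE_SIZE for idx in table_indices(va)]


def next_table(entry):
    """Physical address of the next-level table named by a table entry."""
    return entry & ~0xFFF


for va in (0x3010740, 0x4000000):
    print(hex(va), [hex(off) for off in entry_offsets(va)])
```

Under those assumptions the offsets come out as 0x0, 0x0, 0xc0, 0x80 for 0x3010740 and 0x0, 0x0, 0x100, 0x0 for 0x4000000, which match the low bits of the dumped entry addresses (95530c0 and d386d080, 9553100 and d385e000). Both software walks look self-consistent, which fits br_'s observation that the faulting address is mapped.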
tlwoerner has quit [Remote host closed the connection]
tlwoerner has joined #panfrost
<alyssa> br_: FWIW, Mali doesn't have a command stream, it has a "job chain", i.e. a linked list of "job descriptor" data structures in GPU memory
<alyssa> jc = job chain
<alyssa> Kernel space doesn't really care, but it's good to keep in mind
<alyssa> In particular the execute bit on the MMU is only supposed to be set for shaders, not job descriptors.
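A toy model of the "linked list of job descriptors" alyssa describes. It is not the real descriptor layout: the `JobDescriptor` fields, the job kinds, the second and third addresses, and the convention that a next pointer of 0 ends the chain are all assumptions made for illustration. Only the head address 0x3010740 comes from the log.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Tuple


@dataclass
class JobDescriptor:
    kind: str      # illustrative label only, e.g. "vertex" or "tiler"
    next_job: int  # GPU address of the next descriptor; 0 ends the chain (assumed)


def walk_chain(memory: Dict[int, JobDescriptor],
               head: int) -> Iterator[Tuple[int, str]]:
    """Follow next pointers from `head`, yielding (address, kind) per job."""
    seen = set()
    addr = head
    while addr != 0:
        if addr in seen:
            raise ValueError(f"cycle in job chain at {addr:#x}")
        seen.add(addr)
        job = memory[addr]
        yield addr, job.kind
        addr = job.next_job


# Hypothetical GPU memory holding three descriptors.
gpu_memory = {
    0x3010740: JobDescriptor("vertex", 0x3010800),
    0x3010800: JobDescriptor("tiler", 0x3010900),
    0x3010900: JobDescriptor("fragment", 0),
}

for addr, kind in walk_chain(gpu_memory, 0x3010740):
    print(f"{addr:#x}: {kind}")
```

The point of the model is that the hardware reads these structures as data, which is why, as noted above, only shader memory should be mapped executable.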
<br_> I have page fault at l2 on write
<robmur01> which fault status code exactly?
<br_> panfrost_mmu_intr: fault status 7c2003c2
<robmur01> ah, so it's Mali "level 2" which actually means level 1, i.e. the exact same PTE at 934d000 :/
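A sketch of how the fault status 7c2003c2 breaks down. The field layout (exception type in bits 0-7, access type in bits 8-9, source ID in bits 16 and up) and the access-type names are assumptions based on how the Linux panfrost driver reports MMU faults, and reading the low bits of the exception code as the level number is a further assumption. The off-by-one between Mali and Arm level numbering is the one robmur01 points out above.

```python
# Assumed encoding of the access-type field.
ACCESS_TYPES = {0: "ATOMIC", 1: "EXECUTE", 2: "READ", 3: "WRITE"}


def decode_fault_status(status):
    """Split a fault status word into its assumed fields."""
    exception_type = status & 0xFF
    mali_level = exception_type & 0x7  # assumed: 0xC2 means "level 2" in Mali terms
    return {
        "exception_type": exception_type,
        "access_type": ACCESS_TYPES[(status >> 8) & 0x3],
        "source_id": status >> 16,
        "mali_level": mali_level,
        "arm_level": mali_level - 1,   # Mali "level 2" is Arm level 1, per the chat
    }


fault = decode_fault_status(0x7C2003C2)
for name, value in fault.items():
    print(name, hex(value) if isinstance(value, int) else value)
```

With these assumptions the word decodes to exception type 0xc2, a WRITE access and source ID 0x7c20, which agrees with br_'s "page fault at l2 on write" and points at the level 1 entry rather than the level 2 one.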
<br_> note that I dumped ptes after page fault received for both addresses
<anarsoul> alyssa: starting with Midgard
<anarsoul> Utgard does have a command stream
<alyssa> anarsoul: I know, br_ isn't working on Utgard though :)
<alyssa> as a general rule any time I say Mali unqualified I mean Midgard or newer..
<anarsoul> right :)
<anarsoul> that's alyssa's Mali :)
rando25892 has joined #panfrost
<alyssa> that's me! :p
<alyssa> it's less confusing than the convention of calling Midgard+ just Midgard
<alyssa> as in "the Midgard driver for G76"
<robmur01> don't forget that unqualified "Mali" also includes display, video and camera :P
<anarsoul> hehe
<anarsoul> well, at least model numbering is not confusing
<anarsoul> for now
* robmur01 wonders if not having a command stream might actually start at the very beginning with Mali-55
<anarsoul> robmur01: well, it didn't have vertex shaders (so no cmd stream there) and I'm not sure about tiler
<robmur01> but I don't feel like going and looking up how fixed-function pipelines work when it's dinnertime... :P
<robmur01> anarsoul: bah, you win! The phrase "command stream" does appear exactly once in the Mali 55 TRM :)
<alyssa> Falanx days?
<anarsoul> it was fixed-function?
<anarsoul> I wonder how it evolved into mali200...
<alyssa> anarsoul: ES1 hardware!
<anarsoul> alyssa: it didn't need to be fixed-function :)
<alyssa> robmur01: Does Arm still sell the DDK for Mali 55? ;P
<robmur01> I'm gonna guess... noooooooo.
<alyssa> Planned obsolescence smh
<alyssa> ;p
<anarsoul> what about utgard?
<anarsoul> is it even supported? :)
<alyssa> anarsoul: in mesa! :p
<anarsoul> it's still REd driver :P
<anarsoul> courtesy of libv and cwabbott :)
<anarsoul> btw, mali470 is not supported by lima
<anarsoul> I grepped the blob and it appears that 470 has some ISA changes
<anarsoul> but who knows, maybe collabora will pick up lima dev if there's enough interest from customers
<anarsoul> s/dev/development
zkrx has quit [Quit: I'm done]
zkrx has joined #panfrost
vstehle has quit [Quit: WeeChat 2.9]
zkrx has quit [Quit: I'm done]
Lyude has quit [Ping timeout: 240 seconds]
zkrx has joined #panfrost
davidlt has quit [Ping timeout: 240 seconds]
Lyude has joined #panfrost
raster has quit [Quit: Gettin' stinky!]
zkrx has quit [Quit: I'm done]
zkrx has joined #panfrost
zkrx has quit [Quit: I'm done]
zkrx has joined #panfrost
raster has joined #panfrost
vstehle has joined #panfrost
icecream95 has joined #panfrost
<tlwoerner> robmur01: whoa, "drm, surfaceless" are no longer available in mesa-20.3.1?
<macc24> icecream95: correction: there are 5 debian-specific lines in cadmium :D
<icecream95> tlwoerner: IIRC it's always enabled now
<tlwoerner> okay, so maybe "auto"?
<tlwoerner> phew
<icecream95> macc24: And as of three hours ago, one line that is liable to cause filesystem corruption…
<macc24> ???
<icecream95> mkfs.f2fs
<macc24> oh yeah this
<macc24> mkfs.f2fs doesn't like to re-mkfs.f2fs existing f2fs fs
<macc24> if cadmium build stuff gets to this point, the whole drive is already fucked so ¯\_(ツ)_/¯
<icecream95> I've been using f2fs root on my RK3288 system for a while, and it isn't *too* unstable, except that compression doesn't handle OOM well
<macc24> i (hopefully) disabled compression in cadmium kernel for both devices
<tlwoerner> icecream95: robmur01: thanks, upgrading to mesa-20.3.1 lets kmscube work again
alyssa has quit [Remote host closed the connection]
alyssa has joined #panfrost
<alyssa> icecream95: I take it you've procured a duet? :p
<tlwoerner> wow, i've never seen glmark2's [terrain] run so well. but there seems to be a problem with the last 3 tests [conditionals], [loop], and [function]
<alyssa> all four are broken upstream
<icecream95> alyssa: Why would you think that?
tgall_foo has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
<alyssa> ---Also in the last seven minutes, I guess https://gitlab.freedesktop.org/icecream95/mesa/-/commits/aniso is a tell :p
<icecream95> alyssa: Okay, you win: https://gitlab.freedesktop.org/-/snippets/1383
<macc24> icecream95: have you ever heard about that linux distro for duet?
<macc24> it's called "cadmium"
<alyssa> icecream95: snazzy
* macc24 notices 5.10.0-rc4cadmium
tchebb has joined #panfrost
<alyssa> icecream95: also, if you want to move panfrost-perfcnt (etc) to gitlab.fd.o/panfrost, I can give you permissions
tgall_foo has joined #panfrost
<alyssa> (if you'd prefer to keep on your userspace, that's totally cool as well, just thought I'd mention it)
<alyssa> shorter slugs :p
* macc24 has a monopoly on linux for duet >:D
<alyssa> I still haven't got one, but with the number of times my life has been upended this year because of COVID, maybe that's for the better..
<icecream95> macc24: Especially as both times I tried building a kernel myself, it didn't boot
<macc24> icecream95: mainline kernel doesn't have dsi support for mt8183
<icecream95> macc24: That was compiling Cadmium with no changes
<macc24> ._.
<icecream95> Maybe it's because Void is still on GCC 9.3 - I
<macc24> i test on debian sid, notabug wontfix :>
<tchebb> Hi all! I'm running Panfrost on an RK3399 with Linux 5.10, and I've been seeing the following crashes, which result in all display output freezing, when under memory pressure: https://gist.github.com/tchebb/20932f46eacbef7e5847969a538ca1d8. Is this a known issue? The clients are Sway with Firefox and Alacritty, if that matters.
<icecream95> tchebb: What Mesa version are you using?
<macc24> icecream95: do you want option for void linux rootfs in cadmium?
<icecream95> macc24: I
<tchebb> icecream95: 20.3.0. The Mesa is an unmodified Arch Linux ARM build, the kernel I've built myself
raster has quit [Quit: Gettin' stinky!]
<icecream95> tchebb: Are you sure you aren't running any other versions of Mesa?
<macc24> icecream95: ill take that as a 'yes'
<tchebb> icecream95: Unless my package manager is lying to me, yes I'm sure. Checking pacman.log on this device, Mesa 20.3.0 was initially installed on the 12th when I first brought up the system and has not been upgraded or downgraded since. I don't even have the Mesa source tree cloned, so there's no chance I'm accidentally using a version other than the system one.
<tchebb> I have reproduced this issue 4 separate times, all while using Firefox, all with basically identical backtraces.
<icecream95> tchebb: Weird... Those messages seem very similar to https://gitlab.freedesktop.org/mesa/mesa/-/issues/3038, which was fixed months ago, long before 20.3
<macc24> tchebb: how to reproduce?
<alyssa> Maybe I should upgrade Mesa for myself finally.. :P
<tchebb> icecream95: yeah, I found https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aed44cbeae2b7674cd155ba5cc6506aafe46a94e and https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e0cf7e9936c4358b0863357b90aa12afe6489da, which both appear to fix similar intermittent issues, but unfortunately I have both those commits in my kernel already
<tchebb> macc24: just browsing in Firefox usually triggers it within 30 minutes or so. Once the "Purging ..." messages start showing up in dmesg, it's just a matter of time.
<tchebb> I'm not familiar enough with the driver to know what starts those, but I assume it's related to overall system memory pressure
<alyssa> That is correct
<tchebb> Do you think there could be a race with the gpu_usecount check introduced by https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e0cf7e9936c4358b0863357b90aa12afe6489da (meaning it doesn't fix the issue completely)? The atomic_read() doesn't happen under any kind of mutex, which smells a bit fishy since it seems like there's nothing stopping it from passing when the purge starts but then failing by the time the page actually gets freed. But maybe the overall state machine of the BO doesn't allow that to happen.
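The pattern tchebb is worried about is a check-then-act race. The sketch below is a deterministic toy model, not the kernel code: the `BO` class, its fields and all function names are hypothetical, and it does not show that the real driver has this race, only what such a race would look like and how holding one lock across the check and the free closes the window.

```python
import threading


class BO:
    """Toy buffer object: a use counter, a 'pages still allocated' flag, a lock."""

    def __init__(self):
        self.gpu_usecount = 0
        self.pages_present = True
        self.lock = threading.Lock()


def submit(bo):
    """A GPU job starts using the buffer (no locking)."""
    bo.gpu_usecount += 1


def purge_racy(bo, between_check_and_free=lambda: None):
    """Check the counter, then free, with a window between the two steps."""
    if bo.gpu_usecount != 0:      # check
        return False
    between_check_and_free()      # another CPU may run here
    bo.pages_present = False      # act: pages are freed
    return True


def submit_locked(bo):
    """Take the lock, refuse purged buffers, then bump the counter."""
    with bo.lock:
        if not bo.pages_present:
            return False
        bo.gpu_usecount += 1
        return True


def purge_locked(bo):
    """Check and free under the same lock, so no submit can slip in between."""
    with bo.lock:
        if bo.gpu_usecount != 0:
            return False
        bo.pages_present = False
        return True


# Force the bad interleaving: a submit lands inside the purge window.
racy = BO()
purged = purge_racy(racy, between_check_and_free=lambda: submit(racy))
print(purged, racy.gpu_usecount, racy.pages_present)  # prints "True 1 False"
```

The final state of the racy object, a buffer in use by the GPU whose pages are already gone, is the kind of state that would match "page getting freed while something still has it mapped". In the locked variant a submit either runs before the purge (which then refuses) or after it (and is refused itself).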
<alyssa> tchebb: Now you know why I stick to userspace :'(
<macc24> tchebb: ill try to reproduce tomorrow
<tchebb> macc24: Thanks! I'll also keep reading the code myself and see if I can figure out what's going on.
<alyssa> Arguably Mesa is badly behaved (my fault) but that shouldn't be able to take down the machine (kernel's fault)
<tchebb> Yeah, that was my thinking too. The failure seems pretty clear (page getting freed while something still has it mapped), so a fix should just be a matter of tracing the kernel code to figure out where that mapping gets created.
<tchebb> I don't understand Mesa at all, but I am pretty good at tracing kernel code :P
* macc24 noted
<alyssa> synergy.jpg
archetech has joined #panfrost