<guillaume_g>
it seems to fail a panfrost_gpu_soft_reset
<alyssa>
Although that does seem to be a legitimate bug in the kernel
<alyssa>
robher: ^^
<guillaume_g>
alyssa: ok. What is missing?
<alyssa>
tomeu: SET_VALUE jobs reset the polygon list; TILER jobs add to the polygon list; FRAGMENT jobs read the polygon list
<alyssa>
tomeu: So if you have no draws, there is no TILER but also no SET_VALUE.
<guillaume_g>
maybe my DTB fragment is wrong.
<alyssa>
tomeu: .....Meaning your fragment-only frame will actually end up redrawing whatever you drew last frame!
<tomeu>
alyssa: ah, but the CPU writes to it :)
<alyssa>
tomeu: So the blob's apparent solution is to have the actual per-FBO polygon list, and also a dummy empty one they keep. They switch them out depending on whether there are draws
<alyssa>
tomeu: Well, it's possible that on T720, the internal polygon list structures were invalid if zeroed out
<tomeu>
yeah, that's how it looks
<alyssa>
But with a field or two in the header set, they became valid but still empty
<alyssa>
Regardless, you really do need the tiler_dummy
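The polygon list behaviour described above can be modelled with a toy simulation. This is only an illustrative sketch in Python, not driver code: the names `PolygonList`, `run_frame` and `dummy_list` are hypothetical, and the real hardware structures are far more involved. It shows why a fragment-only frame would re-read the previous frame's polygons unless an empty dummy list is swapped in.

```python
class PolygonList:
    """Toy stand-in for a tiler polygon list (hypothetical model)."""

    def __init__(self):
        self.polygons = []


def run_frame(fbo_list, draws, dummy_list=None):
    """Simulate one frame's job chain and return what FRAGMENT reads.

    SET_VALUE resets the list, TILER adds to it, FRAGMENT reads it.
    With no draws, neither SET_VALUE nor TILER is emitted.
    """
    if draws:
        fbo_list.polygons = []            # SET_VALUE job: reset the list
        for draw in draws:
            fbo_list.polygons.append(draw)  # TILER job: add to the list
        source = fbo_list
    else:
        # No draws: use the always-empty dummy list if one is provided,
        # otherwise FRAGMENT ends up reading the stale per-FBO list.
        source = dummy_list if dummy_list is not None else fbo_list
    return list(source.polygons)          # FRAGMENT job: read the list


fbo = PolygonList()
dummy = PolygonList()

frame1 = run_frame(fbo, ["triangle"])
stale = run_frame(fbo, [])                    # previous frame's polygons leak in
fixed = run_frame(fbo, [], dummy_list=dummy)  # dummy list: nothing is redrawn
print(frame1, stale, fixed)
```

The second call reproduces the redraw problem; the third shows the effect of switching to the dummy list when there are no draws.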
<tomeu>
ah, haven't looked in the header
<robher>
guillaume_g, alyssa: need to look at kbase and see if there are any reset related errata for t604. I don't recall any, but there's lots for t604 in snow. Reset needs to work well on it because my understanding is that h/w has to be reset several times a second.
<alyssa>
robher: That's terrifying.
<tomeu>
well, there's a bunch of microseconds in a second :p
<robher>
Can we just buy everyone that asks about snow a new chromebook...
<tomeu>
works for me, I'm not looking forward to debugging panfrost on t6xx with 64-bit descriptors...
<alyssa>
Juno :o
<HdkR>
Someone still has a juno?
<HdkR>
Madness
davidlt has quit [Remote host closed the connection]
<tomeu>
hmm, not sure there's a 64-bit DDK for Juno
davidlt has joined #panfrost
<alyssa>
tomeu: I don't see the difference with depth/stencil? The defaults between Gallium and libmali are probably different without going out of spec
<alyssa>
tomeu: I thought Juno *is* 64-bit
<tomeu>
alyssa: yeah, I think you can ignore all my suggestions before
<tomeu>
I'm really out of ideas :)
<alyssa>
tomeu: I'm looking
<alyssa>
First weirdness:
<alyssa>
Why on *earth* is the blob allocating the polygon list as executable?
<alyssa>
vt_sfbd.clear_flags differ
<alyssa>
workgroups_z_shift differs but I doubt that affects anything
<alyssa>
tomeu: vertex payload's gl_enables differ <------ this one probably matters
<guillaume_g>
alyssa, robher: adding OPP, I do not have a crash anymore, but still external aborts: https://pastebin.com/kGnq1L3d
chewitt has quit [Quit: Zzz..]
<tomeu>
I seem to have broken CI and have no idea how
<robher>
guillaume_g: Looks to me like there's a problem accessing the GPU registers. Probably something else needs to be enabled and that could be exynos specific. There were some patches on the list for exynos support on panfrost. Do you have those?
chewitt has joined #panfrost
chewitt has quit [Client Quit]
hanetzer has quit [Changing host]
hanetzer has joined #panfrost
<guillaume_g>
robher: no, I am using kernel 5.2.0 atm.
<guillaume_g>
robher: dou you have a link, please?
<guillaume_g>
*do
<tomeu>
alyssa: well, the mir_foreach_instr_in_block_safe patch seems to have broken everything
chewitt has joined #panfrost
<robher>
guillaume_g: "[PATCH v2 3/7] arm64: dts: exynos: Add GPU/Mali T760 node to Exynos5433", but I don't see anything extra needed on other chips (there's no 5250 support in the series though).
<robher>
guillaume_g: maybe one of the exynos folks can help. It's going to be some clock, regulator, or power domain most likely.
<robher>
guillaume_g: Or you just have the wrong address.
<chewitt>
and it still shows the same "panfrost: probe of d00c0000.gpu failed with error -12"
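Kernel probe failures report a negated errno value, so the -12 in that message corresponds to ENOMEM on Linux. A minimal sketch for decoding such codes follows; the helper name `decode_kernel_error` is hypothetical, and the lookup uses the host's errno table, which is assumed to match the target kernel's for common codes. It only names the error and does not say which allocation or mapping failed.

```python
import errno
import os


def decode_kernel_error(code):
    """Map a negative kernel return code such as -12 to its errno name."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


name, message = decode_kernel_error(-12)
print(name, "-", message)
```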
<guillaume_g>
robher: I will switch to the arndale board, at least I will have a serial to debug things.
<chewitt>
tomeu: narmstrong: so I have to conclude it's something external to panfrost (something else in the kernel changed)
<chewitt>
one last thing to try is an aarch64 image
<chewitt>
as we normally build 'arm'
<tomeu>
ah, guess that could be it
<tomeu>
alyssa: guess I will have to revert it in the meantime
<guillaume_g>
robher: adding pd_g3d to power-domains makes the system freeze :(
<robher>
guillaume_g: perhaps the PD is on already, and then on probe failure it gets turned off.
raster has quit [Remote host closed the connection]
<guillaume_g>
robher: and it could freeze the board?
<robher>
guillaume_g: certainly if something else is relying on the PD to be default enabled.
<guillaume_g>
robher: ok. The last line I have before the freeze is "panfrost 11800000.gpu: clock rate = 200000000"
<robher>
guillaume_g: I think also there's some issues in the clean-up error paths in the panfrost driver interacting with runtime-pm. I'm not certain though.
<chewitt>
tomeu: narmstrong: for kicks I fully reverted panfrost and then cherry-picked the commit that added panfrost to our 5.1 kernel sources .. and this works
<chewitt>
so i've exported the two commits as patch files .. now doing 'diff -y' to compare them
<guillaume_g>
robher: it seems that upstream dts fragments have only one clock, whereas my downstream reference code has multiple clocks
<guillaume_g>
robher: it seems I am missing some configuration for the main clock :(
<chewitt>
tomeu: narmstrong: robher: this is the (inverse) diff between the "add panfrost" 5.1 kernel patch and the initial 5.2 commit merged in mainline
<chewitt>
could be my bad copy/pasting .. but something there is why T820 (S912) stopped working after the 5.2 bump
<robher>
chewitt: there was some issue with 32-bit GPU VA being rejected by io-pgtable code. I'm not sure if Robin ever got a fix in.
<chewitt>
we run a split 64/32 arrangement so kernel is 64-bot
<chewitt>
s/bot/bit
<alyssa>
tomeu: Hrm
<chewitt>
alyssa: with the patch hacking .. I still see the black text problem I described before
<guillaume_g>
robher: I got it "working". I mean the driver does not crash on boot. ;) https://pastebin.com/w9F6KKYt My problem was wrong clock + missing power-domain
<guillaume_g>
robher: I hope what is printed makes sense
gtucker has quit [Ping timeout: 252 seconds]
tomeu has quit [Ping timeout: 252 seconds]
<alyssa>
chewitt: :/
<alyssa>
chewitt: Could you send the output with "MIDGARD_MESA_DEBUG=shaders" set?
<alyssa>
And if I can't figure it out from that, maybe even with "PAN_MESA_DEBUG=trace" set (but it will be very large, so quit Kodi as soon as the black text appears)
<alyssa>
It's just odd given that it worked fine before and our conformance numbers have been steadily _increasing_
<calcprogrammer1>
I flashed a fresh install of Debian (unofficial arm64 this time) to my Rock Pi 4 and then built kernel, libdrm, and mesa as I was before. Can't confirm yet but it appears to be working now, no dmesg errors when running lightdm or kmscube. I'm remoted in right now so can't see screen. Are there any known issues with running a 32-bit distribution with Panfrost on arm64 hardware (rk3399)? Radxa's official Debian build for the Rock Pi 4 is a 32-bit Debian armhf with an aarch64 kernel.
<alyssa>
calcprogrammer1: 32-on-64 is rather broken right now, but we're working on it!
<alyssa>
LibreELEC has a workaround but I don't recommend it at this time
stikonas has joined #panfrost
stikonas has quit [Read error: Connection reset by peer]
stikonas has joined #panfrost
davidlt has quit [Remote host closed the connection]
davidlt has joined #panfrost
BenG83 has joined #panfrost
davidlt has quit [Ping timeout: 244 seconds]
MistahDarcy has quit [Quit: Leaving]
jcureton has joined #panfrost
guillaume_g has quit [Quit: Konversation terminated!]
<jcureton>
is there anything being done toward a platform quirks framework? i know people have added some vendor-specific compat flags but i haven't seen any patches toward building it
<jcureton>
^ within the kernel drm driver
jcureton has quit [Remote host closed the connection]
stikonas has quit [Remote host closed the connection]
jcureton has joined #panfrost
<robher>
jcureton: didn't know we needed one.
<jcureton>
robher: i'm trying to figure out if there's generically a need for one. i'm working on getting a T720 running as a backport on an SoC that definitely needs to handle some platform quirks. i've seen some conversation around amlogic devices also having some oddities
<robher>
jcureton: We need to see what the changes needed are first, then we can decide if a 'framework' is needed. Sounds like overkill is my first thought.
<jcureton>
the above is a simple one, my requirement is a bit broader requiring poking quite a few registers on my SoC outside of the GPU address space.
<jcureton>
no issues maintaining mine in my tree, but if there's a wider need for dealing with platform-specifics i can try to make it upstreamable
<chewitt>
alyssa: output from MIDGARD_MESA_DEBUG=shaders => http://ix.io/1Oca
<chewitt>
let me know if you need the other output
<chewitt>
partial output from PAN_MESA_DEBUG=trace => http://ix.io/1Oce
<chewitt>
partial because it overflows the weeny journal buffer before I can login to stop kodi
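One way around a small journal buffer is to send the debug output straight to a file and stop the process after a fixed time. The following is a rough sketch under assumptions: the helper `run_with_debug` is hypothetical, and it assumes the trace is written to the process's stdout or stderr. A child Python process stands in for the real application here, since the exact Kodi launch command is not given in the log.

```python
import os
import subprocess
import sys
import tempfile


def run_with_debug(cmd, debug_env, log_path, timeout_s):
    """Run cmd with extra environment variables, logging output to a file.

    The process is killed after timeout_s seconds so the log stays bounded.
    Returns True if the timeout was hit, False if the process exited first.
    """
    env = dict(os.environ, **debug_env)
    with open(log_path, "wb") as log:
        proc = subprocess.Popen(
            cmd, env=env, stdout=log, stderr=subprocess.STDOUT
        )
        try:
            proc.wait(timeout=timeout_s)
            return False
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
            return True


# Demonstration only: the child process echoes the variable it was given.
log_path = os.path.join(tempfile.gettempdir(), "trace_demo.log")
demo_cmd = [sys.executable, "-c", "import os; print(os.environ['PAN_MESA_DEBUG'])"]
timed_out = run_with_debug(demo_cmd, {"PAN_MESA_DEBUG": "trace"}, log_path, 30)
with open(log_path) as f:
    captured = f.read().strip()
print(timed_out, captured)
```

On a real system, `demo_cmd` would be replaced by whatever command launches Kodi, with the timeout chosen to cover the moment the black text appears.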
<alyssa>
chewitt: *eyes*
<chewitt>
anything useful there ^ ?
<alyssa>
chewitt: It'll take me a bit to chew through so maybe?!
<chewitt>
happy hunting :)
<alyssa>
:)
<alyssa>
Here's sth
<alyssa>
shift/extra_flags getting set for LINEAR
<alyssa>
Probably not the issue but it's semantically nonsense and should be fixed
<alyssa>
(Set to ~0)
<alyssa>
for both attrs and varyings..
<alyssa>
chewitt: I'm just mighty confused given our conformance status and Kodi is a very normal GLES app
<alyssa>
chewitt: Ohhhhhhh I also did some work on blending, I wonder if that's messing with something
<alyssa>
That said, the blend mode used in that log is the same on my local Kodi which works fine so
<alyssa>
chewitt: Question: Is this a 32-bit build with the LE Panfrost patch or honest-to-goodness 64-bit?
<chewitt>
64-bit kernel and 32-bit userspace
<chewitt>
standard LE config
<alyssa>
Alright, that helps
<alyssa>
(Well, it doesn't help the problem, but it might help me narrow the problem :P)
<alyssa>
I'm guessing 32-bit support is buggy in some way, trying to think the easiest way to debug this
<alyssa>
Let me check if CI keeps interesting logs
<chewitt>
I can make an aarch64 image and test again .. but that will be overnight
<alyssa>
If it's not too much trouble, that would certainly help narrow down the issue :)
<alyssa>
But I know of a few outstanding 32-bit-related issues which I can bump the prio and look into now :)
<alyssa>
Best case, it solves your bug
<alyssa>
Worst case, it solves a bug you didn't know you had :p
<chewitt>
sounds like a good idea
BenG83 has quit [Quit: Leaving]
<alyssa>
chewitt: I'm collecting possible fixes in tomeu/fix32
<alyssa>
Er, tomeu/mesa branch fix32
<alyssa>
There's a small chance that branch will magically work better