buzzmarshall has quit [Remote host closed the connection]
kinkinkijkin has joined #panfrost
Green has joined #panfrost
Green has quit [Quit: Ping timeout (120 seconds)]
Green has joined #panfrost
vstehle has joined #panfrost
rcf has quit [Quit: WeeChat 2.7]
rcf has joined #panfrost
rcf has quit [Client Quit]
rcf has joined #panfrost
Elpaulo has quit [Quit: Elpaulo]
nerdboy has joined #panfrost
icecream95 has quit [Quit: leaving]
kinkinkijkin has quit [Remote host closed the connection]
NeuroScr has quit [Ping timeout: 240 seconds]
NeuroScr has joined #panfrost
yann has joined #panfrost
yann has quit [Ping timeout: 272 seconds]
NeuroScr has quit [Read error: Connection reset by peer]
NeuroScr has joined #panfrost
raster has joined #panfrost
<la-s>
alyssa: how would I debug bad GPU performance? I am using sway now, and though it works mostly great (the background has some graphical glitches), the performance is still not great, just as with weston.
<la-s>
was thinking of trying to fix it myself
nerdboy has quit [Ping timeout: 264 seconds]
<tomeu>
la-s: first step is figuring out if the bottleneck is cpu or gpu
<la-s>
good point yeah
<la-s>
should figure out how to profile sway
adjtm has joined #panfrost
Green has quit [Ping timeout: 256 seconds]
adjtm_ has quit [Ping timeout: 256 seconds]
Green has joined #panfrost
<tomeu>
well, if it's gpu, then you can look at performance counters to figure out why
<tomeu>
but if it's cpu, then something like perf top could give an indication quite quickly
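A minimal sketch of that cpu-vs-gpu triage, assuming a sway or weston session is already running; the perf invocations are standard, but the devfreq path uses a wildcard because the GPU node name is board-specific:

    # CPU side: see which functions are hot while the compositor is busy
    sudo perf top -g

    # or sample just the compositor for a few seconds, then inspect
    sudo perf record -g -p "$(pidof sway)" -- sleep 10
    sudo perf report

    # GPU side, as a rough proxy: is the GPU pinned at its top frequency?
    cat /sys/class/devfreq/*.gpu/cur_freq
    cat /sys/class/devfreq/*.gpu/available_frequencies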
<alyssa>
robher: I'm seeing some pretty serious regressions in 5.6 (from 5.4)
<alyssa>
Easy reproduction: open weston and run glmark2-es2-wayland -bterrain
<alyssa>
(Or even -bshadow)
<alyssa>
Anything that uses FBOs is hosed.
<alyssa>
Even glmark2-es2-drm -bterrain (w/o a display manager) reproduces.
yann has joined #panfrost
<alyssa>
I've downgraded to 5.4 in the meantime.
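For anyone else trying the reproduction above, a hedged sketch; the benchmark flags are taken from the log, and the dmesg filter is just one convenient way to watch for faults:

    # inside a running weston session
    glmark2-es2-wayland -bterrain
    glmark2-es2-wayland -bshadow

    # or without a compositor, straight against the DRM device
    glmark2-es2-drm -bterrain

    # in another terminal, watch for panfrost job timeouts / faults
    sudo dmesg --follow | grep -i panfrost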
<urjaman>
is that the same thing that i have with 5.7 (rc any) or less severe? (it complains a bunch, fails to reset the gpu, and eventually just kinda hangs the process doing the GPU stuff)
<urjaman>
and yeah i jumped from 5.4 to 5.7rc so it could've been introduced in 5.6 for all i know
<alyssa>
urjaman: not sure, try the above repro (super obvious with weston)
<urjaman>
... i'ma build glmark2 then ...
<alyssa>
urjaman: fair enough :p
<alyssa>
it's a fast build, dw
<urjaman>
yeah more surprised i havent used it before
<urjaman>
i've legit just been super lazy since 5.4 works fine for me :P
<alyssa>
relatable
<urjaman>
umm i'll update mesa too first
<urjaman>
i did a for-comparison test of running -bterrain on weston and got weston crashing after a few seconds, with a "pan_bo.c:176: pan_bucket_index: Assertion 'bucket_index >= MIN_BO_CACHE_BUCKET' failed" in the terminal
<urjaman>
(comparison on 5.4 that is...)
<urjaman>
i assume that's fixed already but like whoops
<urjaman>
good idea to do a control test first :P
<alyssa>
uhhhh
<urjaman>
we'll see after about some 800 objects by this lap warmer of a C201 :P
<urjaman>
yep updated mesa, and this repro runs fine on 5.4, now to reboot into 5.7rcsomething ÖP
<urjaman>
*:P
<alyssa>
Nyoof
<urjaman>
okay interesting ... it flickered white like a handful of times and dmesg shows a bunch of gpu sched timeouts and 2+ faults
<urjaman>
actually two faults and one "There were multiple GPU faults - some have not been reported"
<alyssa>
urjaman: That sounds about right
<alyssa>
I mean wrong but
<urjaman>
and now i need to ssh in to restart this thing because i tried to start my Xorg session (just to confirm it still fails the same way-ish i guess... yup it was laggy and then hung a bit after starting firefox, same as before)
<urjaman>
... i suppose that was pointless since the kernel doesnt manage to reboot from a "reboot -f" in this state
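When the machine wedges like that, capturing the kernel log from another host before it stops responding is the easiest way to keep the evidence; a sketch, with the hostname obviously a placeholder:

    # stream the kernel log from the affected board and keep a copy locally
    ssh user@c201 dmesg --follow | tee panfrost-fault.log

    # or, if the systemd journal is persistent, read it back after the forced reboot
    ssh user@c201 journalctl -k -b -1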
<Lyude>
alyssa: sounds like it's time for a bisect?
<alyssa>
I mean wrong but
<alyssa>
uhm
<alyssa>
silly arrow keys
<alyssa>
Lyude: probably, yeah. though there haven't been many changes, so
<Lyude>
might be a change outside of panfrost maybe
davidlt has quit [Ping timeout: 260 seconds]
<alyssa>
Perhaps
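If it does come down to a bisect, the usual shape of it between those two kernels would be something like the sketch below; the good/bad tags come from the conversation, everything else is generic:

    git bisect start
    git bisect bad v5.6      # first known-bad release
    git bisect good v5.4     # last known-good release
    # build + boot the suggested commit, run the glmark2 repro, then:
    git bisect good          # or: git bisect bad
    # repeat until git names the first bad commit, then:
    git bisect reset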
<robmur01>
hmm, -ENOREPRO here: 5.4-rc7 and glmark2-es2-drm runs all the way through just fine
<bbrezillon>
robmur01: the problem is on 5.6+
<robmur01>
derp, that was supposed to say 5.7-rc4
<robmur01>
been playing with Firefox under GDM with 5.6/5.7-rc with no issue either
<alyssa>
Anyway, I have thousands of conformance fails to fix for fp16 now. tata :p
<urjaman>
my kernel building process isnt really set up for bisecting :/
<urjaman>
i guess i could set something up, but like that sounds like work
<urjaman>
i guess i should check with 5.7-rc4 for completeness too (my last one was rc3)
<urjaman>
but right now upgrading the Arch linux on my C201 (since i realized that was over a month old too)
<alyssa>
Ahhh working in Weston feels so different after being in GNOME for so long
<urjaman>
somehow the situation(TM) feels like time doesnt exist (and isnt really moving) but then suddenly you havent updated your linuces in a month+
<alyssa>
urjaman: I had a terrible nightmare a few days ago where there was a worldwide pandemic
<urjaman>
alyssa: how do you distinguish that from reality tho
<alyssa>
I was asleep.
<urjaman>
ah yeah that bit
<alyssa>
fails.txt is 1519 lines long, wee. but just fixed a bunch
<alyssa>
so down to 1438 :P
<alyssa>
er 1133, one thing fixed a bunch
* alyssa
under 1000 in her to-triage list, this is going faster than expected :~)
<bbrezillon>
alyssa: same as robmur01, works fine here with mesa/master and linux/master (AKA 5.7-rc4)
<alyssa>
bbrezillon: Maybe something was fixed between 5.6.1 and master?
<bbrezillon>
I can test on 5.6.1
<alyssa>
vmlinuz-5.6.0-1-arm64 from debian
<bbrezillon>
ok, so 5.6
<bbrezillon>
alyssa: and I did not test things extensively, just ran glmark2 under weston
<alyssa>
also not sure why I'm not seeing a statistically significant fps difference with fp16 on glmark
<alyssa>
I guess except for -bterrain, register pressure isn't the bottleneck since they're simple enough
<HdkR>
Not bounded by ALU? :)
<alyssa>
HdkR: Well, lower pressure ==> more threads in flight
<HdkR>
Ah right
<alyssa>
But if it's memory bound, well.
<HdkR>
Sounds like we just need more SoCs with >100GB/s memory bandwidth
<robmur01>
Oh FFS... how do we keep forgetting this? :P
<robmur01>
what does -bterrain do? pretty much guarantee running at max OPP
<robmur01>
what landed since 5.4? The generic OPP support that broke voltage scaling :(
<robmur01>
default GPU voltage on my board seems to be nominally 1.0V, so probably close enough to the top OPP's 1.1V to squeak by
<robmur01>
and more than enough for 600MHz and below
<alyssa>
robmur01: sorry? :innocent:
<urjaman>
oh i thought that was something that only applied to some other board
<urjaman>
not to everything
<urjaman>
(like yes i had read about it here but...)
<urjaman>
(also, how many kernel versions you need to fix setting a voltage...............................)
<robmur01>
urjaman: the default voltage (and thus how likely higher OPPs are to go wrong) is somewhat board-dependent
<robmur01>
Chromebooks seem to hurt the most since they have a different regulator setup to most reference-design-based boards
<robmur01>
as far as I've seen, fixing it has turned out to be really quite fiddly thanks to awkward interaction between the regulator and devfreq APIs, and both devfreq and/or explicit regulators being optional from our PoV
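A rough way to check whether a given board is hitting that voltage-scaling problem, assuming debugfs is mounted; the regulator and devfreq node names vary per board, and the OPP debugfs layout can differ between kernel versions:

    # what voltage is the GPU regulator actually at?
    grep . /sys/class/regulator/regulator.*/name
    cat /sys/class/regulator/regulator.*/microvolts

    # which OPPs (frequency/voltage pairs) did the driver register?
    sudo ls /sys/kernel/debug/opp/

    # and which frequency is the GPU currently asked to run at?
    cat /sys/class/devfreq/*.gpu/cur_freq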
<alyssa>
Erg why is this test failing CI but passing local
<robmur01>
"Continuous Instability"
<alyssa>
>:D
<urjaman>
that also applies to my experience with the kernel development process
<alyssa>
Oh, joy - the behaviour changes with gles3 exposed
<alyssa>
Okay, I see the problem. But making that test pass still doesn't fix -bterrain
icecream95 has joined #panfrost
<robmur01>
does `echo 300000000 | sudo tee /sys/class/devfreq/ff9a0000.gpu/max_freq` fix it?
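For reference, a sketch of that experiment end to end; the ff9a0000.gpu path is from robmur01's command and is board-specific, and the value written should be one of the frequencies the driver advertises:

    # pick a low OPP from what the driver actually exposes
    cat /sys/class/devfreq/ff9a0000.gpu/available_frequencies

    # cap the GPU to it and re-run the glmark2 repro
    echo 300000000 | sudo tee /sys/class/devfreq/ff9a0000.gpu/max_freq

    # undo by writing the highest advertised frequency back into max_freq (or reboot)
    cat /sys/class/devfreq/ff9a0000.gpu/max_freq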
<icecream95>
Speaking of things that got broken in the last few kernel releases, the microphone doesn't work anymore on c201 - it tries recording through the speaker instead
<alyssa>
has it worked recently?
<alyssa>
it's been broken on kevin since forever..
<icecream95>
alyssa: I'm pretty sure it was working on 5.3, or at least 5.1
<alyssa>
Neigh
<alyssa>
(have you tried various alsa devices btw?)
<alyssa>
still a bug but maybe a userspace workaround
<icecream95>
I spent a while trying to change stuff in alsamixer, but didn't manage to fix it
<alyssa>
meh
<alyssa>
(also, same here for kevin but I digress)