Space_Man has quit [Remote host closed the connection]
stikonas has quit [Remote host closed the connection]
janrinze has quit [Remote host closed the connection]
megi has quit [Ping timeout: 265 seconds]
vstehle has quit [Ping timeout: 265 seconds]
nerdboy has quit [Ping timeout: 268 seconds]
icecream95 has joined #panfrost
<icecream95>
Of the 236 Xscreensaver demos, 20 fail on load_front_face, 4 kill X because of missing LogicOp support, 5 trigger ReadPixels and run like a glacier and 12 have other issues.
nerdboy has joined #panfrost
<icecream95>
The GL3.3 renderer of GZDoom works with MESA_GL_VERSION_OVERRIDE=3.3 MESA_GLSL_VERSION_OVERRIDE=330 PAN_MESA_DEBUG=deqp, but the framerate is almost as bad as with llvmpipe...
icecream95 has quit [Quit: leaving]
buzzmarshall has quit [Remote host closed the connection]
nerdboy has quit [Ping timeout: 268 seconds]
tlwoerner has quit [Excess Flood]
tlwoerner has joined #panfrost
davidlt_ has joined #panfrost
Ke has joined #panfrost
NeuroScr has quit [Quit: NeuroScr]
vstehle has joined #panfrost
icecream95 has joined #panfrost
nlhowell has quit [Ping timeout: 260 seconds]
<icecream95>
alyssa: A useful perf feature is top: perf top -p `pidof supertuxkart`
<icecream95>
If you hit 'z' the statistics will be zeroed after screen updates so you can see the used functions change in realtime
<tomeu>
icecream95: have you looked at why gzdoom is so slow?
nerdboy has joined #panfrost
davidlt_ is now known as davidlt
<icecream95>
tomeu: When the sky is not visible performance doubles, but I still don't think 18fps is very playable
<icecream95>
With the legacy renderer I get around 100fps
<tomeu>
icecream95: yeah, wonder if the bottleneck is cpu or gpu, for example
<tomeu>
and if cpu, what appears in perf top
<icecream95>
Some of the shaders are massive, with around 220 bundles
<icecream95>
CPU isn't bottlenecked, and resolution doesn't affect it much
<icecream95>
validate_cf_node is at 15% during shader compilation
<HdkR>
vertex shader bound then?
<tomeu>
nice, that's a better position to be at then :)
<tomeu>
icecream95: would it be easy for you to get a trace with renderdoc?
guillaume_g has joined #panfrost
<icecream95>
Those were fragment shaders, but there are some vertex shaders at 85 bundles
<HdkR>
Vertex shader load doesn't change with resolution, where pixel shader load does, which makes it seem vertex shader :P
<HdkR>
At least seems that way without any other apparent bottleneck
<anarsoul>
HdkR: mali does viewport transformation in vertex shader
<HdkR>
anarsoul: Does that cause GPU load to significantly change with resolution though?
<anarsoul>
nope
<HdkR>
:D
<Ke>
alyssa: what was the solution for you, I found that with a different kernel config I get this same issue?
<Ke>
on Kevin
<Ke>
ie. The device you inserted does not contain ChromeOS
<icecream95>
Side-by-side 3d doesn't impact fps much, so I don't think it's vertex shader bound
<icecream95>
The Arm ASTC encoder now uses the Apache 2.0 license, so now I can send a PR adding support for all the extra block sizes such as 5x12. :P
yann has joined #panfrost
karolherbst has quit [Ping timeout: 272 seconds]
yann has quit [Ping timeout: 268 seconds]
davidlt has quit [Ping timeout: 240 seconds]
icecream95 has quit [Ping timeout: 265 seconds]
karolherbst has joined #panfrost
raster has joined #panfrost
nlhowell has joined #panfrost
Space_Man has joined #panfrost
karolherbst has quit [Quit: duh 🐧]
karolherbst has joined #panfrost
camus1 has joined #panfrost
kaspter has quit [Ping timeout: 265 seconds]
camus1 is now known as kaspter
davidlt has joined #panfrost
<alyssa>
icecream95: "20 fail on load_front_face, 4 kill X because of missing LogicOp support, 5 trigger ReadPixels and run like a glacier and 12"
<alyssa>
Hmm, okay.
<alyssa>
The load_front_face failures are a bug in NIR as discussed. Not something I have time to work on right now but would be easy enough for someone interested to fix that upstream (and I'd be happy to review etc)
<alyssa>
logic ops.... similar story, except that's a missing Panfrost feature. Again, I'd be happy to help but they're very much legacy and I'm afraid at the moment I'm not able to prioritize that.
<alyssa>
(The trick is to lower them into blend shaders, the same way we lower exotic blend modes. v3d does something very similar.)
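A minimal sketch of the per-pixel arithmetic that a lowered logic op would have to perform, written as plain C rather than as a real blend shader. The `logic_op` enum, its truth-table encoding, and `lower_logic_op()` are hypothetical names for illustration, not Panfrost or v3d code. The point is only that all 16 logic ops reduce to four masked terms over source and destination bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: bit (2*s + d) of the op is the output bit
 * produced when the source bit is s and the destination bit is d. */
enum logic_op {
    LOGIC_OP_CLEAR = 0x0,
    LOGIC_OP_NOR   = 0x1,
    LOGIC_OP_XOR   = 0x6,
    LOGIC_OP_AND   = 0x8,
    LOGIC_OP_NOOP  = 0xA,
    LOGIC_OP_COPY  = 0xC,
    LOGIC_OP_OR    = 0xE,
    LOGIC_OP_SET   = 0xF,
};

/* Evaluate a logic op on packed integer colour values. A blend shader
 * would do the same thing on the framebuffer value it reads back. */
static uint32_t
lower_logic_op(unsigned op, uint32_t src, uint32_t dst)
{
    assert(op <= 0xF);

    uint32_t result = 0;

    if (op & 0x1) result |= ~src & ~dst; /* s = 0, d = 0 */
    if (op & 0x2) result |= ~src &  dst; /* s = 0, d = 1 */
    if (op & 0x4) result |=  src & ~dst; /* s = 1, d = 0 */
    if (op & 0x8) result |=  src &  dst; /* s = 1, d = 1 */

    return result;
}
```

With this encoding the op is known when the blend state is created, so a compiler could emit only the terms whose bits are set instead of branching per pixel.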
<alyssa>
thank you for the perf trick!
<alyssa>
tomeu: At some point we really should revisit perf counters, now that the CPU side bottlenecks aren't so bad on some workloads... IIRC glmark is GPU bound at this point on most tests, I'd be curious where the bottleneck is / if these are just architectural limits.
<alyssa>
Ke: The partition has to be blessed properly
<Ke>
ah, not my issue then
<Ke>
thanks
<alyssa>
why do i have three notebooks on my table and none is the one with my notes
<alyssa>
Regardless, the diagnostic was printing `cgpt show [device]`
<alyssa>
and the success/tries/priority were zero
<alyssa>
which you can fix with `cgpt add -i [foo] .....`
<Ke>
yup, the partitioning scheme was the same for me for the working and non-working case
<alyssa>
Alright, dunno then :-(
<Ke>
as far as I know also the FIT format was the same, difference would be somehow in the kernel image
<alyssa>
icecream95: (ASTC rewrite) woohoo!
nlhowell has quit [Ping timeout: 260 seconds]
megi has joined #panfrost
<alyssa>
Meanwhile texture_subdata MR is broken and it's not at all obvious why
<tomeu>
alyssa: regarding these uncached reads, due to bitfields I wonder if it won't be better to do what we mostly do: stage on the stack then memcpy
<alyssa>
tomeu: That's absolutely better, yes.
<alyssa>
But eliminating some of the high impact bitfields is probably a good idea anyhow.
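A minimal sketch of the "stage on the stack then memcpy" pattern, assuming the destination is an uncached or write-combined CPU mapping of a GPU buffer. `struct example_descriptor` and `emit_descriptor()` are hypothetical and do not match any real Mali descriptor layout. Writing a bitfield compiles to a read-modify-write, so doing it directly on the mapping would read uncached memory, while doing it on a stack copy keeps those reads in cache.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor with bitfields; not a real hardware layout. */
struct example_descriptor {
    uint32_t width   : 16;
    uint32_t height  : 16;
    uint32_t format  : 8;
    uint32_t swizzle : 12;
    uint32_t levels  : 4;
    uint32_t flags   : 8;
    uint64_t payload;
};

static void
emit_descriptor(void *gpu_mapped, uint16_t width, uint16_t height,
                uint8_t format, uint64_t payload)
{
    assert(gpu_mapped != NULL);

    /* Build the descriptor in ordinary cached stack memory. Fields not
     * named here are zero-initialized by the designated initializer. */
    struct example_descriptor staged = {
        .width   = width,
        .height  = height,
        .format  = format,
        .levels  = 1,
        .payload = payload,
    };

    /* A single linear, write-only copy into the (assumed uncached)
     * mapping; nothing is read back from the destination. */
    memcpy(gpu_mapped, &staged, sizeof(staged));
}
```

The cost is that every field has to be listed in the initializer, which is the tedium mentioned in the discussion.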
<tomeu>
as you can see in the cmdstream file in that branch, it's quite tedious to list all fields
<tomeu>
and looks like the hw checks a lot of fields even if they are not used
<alyssa>
Mm
<alyssa>
There's probably a happy medium.
<alyssa>
And it probably will have to be on a case-by-case basis... For the job_descriptor_headers, for instance, I'm quite pleased with the solution in the new scoreboarding MR, I don't think we'll get any better than that
<alyssa>
Unfortunately, as I'm sure you're realizing, the original Gallium cmdstream was *extremely* ad hoc
<alyssa>
You weren't there for the old days, but when we started out, we were just taking pandecode traces as full C structures and trying to replay them against kbase. And when that worked, trying to parametrize things (the trans library). And then importing Gallium headers to do a fake Gallium stub. And then cp'ing that into a copy of softpipe...
<alyssa>
Was really only when you came in that it turned into a real driver ;)
<alyssa>
...awkwardly, quite a bit of that file is still in mesa master in bits and pieces...
<tomeu>
we should try that again! :p
Space_Man has left #panfrost ["Konversation terminated!"]
mixfix41 has left #panfrost [#panfrost]
<alyssa>
:D
yann has joined #panfrost
<alyssa>
I am wondering about this random dirty tracking.
<alyssa>
It's been there since the beginning because 'everyone else did it'
<alyssa>
It really doesn't make sense for Mali though, does it.
<alyssa>
because descriptors.
<alyssa>
panfrost_invalidate_frame is definitely wrong.
<alyssa>
(MRs opened)
yann has quit [Ping timeout: 240 seconds]
<tomeu>
cool!
yann has joined #panfrost
yann has quit [Ping timeout: 240 seconds]
* alyssa
seeing considerable overhead in `el0_svc_common.constprop.0`
<alyssa>
or maybe in `arch_local_irq_enable`
<alyssa>
looks to be panfrost unrelated
<alyssa>
(looking at perf from GNOME, seeing if it can find anything that sysprof couldn't)
<robmur01>
oh, arch_local_irq_enable() is the absolute worst
<alyssa>
robmur01: eh?
<robmur01>
someone needs to optimise the hell out of that
<robmur01>
:P
<alyssa>
?
<alyssa>
yes i live in canada we get it move along
<alyssa>
Next up, I think, will be moving texture emit paths to the CSO. I think they already were there, but then AFBC got half-landed and things got confused.
<robmur01>
(the joke is that we don't have an NMI for sampling, so unmasking IRQs is always a 'hotspot')
<alyssa>
Ah.
* alyssa
should really clean this up a lot more, okay
<thecycoone>
beautiful... either my homeserver or my friend's irc bridge is misconfigured. I'll throw it somewhere else.
<urjaman>
but yeah not exactly surprised, i also get occasional glitches with firefox and some sites ... like weirdly scrolling through pictures in a twitter post will sometimes just flicker weird stuff
<thecycoone>
In lxqt I was hitting worse issues - gray tiles all over the place, and periodic system freeze. I don't have anything left from that though.
<alyssa>
:\
<urjaman>
... that's worse than what i remember seeing ... at least lately
<alyssa>
dmesg?
<alyssa>
Obviously that's some sort of fault, I'd be curious which ones
<alyssa>
if it's page faults, there's also a chance that's new-kernel-related (there was a regression ... I know there were fixes floating around, but I'm not sure if they've hit your kernel or not)
<urjaman>
i think the "don't deallocate BOs" fix just got in in 5.6-rc2
<urjaman>
dunno in what stables it is tho
<alyssa>
Mm, it looks like the released 5.5 has it but not 5.5-rc5, judging by 9afdcd64f2c96f3fcc1a28912987f2e8066aa995
<alyssa>
Wonder when that'll trickle down to debian
<thecycoone>
yeah, the last time it happened it was during an update and I had to restore the filesystem from a backup, so I don't have the messages and I'm a little gun-shy of trying to reproduce.
<thecycoone>
thus the switch to sway
<urjaman>
i noticed because linus mentioned panfrost by name in the 5.6-rc2 announcement :P ... and then by a grep there was one panfrost patch
<alyssa>
urjaman: he knows we exist \o/
<thecycoone>
but... anything for a good cause I guess... let's see
<urjaman>
alyssa: yay i suppose o/ :P
<urjaman>
thecycoone: i think alyssa was asking for a dmesg of the current firefox glitch, at least for a start
<thecycoone>
oh
<thecycoone>
well I have the gray tiles right away. I'll send that
<thecycoone>
and seems to happen in lxqt when firefox and konsole are both open.
<urjaman>
.... that's.... impressively bad
<alyssa>
urjaman: that's a scheduler bug, was fixed in uh
<alyssa>
4af8d5b0645bd96ed71691811e07c01b52af6094
<alyssa>
from Jan 18
<urjaman>
yeah i havent updated mesa in a while
<alyssa>
mesa has gotten a lot better in the past months
<alyssa>
GNOME's now my daily driver for one :)
<thecycoone>
yeah, the gray tiles are fun
<thecycoone>
they dance around a bit
<thecycoone>
go away, come back
<urjaman>
i first parsed that as a command :P
Elpaulo has joined #panfrost
<thecycoone>
:D
<urjaman>
"leave, no wait come back"
<urjaman>
but yeah maybe i should try a mesa update when i'm back on the C201 (now that i'm thinking about it i just left it at the last build (that was good) after that bisect i did a bit over month ago so...)
<alyssa>
Ahhhhhh that's how you do it good!!
<alyssa>
Stick the texture_descript----- no wait nvm
<alyssa>
I thought i had it whoops.
<urjaman>
... what :D
<urjaman>
bummer.
<alyssa>
An idea too clever for my own good, that's what.
<alyssa>
Still, I don't think it's a bad idea to reserve a BO for the texture descriptor
<alyssa>
It's wasteful in memory (since BOs are 4 KB minimum) but then it means 0 draw time overhead
<alyssa>
Of course we could also meet in the middle and reserve it on the heap and then do a memcpy() at runtime. Either's probably fine
<alyssa>
Regardless we can push the entire thing into CSO create time, and then just regenerate in the event that there's a layout switch (which is rare, happening at most once in the lifetime of a resource -- and currently never unless you set PAN_MESA_DEBUG=afbc)
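A minimal sketch of the "bake at CSO create time, memcpy at draw time" idea, with regeneration only when the resource layout has changed. Every name here (`example_resource`, `example_sampler_view`, `example_emit_texture()`, the layout enum) is hypothetical and stands in for the real Gallium and Panfrost structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum example_layout {
    EXAMPLE_LAYOUT_LINEAR,
    EXAMPLE_LAYOUT_TILED,
    EXAMPLE_LAYOUT_AFBC,
};

struct example_resource {
    enum example_layout layout;
    uint32_t width, height;
};

/* Stand-in for the hardware texture descriptor. */
struct example_tex_desc {
    uint32_t width, height, layout;
};

struct example_sampler_view {
    const struct example_resource *rsrc;
    enum example_layout baked_layout; /* layout the descriptor was built for */
    struct example_tex_desc desc;     /* prebuilt descriptor */
    unsigned bake_count;              /* only here to make the sketch testable */
};

static void
example_bake_descriptor(struct example_sampler_view *view)
{
    view->desc.width = view->rsrc->width;
    view->desc.height = view->rsrc->height;
    view->desc.layout = (uint32_t) view->rsrc->layout;
    view->baked_layout = view->rsrc->layout;
    view->bake_count++;
}

/* CSO create time: all the expensive work happens once, here. */
static void
example_create_sampler_view(struct example_sampler_view *view,
                            const struct example_resource *rsrc)
{
    assert(view != NULL && rsrc != NULL);
    view->rsrc = rsrc;
    view->bake_count = 0;
    example_bake_descriptor(view);
}

/* Draw time: one comparison and one memcpy in the common case. */
static void
example_emit_texture(struct example_sampler_view *view, void *dst)
{
    assert(view != NULL && dst != NULL);

    /* Rare path: the resource switched layout after the view was baked. */
    if (view->baked_layout != view->rsrc->layout)
        example_bake_descriptor(view);

    memcpy(dst, &view->desc, sizeof(view->desc));
}
```

The dedicated-BO variant mentioned above would drop even the memcpy, at the price of one minimum-size BO per view.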
karolherbst has quit [Ping timeout: 240 seconds]
karolherbst has joined #panfrost
stikonas has joined #panfrost
yann has quit [Ping timeout: 265 seconds]
<alyssa>
starting to regret my life choices am i refactoring right
<alyssa>
tomeu: I'm moving texture generation code into root panfrost to force a major cleanup in line with the immutable cmdstream push
<alyssa>
I don't doubt that it's worth it but oh my gosh there is so much ad hoc complexity here.
pH5 has quit [Quit: bye]
pH5 has joined #panfrost
<alyssa>
Getting quite close to taming it, though :)
<alyssa>
Basically ending up with a panfrost_new_texture() constructor which builds a mali_texture_descriptor including all the pointers/strides after, all in one
<alyssa>
And just spits back that result
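A rough sketch of what an "all in one" constructor of this shape could look like: a fixed header followed immediately by the per-level pointers and strides, written in a single call. The header fields, the pointer-then-stride payload order, and all `example_*` names are assumptions for illustration; the real `panfrost_new_texture()` and `mali_texture_descriptor` layout are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_LEVELS 16

/* Hypothetical fixed-size header; not the real mali_texture_descriptor. */
struct example_tex_header {
    uint16_t width, height;
    uint32_t levels;
};

struct example_level {
    uint64_t address;
    uint32_t stride;
};

/* Bytes the caller must reserve: header plus two 64-bit words per level. */
static size_t
example_texture_size(unsigned level_count)
{
    return sizeof(struct example_tex_header) +
           (size_t) level_count * 2 * sizeof(uint64_t);
}

/* Write header and trailing payload into `out`; returns bytes written. */
static size_t
example_new_texture(void *out, uint16_t width, uint16_t height,
                    const struct example_level *levels, unsigned level_count)
{
    assert(out != NULL && levels != NULL);
    assert(level_count >= 1 && level_count <= EXAMPLE_MAX_LEVELS);

    struct example_tex_header header = {
        .width = width,
        .height = height,
        .levels = level_count,
    };

    /* Assumed payload order: address, then stride, for each level. */
    uint64_t payload[2 * EXAMPLE_MAX_LEVELS];
    for (unsigned i = 0; i < level_count; ++i) {
        payload[2 * i + 0] = levels[i].address;
        payload[2 * i + 1] = levels[i].stride;
    }

    uint8_t *cursor = out;
    memcpy(cursor, &header, sizeof(header));
    memcpy(cursor + sizeof(header), payload,
           (size_t) level_count * 2 * sizeof(uint64_t));

    return example_texture_size(level_count);
}
```

Because the function takes only plain values and writes only to `out`, it has no dependency on Gallium state, which is the property that living in root panfrost is meant to enforce.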
<alyssa>
and uh that's in root panfrost which forces me to do clean code
<alyssa>
I feel like I'm untangling a massive knot.
<alyssa>
At any rate this will be good for ES3 support
raster has quit [Quit: Gettin' stinky!]
<alyssa>
I'm almost there. I think.
<alyssa>
It compiles... now to actually do the Gallium part.
guillaume_g has quit [Quit: Konversation terminated!]
davidlt has quit [Ping timeout: 265 seconds]
warpme_ has quit [Quit: Connection closed for inactivity]