herbmilleriw has quit [Remote host closed the connection]
_whitelogger has joined #panfrost
davidlt has joined #panfrost
vstehle has joined #panfrost
<tomeu>
alyssa: fdo just allocates a BO per shader, any reason not to do that, now that we have the BO cache?
chewitt has joined #panfrost
<tomeu>
bbrezillon: are you happy now with how the polygon list work is looking?
<bbrezillon>
tomeu: haven't finished yet
<bbrezillon>
but I'm making progress
<EmilKarlson>
bbrezillon: is bootlin aware that linux-5.3 rockchip-drm is perhaps 11x slower compared to 5.2
<EmilKarlson>
x11perf -tilerect500
<EmilKarlson>
Xorg without glamor
<bbrezillon>
EmilKarlson: I don't know (I no longer work for bootlin :))
<EmilKarlson>
ah ok, thanks
<bbrezillon>
EmilKarlson: what's slower?
<EmilKarlson>
I checked the commit log, it was very bootliny
<EmilKarlson>
mostly virtual desktop change on xmonad
<EmilKarlson>
x11perf -tilerect500 gave me the numbers
<EmilKarlson>
window redraw in general
<EmilKarlson>
rxvt-unicode scrolling or whatever seems to do full window updates
<bbrezillon>
you mean panfrost is slower, right?
<EmilKarlson>
no, this is without anyone making any requests to the gpu afaik
<EmilKarlson>
Xorg without glamor, as mentioned
<bbrezillon>
ok
<bbrezillon>
v5.2 vs v5.3-rc1 ?
<EmilKarlson>
yes, latest comparison with 5.2.5 and 5.3-rc2
<EmilKarlson>
I don't strictly have numbers for -rc1 and v5.2, but subjectively measured slowdown was discussed on #linux-rockchip
<bbrezillon>
you're using the emulated fbdev or the KMS interface?
<EmilKarlson>
I believe kms, I can check later, whatever Xorg selects by default on debian buster
<bbrezillon>
just had a quick look at the commit log
<bbrezillon>
and the only commit that could potentially be harmful is 6c83ca795f2c ("drm/rockchip: Use dirtyfb helper")
<bbrezillon>
you can try reverting that one
<EmilKarlson>
thanks, will do, though have to work for a few hours now
<EmilKarlson>
obviously regressions are not strictly limited to inside rockchip-drm
<EmilKarlson>
for rt2x00 I actually reverted the whole driver to fix regression there
<EmilKarlson>
not sure, if that would work for rockchip-drm
<bbrezillon>
EmilKarlson: well, if there's a perf regression, we want to know where it comes from
<bbrezillon>
reverting the driver to its v5.2 state doesn't help
chewitt has quit [Quit: Adios!]
<EmilKarlson>
you tried already
<EmilKarlson>
or I mean testing revert will help exclude other causes
<bbrezillon>
no, I mean it doesn't help us figuring out which commit is causing that
<bbrezillon>
and no, I haven't tried
<tomeu>
EmilKarlson: what about bisecting?
<EmilKarlson>
well it's about the same thing
<EmilKarlson>
but perhaps at some point
<EmilKarlson>
I mean, whatever helps exclude causes
<EmilKarlson>
if hypothesis is that "only commit that could potentially be harmful is 6c83ca795f2c" in rockchip-drm it means either reverting that commit helps, the regression is outside the driver or reverting driver helps, unless there is compatibility issue
<EmilKarlson>
s/or/and/
<EmilKarlson>
and does not help, whatever
_whitelogger has joined #panfrost
<tomeu>
well, if using git-bisect, you would be bisecting the whole kernel
<tomeu>
guess it could be a change in the clock configuration, DDR, devfreq, etc
<EmilKarlson>
true
<EmilKarlson>
but that's a lot of work on the kernel that has more than one regression per system I tested
<tomeu>
but I get a crash just after the last test and I cannot reproduce locally
<tomeu>
trying now with a debug build, so I get a better backtrace
<alyssa>
tomeu: The problem with a BO-per-shader is twofold
<alyssa>
One is that allocating BOs is expensive and the BO cache can't save the upfront cost, lots of overhead
<alyssa>
Two is that executable memory, IIRC, has some funky alignment reqs in the kernel so you'd be wasting memory and/or fragmenting stuff? But maybe that's not too terrible in practice
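[Editor's sketch] The trade-off alyssa describes can be illustrated with a small bump allocator that packs several shaders into one large buffer instead of giving each its own BO. This is only a sketch under stated assumptions, not the Panfrost code: `shader_pool`, `shader_pool_alloc` and `bo_per_shader_waste` are hypothetical names, and the 64-byte alignment and 4096-byte page size used in the usage note are assumed values, since the log does not give the real alignment requirements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one large executable BO shared by many shaders. */
struct shader_pool {
    size_t size;   /* total bytes in the backing buffer */
    size_t offset; /* next free byte */
};

/* Round x up to a power-of-two alignment. */
static size_t align_up(size_t x, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Sub-allocate `bytes` from the pool at the requested alignment.
 * Returns the offset of the shader inside the pool, or SIZE_MAX if it
 * does not fit (a real driver would allocate a fresh pool at that point). */
static size_t shader_pool_alloc(struct shader_pool *pool, size_t bytes,
                                size_t align)
{
    size_t start = align_up(pool->offset, align);

    if (start > pool->size || bytes > pool->size - start)
        return SIZE_MAX;

    pool->offset = start + bytes;
    return start;
}

/* Bytes lost to padding if a shader of `bytes` got its own
 * page-granular BO (page_size is an assumed granularity). */
static size_t bo_per_shader_waste(size_t bytes, size_t page_size)
{
    return align_up(bytes, page_size) - bytes;
}
```

With these assumed numbers, a 100-byte shader in its own 4096-byte BO leaves 3996 bytes unused, while two shaders of 100 and 200 bytes packed at 64-byte alignment end at offset 328 of a single pool. Whether that waste matters in practice is exactly the open question in the discussion.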
<alyssa>
tomeu: Memory usage reduction is from HEAP, yeah?
JaceAlvejetti has quit [Remote host closed the connection]
JaceAlvejetti has joined #panfrost
davidlt has quit [Ping timeout: 246 seconds]
megi has joined #panfrost
<bbrezillon>
alyssa: regarding the ctx->job field, do you think avoiding the job lookup in the hash table makes a huge difference?
<alyssa>
bbrezillon: Huge? No. But it does get called very frequently and hash lookups aren't free. I've seen it show up as taking some nontrivial time in sysprof but certainly not the bottleneck.
<alyssa>
Not going to make or break anything, but might as well get it right
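[Editor's sketch] The idea behind keeping a `ctx->job` field is to remember the most recently used job so that repeated requests for the same key skip the hash lookup. The following is a minimal sketch of that caching pattern, assuming a hypothetical `fb_key` identifying the bound framebuffer and a toy open-addressed table; the real driver's structures and hashing are not shown in the log and will differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16 /* power of two; arbitrary for this sketch */

/* Hypothetical job record keyed on the framebuffer it renders to. */
struct job {
    uint32_t fb_key;
    int in_use;
};

struct context {
    struct job table[TABLE_SIZE]; /* toy open-addressed hash table */
    struct job *job;              /* cached job for the last key used */
    unsigned slow_lookups;        /* counts trips through the hash table */
};

/* Slow path: probe the table, creating an entry for fb_key if missing.
 * Returns NULL only when the table is full. */
static struct job *lookup_slow(struct context *ctx, uint32_t fb_key)
{
    uint32_t h = (fb_key * 2654435761u) & (TABLE_SIZE - 1);

    ctx->slow_lookups++;
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        struct job *j = &ctx->table[(h + i) & (TABLE_SIZE - 1)];

        if (!j->in_use) {
            j->in_use = 1;
            j->fb_key = fb_key;
            return j;
        }
        if (j->fb_key == fb_key)
            return j;
    }
    return NULL;
}

/* Fast path: reuse the cached pointer when the key has not changed. */
static struct job *get_job(struct context *ctx, uint32_t fb_key)
{
    if (ctx->job && ctx->job->fb_key == fb_key)
        return ctx->job;

    ctx->job = lookup_slow(ctx, fb_key);
    return ctx->job;
}
```

Calling `get_job` twice in a row with the same key touches the table once; only a change of key falls back to the slow path. The cost is that the cached pointer must be invalidated or updated whenever the underlying job goes away, which this sketch does not model.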
wens has quit [Ping timeout: 268 seconds]
pH5 has quit [Quit: bye]
<alyssa>
So, working on better uniform allocation
<alyssa>
If I just cap it at 8 registers, it's actually quite a win