alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
stikonas has quit [Remote host closed the connection]
<alyssa> Ouch, figured out what the draw_buffers fails are about.. affects all the way back to ES3.0
<alyssa> will fix tomorrow
vstehle has quit [Ping timeout: 252 seconds]
atler has quit [Killed (rothfuss.freenode.net (Nickname regained by services))]
atler has joined #panfrost
* icecream95 is hitting a stack overflow in Firefox
<icecream95> #250 0x0000ffffde35fec8 in panfrost_batch_submit
<alyssa> let me guess, blitter recursion?
<icecream95> Nope, just a *really* long dependency chain
<alyssa> blink
<icecream95> Splitting panfrost_batch_submit so that recursing through dependencies happens in a different function to actually submitting seems to help
kaspter has quit [Ping timeout: 240 seconds]
kaspter has joined #panfrost
jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
jernej has joined #panfrost
jernej has quit [Client Quit]
jernej has joined #panfrost
<icecream95> The function used about 2 KB of stack space, which times 250 levels of recursion comes to 500 KB of stack, but the stack is only 256 KB
<icecream95> ("Shouldn't it have segfaulted long before 250 levels if the stack was that small?" "Uhh...")
<HdkR> icecream95: Automatic stack growing
<icecream95> Answer: I was using the stack space for a different build of Mesa, in the instance where it recursed 250 times it used only 1 KB of stack
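A minimal sketch of the idea discussed above, not the actual Mesa change: walk the dependency chain with an explicit worklist so that chain depth costs heap memory rather than stack frames, then submit each batch once, dependencies first. The `Batch` class and `submit_in_order` are hypothetical names for illustration only; the real driver is C, and the fix icecream95 describes only splits the dependency walk from the submit step.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Batch:
    """Hypothetical stand-in for a driver batch; not the Mesa struct."""
    name: str
    deps: list = field(default_factory=list)
    submitted: bool = False


def submit_in_order(root):
    """Submit root after all of its dependencies, without recursion."""
    order = []
    stack = [(root, False)]  # (batch, have its dependencies been pushed?)
    seen = set()
    while stack:
        batch, expanded = stack.pop()
        if expanded:
            batch.submitted = True  # stands in for the actual submit step
            order.append(batch.name)
            continue
        if id(batch) in seen:
            continue
        seen.add(id(batch))
        stack.append((batch, True))
        for dep in batch.deps:
            stack.append((dep, False))
    return order


# A linear chain far deeper than the 250 levels seen in the backtrace.
chain = Batch("b0")
for i in range(1, 5000):
    chain = Batch(f"b{i}", deps=[chain])
order = submit_in_order(chain)
```

With the figures quoted in the log, 250 frames at roughly 2 KB each is about 500 KB, which does not fit in a 256 KB stack; the worklist version keeps the per-level cost off the call stack entirely.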
kaspter has quit [Ping timeout: 252 seconds]
kaspter has joined #panfrost
kaspter has quit [Ping timeout: 240 seconds]
<icecream95> I don't remember Xorg being this broken before--even glxgears crashes with MALI_BIFROST_TILER_pack: Assertion `values->fb_width >= 1' failed
kaspter has joined #panfrost
kaspter has quit [Excess Flood]
kaspter has joined #panfrost
kaspter has quit [Ping timeout: 240 seconds]
kaspter has joined #panfrost
kaspter has quit [Quit: kaspter]
kaspter has joined #panfrost
davidlt has joined #panfrost
camus has joined #panfrost
camus1 has joined #panfrost
kaspter has quit [Ping timeout: 265 seconds]
camus1 is now known as kaspter
camus has quit [Ping timeout: 252 seconds]
kaspter has quit [Read error: Connection reset by peer]
kaspter has joined #panfrost
kaspter has quit [Ping timeout: 252 seconds]
kaspter has joined #panfrost
<icecream95> The glxgears crashes are caused by one of the commits in cfe9bca9120..9d0ad7fd2e1
<alyssa> are those AFBC
<icecream95> alyssa: No, pan_image stuff
<alyssa> Oh.
* icecream95 sees !10415 fixes a commit in that range
<alyssa> that only affects es31 afaik
<alyssa> (the fix i mean)
<icecream95> All the commits in that range cause Xorg to give DATA_INVALID_FAULTs, making further bisecting difficult
<alyssa> bbrezillon: ^^^^
<icecream95> erm, efcb1e494b7..9d0ad7fd2e1
<icecream95> The bad commit is 9d0ad7fd2e1 ("panfrost: Patch the gallium driver to use pan_image_layout_init()") itself
<alyssa> lovely
<icecream95> If I use 9d0ad7fd2e1 for Xorg and 051d62cf041 for glxgears then it gives BadAlloc errors from X
* icecream95 wonders if it's a good idea to set breakpoints on a running Xorg instance
<alyssa> no.
* icecream95 boots speedy to SSH in and kill Xorg
vstehle has joined #panfrost
WoC has quit [Remote host closed the connection]
WoC has joined #panfrost
<icecream95> It seems the problem is just that pan_image_layout_init is returning false because line_stride & 63 != 0
<icecream95> If I remove that if statement then everything seems to work fine
davidlt has quit [Remote host closed the connection]
davidlt has joined #panfrost
cowsay has joined #panfrost
cowsay_ has quit [Ping timeout: 265 seconds]
<bbrezillon> icecream95: hm, I remember having issues when the stride was not 64B aligned on Bifrost
<bbrezillon> but maybe that was not on linear buffers
<icecream95> bbrezillon: If there are issues for non-aligned strides (I didn't see any on G72) then panfrost_create_scanout_res would have to align the width to make the stride aligned
<bbrezillon> just to be sure, this problem happens when you import a buffer, right?
<icecream95> yes
<bbrezillon> then the align() done here is also problematic => https://gitlab.freedesktop.org/mesa/mesa/-/blob/master/src/panfrost/lib/pan_texture.c#L633
<bbrezillon> and if (explicit_layout->line_stride < stride) might fail too
<bbrezillon> something like that => https://gitlab.freedesktop.org/-/snippets/1912
<icecream95> bbrezillon: That works, except you forgot to remove the old ALIGN_POT
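A sketch of the two checks under discussion, assuming a 64-byte line stride requirement as bbrezillon recalls for Bifrost (the log leaves open whether it applies to linear buffers at all). `align_pot` is the usual power-of-two round-up; `check_imported_stride` and its parameters are hypothetical and are not the pan_image_layout_init() signature.

```python
def align_pot(value, pot):
    """Round value up to a multiple of pot, where pot is a power of two."""
    assert pot > 0 and (pot & (pot - 1)) == 0
    return (value + pot - 1) & ~(pot - 1)


def check_imported_stride(line_stride, width_px, bytes_per_px, require_align=64):
    """Return True if an imported (explicit) line stride is acceptable.

    Hypothetical helper: the stride is compared against the unaligned
    minimum row size, so a tightly packed import is not rejected just
    because an aligned stride would have been larger.
    """
    min_stride = width_px * bytes_per_px
    if line_stride < min_stride:
        return False
    if require_align and (line_stride & (require_align - 1)) != 0:
        return False
    return True


# 300 pixels at 4 bytes each gives a 1200-byte row, not a multiple of 64.
tight = 300 * 4
padded = align_pot(tight, 64)
```

Here `check_imported_stride(tight, 300, 4)` rejects the buffer while `check_imported_stride(tight, 300, 4, require_align=0)` accepts it, which corresponds to the behaviour icecream95 saw after removing the alignment check.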
camus has joined #panfrost
kaspter has quit [Ping timeout: 268 seconds]
camus is now known as kaspter
raster has joined #panfrost
warpme_ has joined #panfrost
stikonas has joined #panfrost
<wicast> hey guys, I'm finally able to run vkcube. But I'm not sure why it stays stationary and waits on a fence that has already finished.
<HdkR> vkcube with...which driver?
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
<wicast> panvk, in weston/wayland
<HdkR> Time to dive in to the source and figure out why it is sleeping on a fence. It isn't upstreamed for a reason you know :)
<bbrezillon> wicast: weird, I don't see that here
<bbrezillon> could it be a KMS fence you're blocked on?
<wicast> strace stays at ioctl(DRM_IOCTL_SYNCOBJ_WAIT...)
<wicast> I've tested with glmark2-es-wayland. Hardware should be good.
<wicast> oh, I get some error from dmesg
<wicast> [ 4323.980226] panfrost ffe40000.gpu: js fault, js=1, status=OUT_OF_MEMORY, head=0x40c3000, tail=0x40c3000
<wicast> [ 4323.984011] panfrost ffe40000.gpu: gpu sched timeout, js=1, config=0x7300, status=0x60, head=0x40c3000, tail=0x40c3000, sched_job=000000001fc9d5eb
<wicast> [ 4324.503657] panfrost ffe40000.gpu: gpu sched timeout, js=0, config=0x7300, status=0x8, head=0x40c33c0, tail=0x40c33c0, sched_job=00000000e9ba870f
<wicast> Same after I reboot(
<wicast> [ 3.982041] panfrost ffe40000.gpu: dev_pm_opp_set_regulators: no regulator (mali) found: -19
<wicast> Is this a known error? I have seen this somewhere else.
<bbrezillon> wicast: which kernel are you using?
<wicast> manjaro-arm
<wicast> 5.11.7-1
<bbrezillon> it should work just fine :-/
<bbrezillon> I mean, I don't see these faults here
<wicast> Just guessing, my sdcard reader is really unstable, could this cause some glitches in the kernel?
<wicast> it can fail to boot, sometimes.
<bbrezillon> nope
<bbrezillon> the kernel is loaded in RAM at boot time
<bbrezillon> wicast: this is a G52, right?
<wicast> yes G52
<wicast> vim3 4G
<wicast> but I can't boot any more after I tried to reinstall the kernel(
<wicast> filesystem corrupted
<wicast> I need to buy an ssd first...
pendingchaos_ is now known as pendingchaos
<shadeslayer> alyssa: bbrezillon Plasma seems to want to disable the blur effect https://invent.kde.org/plasma/kwin/-/merge_requests/883 on panfrost, do you think you have cycles to spare to look into optimizing the effect?
apol has joined #panfrost
<bbrezillon> shadeslayer: not immediately, but maybe you could open an issue in mesa (ideally with a trace we can replay)
<shadeslayer> apol: ^^
<shadeslayer> bbrezillon: trace might be tricky though, let's see
<macc24> i remember issue about blur stuff on kwin
stikonas has quit [Remote host closed the connection]
<macc24> #4687
stikonas has joined #panfrost
<raster> shadeslayer: realistically the blur needs to be isolated into a test of its own
<raster> eg just an app that uses the same algorithm to blur some image
<raster> before i bother with that i'm going to see if plasma works with the mali ddk and if they seem to be comparable in perf (ddk and panfrost)
stikonas has quit [Remote host closed the connection]
<raster> and dang task-kde wants to install 1.6g of stuff... poor emmc...
<tomeu> raster: will be interesting to know, thanks!
<raster> tomeu: i was just re-setting up my dual ddk vs panfrost thing after it bitrotted a bit
<raster> evas in efl also has a blur filter that uses the gpu too - i havent actually profiled it but its far easier to cook up a demo of that :)
<raster> but if i see a big difference between ddk and panfrost it's worth making a dedicated test case
<raster> it's certainly an interesting shader use case thats real-life
<raster> ok... now.. how do i get this blur working?
<tomeu> I think qml is different from weston in that it uses fbos
urjaman has quit [Read error: Connection reset by peer]
urjaman has joined #panfrost
<raster> arg
<raster> this is a bug fest
<raster> cant even use menus properly
<tomeu> I kind of remember others at collabora working on getting plasma to work well with etnaviv or something like that
<tomeu> daniels: do you remember any details on that?
<tomeu> there was also something about qt not supporting wl_dmabuf or so
<daniels> QML, not KWin/Plasma
<raster> settings crashes the desktop...
<raster> hmm
<daniels> but yeah, QML is pretty FBO-happy, copies all over the place
<tomeu> but doesn't plasma use qml?
<raster> but wouldnt you use fbo's anyway to do your initial downscale
<daniels> tomeu: probably
<daniels> yep
<raster> well i would use fbo's for the intermediate buffers - it's kind of a necessity for a blur :)
<raster> :(
<tomeu> maybe panfrost needs to learn some new tricks so that those copies can happen with less bw?
<raster> i cant manage to enable blur... opening setting crashes the desktop... compositor itself seems to stay alive - not sure how they structured it but looks like desktop is a wl client
<raster> uh oh... well i guess this isnt going to work... plasma doesnt start with ddk - just a black screen
<raster> weston and enlightenment are all happy with both ddk and panfrost
<raster> :|
<raster> well let me quick and dirty use my blur alt-tab effect
camus has joined #panfrost
kaspter has quit [Ping timeout: 252 seconds]
camus is now known as kaspter
<daniels> raster: by the QML thing, I don't mean 'FBOs are a problem', I mean 'QML doesn't see through render chains and will insert totally unnecessary intermediate FBOs when you could just draw from A->B instead'
<alyssa> that sounds like a them problem
<raster> daniels: oh... ugh.
cphealy has quit [Remote host closed the connection]
<tomeu> alyssa: well, if we can learn about something that could improve panfrost...
<raster> fan-friggin-tastic
<raster> installing kde involved a libwayland upgrade and this now has segv's inside libwayland-server ... yay
cphealy has joined #panfrost
Elpaulo has joined #panfrost
<alyssa> ...
<raster> wat?
<raster> ==143342== Jump to the invalid address stated on the next line
<raster> ==143342== at 0x0: ???
<raster> ==143342== Address 0x0 is not stack'd, malloc'd or (recently) free'd
<raster> inside wl_signal_emit()
<raster> #0 0x0000000000000000 in ()
<raster> #1 0x0000000000265024 in wl_signal_emit (data=0x43e37610, signal=0x43e37618) at /usr/include/wayland-server-core.h:478
<raster> #2 _e_comp_wl_buffer_cb_destroy (listener=0x43e37628, data=<optimized out>) at ../src/bin/e_comp_wl.c:999
<raster> ughh...
<raster> this is one of those days where to look at thing A i have to fix a chain of bugs over at B ...
<raster> B will need to fix C then D .. then eventually i can go back to A... hooray
kaspter has quit [Quit: kaspter]
<raster> this seems to be a change in libwayland where during client destroy i cant call signals registered...
<raster> or well the state is nulled out.. hmm
<alyssa> this is so broken
<alyssa> so many layers of wrong
<raster> what is broken?
<raster> because something weird is going on inside libwayland now...
<raster> here's the funtimes...
rcf has quit [Quit: WeeChat 3.2-dev]
<raster> the exact same ptr to the same memory struct (wl_listener *)
<raster> in the parent frame notify is a valid ptr
<raster> in the child... it's not.
WoC has quit [Remote host closed the connection]
<raster> the list has only a single node...
<raster> yargh
<alyssa> woof woof
<daniels> raster: are you trying to remove a signal handler from a signal handler? because that’s guaranteed corruption
<daniels> you cannot change the list during a walk
<raster> daniels: actually just calling the signal handler that was already registered
<raster> so the signal handler is just registered in the buffer destroy_signal (a wl_signal)
<raster> that wl_signal is in our own data structs (that is our buffer wrapper which tracks a bunch of things)
<raster> something nulled out the notify
<raster> (notify cb ptr)
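A toy illustration of daniels' point that a listener list must not change while it is being walked. This is a sketch in Python, not libwayland code: the `Signal` class and both emit methods are hypothetical. In Python the symptom is only a skipped listener; in a C intrusive linked list the same mistake can leave the walk holding a dangling pointer.

```python
class Signal:
    """Toy listener list, loosely modelled on the idea of a signal."""

    def __init__(self):
        self.listeners = []

    def add(self, fn):
        self.listeners.append(fn)

    def emit_unsafe(self, data):
        # Walks the live list: a listener that removes an entry shifts
        # the later ones down, and the walk skips one of them.
        called = []
        for fn in self.listeners:
            called.append(fn.__name__)
            fn(self, data)
        return called

    def emit_safe(self, data):
        # Walks a snapshot, so removals during the walk cannot disturb it.
        called = []
        for fn in list(self.listeners):
            called.append(fn.__name__)
            fn(self, data)
        return called


def first(sig, data):
    sig.listeners.remove(first)  # listener removes itself while being called


def second(sig, data):
    pass


unsafe = Signal()
unsafe.add(first)
unsafe.add(second)
unsafe_calls = unsafe.emit_unsafe(None)

safe = Signal()
safe.add(first)
safe.add(second)
safe_calls = safe.emit_safe(None)
```

In the unsafe walk `second` is never called, because removing `first` moved it into the slot the loop had already passed; the snapshot walk calls both.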
rcf has joined #panfrost
stikonas has joined #panfrost
<raster> oh now
<raster> heisenbug
<raster> i now compile libwayland with -O0 and i no longer have crashes.. it's all fine... wtf...
<raster> oh wait
<raster> no - i'm not using my compiled libwayland anymore - using the system pkgs again which first started the crashes...
<raster> argh
<raster> i hate heisenbugs!
<raster> alyssa: bad news - in performance ticket :|
<alyssa> ruh roh
WoC has joined #panfrost
davidlt has quit [Ping timeout: 268 seconds]
macc24 has quit [Ping timeout: 250 seconds]
<daniels> raster: it sounds like your nested structure is invalidating assumptions
macc24 has joined #panfrost
* alyssa forgets how differential equations work
WoC has quit [Remote host closed the connection]
WoC has joined #panfrost
stikonas has quit [Remote host closed the connection]
kherbst has quit [Ping timeout: 260 seconds]
stikonas has joined #panfrost
karolherbst has joined #panfrost
<cphealy> Does Panfrost support YUV render targets with Mali GPUs that support this?
warpme_ has quit [Quit: Connection closed for inactivity]
<raster> daniels: the callback is stored in a list of listeners attached to our datastruct. this is really weird... but now the bug went away magically after i rebooted...
<raster> ¯\_(ツ)_/¯
neonking has quit [Remote host closed the connection]