<bbrezillon>
if I read it correctly, it says panfrost_mmu_fini() is called from panfrost_job_irq_handler(), which never happens
<bbrezillon>
stack corruption, or are there other cases where the backtrace is unreliable?
<stepri01>
well the quality of the backtrace does depend on compiler options - e.g. inlining can cause the backtrace to be confusing at times. However panfrost_mmu_fini should only get called on driver removal. So this is indeed odd!
<bbrezillon>
yep
icecream95 has quit [Ping timeout: 240 seconds]
<stepri01>
although given that panfrost_mmu_fini is a one line function, I suspect panfrost_mmu_fini+0xd4/0x160 is past the end of the function
<stepri01>
so I guess there's something afterwards which isn't annotated in the binary correctly as a function
<stepri01>
objdump on the vmlinux is probably your best bet to find out what
<bbrezillon>
makes sense, I'll try to reproduce locally
chewitt has joined #panfrost
chewitt_ has quit [Read error: Connection reset by peer]
kaspter has quit [Ping timeout: 256 seconds]
kaspter has joined #panfrost
davidlt has quit [Ping timeout: 240 seconds]
stikonas has quit [Remote host closed the connection]
phh has quit [Ping timeout: 265 seconds]
stikonas has joined #panfrost
robmur01 has joined #panfrost
<robmur01>
in a case like that the LR is probably more interesting and useful in terms of clues to how and where the PC went wild
<robmur01>
however it looks a bit dodgy in general that we're apparently taking the job IRQ again while still in the middle of handling a job timeout
<robmur01>
maybe we didn't solve the race conditions as well as we thought? :/
<macc24>
robmur01: what's LR?
<robmur01>
Link Register
<robmur01>
typically, PC shows wacky address A, LR indicates callsite C, you look at C to find it was a call to function B, then try to figure out either what B did wrong, or what broke between B returning and the next call
<robmur01>
what smells most is that panfrost_job_irq_handler() has no stack frame despite the fact that it can't have returned yet
<robmur01>
and that can't be due to inlining since it's the indirect "action->handler()" call
davidlt has joined #panfrost
<robmur01>
so the FP probably got mashed by the same thing that caused a jump off into random code
<robmur01>
I'd be inclined to throw KASAN at it
davidlt has quit [Ping timeout: 240 seconds]
kaspter has quit [Ping timeout: 265 seconds]
kaspter has joined #panfrost
nlhowell has quit [Ping timeout: 256 seconds]
phh has joined #panfrost
<alyssa>
panfrost seems more broken for me than usual rn... :v
<HdkR>
:,<
<macc24>
alyssa: on which device?
<robmur01>
That reminds me, I was seeing some apparently-new glitchiness in FreeCAD the other week, I should see if that's still happening and try to catch it if so
<robmur01>
(IIRC dragging a selection box had a habit of making the entire view go wacky)
karolherbst has quit [Quit: duh 🐧]
karolherbst has joined #panfrost
kaspter has quit [Quit: kaspter]
davidlt has joined #panfrost
<bbrezillon>
alyssa: probably something I broke :-/
<bbrezillon>
any trace/reproducer to share?
phh has quit [Read error: Connection reset by peer]
phh has joined #panfrost
<alyssa>
bbrezillon: I haven't updated mesa on here for a few months :p
<alyssa>
just.. perception ;p
<alyssa>
bbrezillon: weee, passing my first deqp test with new infrastructure... this shouldn't've taken so long :<
<alyssa>
dEQP-GLES2.functional.shaders.operator.binary_operator.div.lowp_int_ivec2_vertex unhappy with me
tomboy64 has quit [Remote host closed the connection]
<alyssa>
bbrezillon: really sorry this is taking so long :(
<bbrezillon>
alyssa: np
<bbrezillon>
I fixed mdg stuff in the meantime :)
<alyssa>
bbrezillon: if I delay any longer you'll probably start a vulkan driver or something 😇
<bbrezillon>
well, I tried...
<bbrezillon>
alyssa: I considered resuming the vk driver, yes :)
<alyssa>
Do, or do not. There is no try.
<alyssa>
;P
<alyssa>
whoops broke log2 lowering
<alyssa>
okay, shaders.operator.* passing
<alyssa>
roadmap then... fix control flow, port over texturing, port over some gles3 stuff I forgot about, fix DCE, add a constant inlining pass, fix misc remaining regressions, and then we should be at parity
kaspter has joined #panfrost
urjaman has quit [Read error: Connection reset by peer]
urjaman has joined #panfrost
tomboy64 has quit [Ping timeout: 240 seconds]
tomboy64 has joined #panfrost
nlhowell has joined #panfrost
kaspter has quit [Quit: kaspter]
camus has joined #panfrost
camus is now known as kaspter
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
tlwoerner has quit [Quit: Leaving]
nlhowell has quit [Ping timeout: 256 seconds]
tlwoerner has joined #panfrost
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
kaspter has quit [Quit: kaspter]
kaspter has joined #panfrost
<alyssa>
control flow done, register spilling partially done
raster has quit [Quit: Gettin' stinky!]
<alyssa>
Guess let's do texturing, will circle back to spilling later
kaspter has quit [Quit: kaspter]
yann has quit [Remote host closed the connection]
<alyssa>
bbrezillon: ugh there is so much TEXC lowering code
popolon has joined #panfrost
* robmur01
tries to be helpful and opens a proper issue instead of just bleating on IRC for once :)
<alyssa>
robmur01: lol
<alyssa>
if it's Midgard, sorry, unsupported too old wontfix
<alyssa>
if it's Bifrost, sorry, too new under active development can't reprioritize issues wontfix
<alyssa>
(and if it's Valhall, we don't have code for that, and if it's Utgard, that's lima's problem not ours)
<alyssa>
🙃
<alyssa>
#triage
<robmur01>
I made the mistake of actually trying to use one of my boards for a real purpose other than just testing if stuff runs :D
<HdkR>
Dog fooding is good ;)
<alyssa>
heh
<robmur01>
TBH now that it can run a usable desktop as well, the RK3399 NanoPC has almost entirely obsoleted the old Core 2 Duo laptop as my main home Linux machine
<anarsoul>
robmur01: it should work mostly fine, pbp has the same soc and works fine for me
<alyssa>
I've been running chromebooks as my main machines since before it was cool ;p
<alyssa>
^^ linux chromebooks
<anarsoul>
(old c2d laptop likely just lacks proper graphics, gma950 was a joke)
<alyssa>
("It was never cool, Alyssa.")
<anarsoul>
cpu-power-wise 13yo c2d is about the same as rk3399
<alyssa>
TBF rk3399 isn't new anymore either
<HdkR>
Exynos 1080 is the new hotness?
<macc24>
well
<alyssa>
macc24: i'm working on it ok! :p
<macc24>
mt8183-based duet almost replaced my desktop
<alyssa>
yup there it is
<macc24>
even if drivers are crashing frequently
<macc24>
robmur01: there are some bugs which can't be fished out by just seeing if it runs ;)
<robmur01>
anarsoul: more than that, NVMe vs 4200RPM makes it no contest ;)
<anarsoul>
robmur01: yeah, definitely
<macc24>
anarsoul: in practical usage, mt8183 probably outperforms my xeon desktop