ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at and - Contact ARM for binary driver support!
<MoeIcenowy> anarsoul: looks like when the PP MMU fault is triggered, the fragment shader has an if control flow
<anarsoul> what's the shader?
<MoeIcenowy> ?
<anarsoul> what's shader source?
<MoeIcenowy> I think you can see it in qapitrace
<MoeIcenowy> because I pointed out which call is failing
<MoeIcenowy> BTW the comments mentioned a file name "st-scroll-view-fade.glsl"
<anarsoul> MoeIcenowy: standalone compiler fails on this shader in NIR validation
<anarsoul> anyway, there's nothing special in it, just some ifs
<MoeIcenowy> anarsoul: why can it fail?!
<anarsoul> MoeIcenowy: standalone compiler? probably someone broke something for standalone
<anarsoul> it fails before it even reaches lima-specific parts
<anarsoul> that's unfortunate but I don't have time to fix it atm
<bshah> btw : re: i am not quite sure if it is the same thing or not : but, QML/Qt have a similar-ish font rendering bug, which seems to be worked around by QT_ENABLE_GLYPH_CACHE_WORKAROUND ... potentially useless for the specific bug though I guess
<bshah> what does mythtv use?
jernej has joined #lima
enunes has joined #lima
<cwabbott> rellla: it's probably different because shader_runner creates a rgba8 surface, which means that each pixel consists of 4 8-bit channels where 0 means "0.0" and 255 means "1.0"
<cwabbott> the blending unit rounds the value you send with gl_FragColor, so the value you see when reading it back will only have 8 bits of precision
<cwabbott> also, the PP uses half-floats which only have 11 bits of precision, which isn't much more
<cwabbott> if I had to guess, it's failing because half floats aren't accurate enough, and implementing distance(a, b) on a scalar as sqrt((a - b)^2) causes extra rounding errors which make the computed difference happen to be closer to 0 when it shouldn't be
<rellla> cwabbott: thanks for the explanation, though i don't understand it all :) i think i have to get the basics about half float and precision first.
<rellla> so the read back value then is also 4 x 8bit?
<cwabbott> it's the 8-bit value divided by 255, so 0 is shown as 0 and 255 as 1.0
<rellla> i understand your last post about the additional rounding errors, but in the posted example the sqrt path is not used, but fabs(fadd(a, -b))
<cwabbott> writing gl_FragColor does the inverse thing, so 1.0 is rounded to 255
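The rgba8 round trip described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic for a single channel: the function names are hypothetical, and the clamp plus round-to-nearest behaviour is an assumption about a typical 8-bit normalized format, not a description of what the Mali blending unit does internally.

```python
def write_channel(value: float) -> int:
    """Clamp to [0.0, 1.0] and round to the nearest 8-bit code (assumed behaviour)."""
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * 255.0 + 0.5)


def read_channel(code: int) -> float:
    """Convert an 8-bit code back to a float in [0.0, 1.0]."""
    return code / 255.0


if __name__ == "__main__":
    # Writing 1.0 stores 255; reading 255 back gives 1.0.
    print(write_channel(1.0), read_channel(255))
    # Any other value comes back with at most 8 bits of precision.
    print(read_channel(write_channel(0.3)))
```

Whatever the shader computes, the value read back is one of only 256 levels per channel, so a test comparing against an exact float needs a tolerance of at least one 8-bit step.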
<cwabbott> I meant to explain why it would pass if you use sqrt
<cwabbott> using abs(a - b) only has one rounding step (the subtraction) but sqrt((a - b)^2) has three (subtract, then square, then sqrt)
<cwabbott> so the second will always be less accurate (in addition to being slower!)
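The difference in rounding steps can be emulated on a CPU. The sketch below assumes IEEE 754 binary16 with round-to-nearest, using Python's `struct` format `'e'` to round after every operation; the real PP arithmetic may differ in detail, and the helper names `to_half`, `dist_abs` and `dist_sqrt` are made up for this example.

```python
import math
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dist_abs(a: float, b: float) -> float:
    """abs(a - b): one rounding step (the subtraction)."""
    return abs(to_half(to_half(a) - to_half(b)))


def dist_sqrt(a: float, b: float) -> float:
    """sqrt((a - b)^2): three rounding steps (subtract, square, sqrt)."""
    d = to_half(to_half(a) - to_half(b))
    sq = to_half(d * d)
    return to_half(math.sqrt(sq))


if __name__ == "__main__":
    a, b = 2.0 ** -13, 0.0
    print(dist_abs(a, b), dist_sqrt(a, b))
```

With a = 2**-13 and b = 0.0, `dist_abs` returns 2**-13, while `dist_sqrt` returns 0.0 because the squared difference is too small to represent in binary16. That is one extreme instance of the computed difference ending up closer to 0 than it should be.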
<rellla> ah ok. so i should try to lower that abs to sqrt also?
<cwabbott> no, it's just not something you can solve
<cwabbott> the test is written expecting that the GPU is using regular 32-bit floats
<rellla> so it's just that piglit doesn't respect half floats in this case or in general at all.
<cwabbott> it's that exposing classic OpenGL on mali-400 at all is a hack
<cwabbott> since desktop GL requires that you use normal 32-bit floats
<rellla> so then should i lower the abs modifier at all? is this an accuracy issue in the (now) succeeding test as well or is it an issue, that some ops can't deal with abs() sources?
<cwabbott> piglit is certainly within its rights to check that something exposing desktop GL calculates its results accurately enough
<cwabbott> and we just can't guarantee that
<cwabbott> no, it's not that it can't deal with abs sources
<cwabbott> replacing abs with sqrt just makes it calculate the difference incorrectly
<cwabbott> well, *more* incorrectly
<cwabbott> which in this case happens to mean that the calculated difference is smaller, and it passes
<cwabbott> I would check that increasing the tolerance makes it pass, and if that's the case, there's not much we can do
yuq825 has quit [Quit: Leaving.]
jrmuizel has joined #lima
<rellla> enunes, anarsoul: as it's not labeled ~lima, you may have missed that
yuq825 has joined #lima
dddddd has joined #lima
yuq825 has quit [Quit: Leaving.]
<anarsoul> rellla: yeah, sorry
<anarsoul> rellla: you need to tick "Allow commits from members who can merge to the target branch" so I can rebase and merge it
<rellla> oh sorry, done
<cwabbott> anarsoul: btw, the way indirect uniform loads work doesn't have anything to do with registers or register latencies
<cwabbott> there are four address registers, which you can write to directly using some special complex ops
<cwabbott> there is some latency between when the address register gets written and when you can use it, iirc only for stores
<cwabbott> I mean, there's a latency between when you write the address register and when you can use it
<anarsoul> cwabbott: honestly I don't remember what the conversation was about :)
<anarsoul> these dmesg-fails look suspicious
<rellla> this is with enunes' piglit patch. increasing the tolerance makes 40 tests pass... though i'll have to prepare a reference test set still
<anarsoul> yeah, something's wrong with textures
<anarsoul> fragment shader is trivial
<anarsoul> if it fails then either texture descriptor is wrong or allocated buffer for texture is too small
<rellla> the dmesg-fails also occur with master 88b8922
<anarsoul> :(
<rellla> (left one)
<enunes> anarsoul: I did notice something weird with the texture descriptor while working on but not sure if it's a bug yet
<anarsoul> wanna look into it?
<anarsoul> enunes: dump it?
<enunes> anarsoul: yeah I did of course
<enunes> basically we skip the first descriptor for some reason, and it stays with uninitialized stuff
<anarsoul> :\
<enunes> but my patch to actually use it, caused pp mmu fault
<enunes> so I'm still looking into it
<enunes> today I will have more time to get back into that
<anarsoul> I'm using this small tool to decode descriptors dumped from mali blob:
<anarsoul> enunes: btw I've prepared a pinebook with weston, glmark2 and q3a
<anarsoul> for demo purposes at XDC :)
<enunes> ah, sounds great
<enunes> any way to set multiple monitors with it to try to actually present from it?
<anarsoul> nah, dual screen won't work. It's either HDMI or LCD. Also it's mini-hdmi
<anarsoul> cwabbott: do I understand correctly that it means if we want to use indirect uniform loads we have to load the address register 2 instructions prior to using it?
