ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at and - Contact ARM for binary driver support!
<alyssa_> anarsoul: random drive by comment but `texture2D(recttexture, v_coord)` can be lowered to `texture2D(recttexture, rect_coord)` where `rect_coord = v_coord * scale_factor` as computed in the VS, hence varying high-precision path?
yuq825 has joined #lima
<anarsoul> alyssa_: that's possible but difficult and can be only done when fragment and vertex shader are linked
<anarsoul> alyssa_: it's way easier to add support for it into cogl
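A minimal C model of the per-vertex multiply alyssa_ describes, `rect_coord = v_coord * scale_factor`. The type and function names are hypothetical, and the log does not say which way the scale goes (texels to normalized or the reverse), so the factor is simply passed in by the caller:

```c
/* Hypothetical two-component vector, standing in for a GLSL vec2. */
struct sketch_vec2 {
    float x;
    float y;
};

/* Component-wise rect_coord = v_coord * scale_factor, as would be
 * computed once per vertex and then interpolated as a varying. */
static struct sketch_vec2 sketch_scale_rect_coord(struct sketch_vec2 v_coord,
                                                  struct sketch_vec2 scale_factor)
{
    struct sketch_vec2 rect_coord;

    rect_coord.x = v_coord.x * scale_factor.x;
    rect_coord.y = v_coord.y * scale_factor.y;
    return rect_coord;
}
```

Doing the multiply in the vertex stage means the fragment stage can feed the varying straight to the sampler, which is the point of the "varying high-precision path" remark.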
<MoeIcenowy> anarsoul: so pp does fp32 on varying, but fp16 on numbers native to pp ?
<anarsoul> MoeIcenowy: not exactly
<anarsoul> MoeIcenowy: if we load varying directly (not from register) into discard register (reg 15) to use it in sampler then it's high precision
<anarsoul> if we load varying into register and then use register (through another load varying op) in sampler then it's fp16
<anarsoul> basically that's the only exception for pp
<anarsoul> for everything else it uses fp16
* rellla triggered piglit with cubemap branch now
<plaes> o_O
<Tofe> ah, great, thanks!
rellla has joined #lima
enunes has quit [Ping timeout: 240 seconds]
enunes has joined #lima
rellla has quit [Remote host closed the connection]
jrmuizel has quit [Remote host closed the connection]
rellla has joined #lima
<rellla> so here is the piglit result for the cubemaps branch:
hoijui has quit [Ping timeout: 250 seconds]
jrmuizel has joined #lima
<MoeIcenowy> anarsoul: surfaceless deqp still doesn't run here
<MoeIcenowy> anarsoul: it just acquires EGLDisplay by eglGetDisplay(NULL) ...
<MoeIcenowy> looks ridiculous
<anarsoul> :(
armessia has joined #lima
<rellla> seems there are some bits missing still :)
<rellla> anarsoul: btw ppir disasm is wrong/incomplete...
<anarsoul> why?
<anarsoul> is there anything it fails to disassemble?
<armessia> rellla: hi, tnx a lot for running my branch through piglit!
<armessia> Some bits seem to be missing still indeed :) Will look into it
<rellla> anarsoul: the different perspective cases are missing, when source_type is 2
<anarsoul> oh, OK
<rellla> so everything is reported as gl_FragCoord
<anarsoul> fix it? :)
<rellla> i fixed it already and can prepare a MR later.
<anarsoul> cool, thanks
<MoeIcenowy> anarsoul: BTW I think gl_FragDepth cannot be faked, right?
<anarsoul> nope
<anarsoul> hardware doesn't seem to support it
<MoeIcenowy> then OpenRA won't show Tiberium
<anarsoul> :(
<MoeIcenowy> and I don't know how to fix it
<anarsoul> ask OpenRA guys
<MoeIcenowy> but BTW we should get X to work first
<anarsoul> it does work
<anarsoul> but it's not optimal yet
<MoeIcenowy> btw I'm still thinking how to debug dead gp error 0x400000
<MoeIcenowy> apitrace doesn't work on non-X11, at least not retraceable
<anarsoul> it works on wayland
<MoeIcenowy> but I think no gbm?
<anarsoul> no idea
<anarsoul> anyway, we should extend kernel driver to capture the job that caused error and expose it through debugfs
<anarsoul> something similar to i915_error_state
<MoeIcenowy> interesting
<MoeIcenowy> it has support for waffle
<Tofe> anarsoul: I might be wrong, but it doesn't look like an assert was raised here:
<anarsoul> weird
<anarsoul> you may want to retry with updated branch that Connor pushed today
<anarsoul> there's some fixes
<Tofe> will do, thanks
<Tofe> nope, same thing
<Tofe> I'll try to debug that a bit further tomorrow; I'll probably be able to catch the moment when the "const" op node is created...
<armessia> rellla: tnx for the hint on the assertion failure
<armessia> Now glsl-fs-texturecube and glsl-fs-texturecube-2 pass :-)
<anarsoul> armessia: btw you also need to handle tiled case for cubemaps
<anarsoul> and mipmapping
<armessia> anarsoul: I already set should_tile to true at some point and it still seemed to work, will double check though, also with piglit
<armessia> Didn't test mipmapping yet
<anarsoul> armessia: do you align cube faces at tile boundary?
<anarsoul> (you probably need it)
<armessia> anarsoul: would have to check if the cube faces are properly aligned, I didn't have to change anything so far in lima_resource, depth handling was already present
<anarsoul> OK
<armessia> Force pushed my branch again. Did some clean-up of debugging messages and got rid of the extra ppir_op
<armessia> What piglit test set do you guys run? If I run the gpu set I have more than 26000 tests, the test set in the reports of rellla seems much smaller..
<armessia> Most of them are skipped and some get stuck
<armessia> anarsoul: tnx!
<enunes> anarsoul: so are you working on debugging the gp error with some xorg applications?
<anarsoul> enunes: nope
<enunes> I tried some of the mesa-demos and I can easily reproduce a gp error with applications that send a too large plbu buffer
<enunes> not sure if it's the same one, "state=" is different
<enunes> it's also different than the vertex limitation as this is due to too many separate gl calls, e.g. several glDrawArrays with small buffers
<anarsoul> i.e. we have to flush job when it reaches certain size?
<anarsoul> I'm not sure if it's the same though
<enunes> yeah I did that and it fixes it
<anarsoul> could be
<enunes> I'll check how the blob does it
<enunes> if it's by inserting flushes or separating buffers
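A rough sketch of the "flush the job when it reaches a certain size" idea, in C. Everything here is hypothetical: the names, the 1024-word limit, and the flush itself, which only resets the buffer and counts. It does not reflect how lima or the blob actually handles this:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit on command words per job; the real limit is unknown. */
#define SKETCH_PLBU_MAX_WORDS 1024

struct sketch_plbu_stream {
    uint32_t words[SKETCH_PLBU_MAX_WORDS];
    size_t used;       /* words currently queued */
    unsigned flushes;  /* number of jobs submitted so far */
};

/* Stand-in for submitting the queued commands as one job. */
static void sketch_plbu_flush(struct sketch_plbu_stream *s)
{
    if (s->used == 0)
        return;
    s->used = 0;
    s->flushes++;
}

/* Queue one draw's commands, flushing first if they would not fit.
 * Returns 0 on success, -1 if a single draw exceeds the limit. */
static int sketch_plbu_emit(struct sketch_plbu_stream *s,
                            const uint32_t *cmds, size_t count)
{
    size_t i;

    if (count > SKETCH_PLBU_MAX_WORDS)
        return -1;
    if (s->used + count > SKETCH_PLBU_MAX_WORDS)
        sketch_plbu_flush(s);
    for (i = 0; i < count; i++)
        s->words[s->used++] = cmds[i];
    return 0;
}
```

With many small draws, as in the several-glDrawArrays case mentioned above, the stream is split into multiple jobs instead of growing without bound.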
<anarsoul> btw, while you're here, cwabbott proposed to talk about gpir compiler (you probably saw the email). How we're going to split the slot?
<anarsoul> 15 mins each?
<enunes> I think 45 includes time for q&a
<enunes> so it's less than that
<enunes> but sure it will be great if cwabbott can speak about gpir
<anarsoul> not sure if we'll get a lot of questions
<enunes> I'll share the slides and reply there
<anarsoul> might as well use the time for talk
<anarsoul> should we do small demo? :)
<anarsoul> after all q3a seems to work with gl1 renderer
<enunes> demo would probably be interesting, I can bring my pinebook, unless you plan to bring yours, which is geographically closer :)
<anarsoul> I can bring my pinebook
<anarsoul> too bad dual screen doesn't work well...
<enunes> I tried it once and it worked, but I guess that was with software renderer
<anarsoul> enunes: and technically it's not closer. I'm 4k km away from Montreal, probably the same for you
<anarsoul> just checked, it is but not by much - 4910km vs 6381km :)
<enunes> although if I understand correctly from the reports, meson has a slightly less buggy experience than allwinner now?
<anarsoul> I'm not sure
<anarsoul> meson usually has mali450
<enunes> the cursor plane stuff
<enunes> I haven't tried myself
<anarsoul> they don't have it
<anarsoul> IIRC megi has a patch to mark one of UI planes as cursor
<anarsoul> should work fine on pinebook
<anarsoul> not on pine64 though since HDMI has only 1 UI plane
<anarsoul> enunes: can you check whether your fix fixes sway?
<enunes> I wouldn't call it fix for now, but I can give it a try
<enunes> apparently doesn't fix it, I can open sway and move mouse around but then it freezes with gp error
<anarsoul> so that's something different :(
<anarsoul> I wish we had documentation on hardware
<anarsoul> it would make it so much easier
<enunes> the gp error is also different, the one I see with the plbu was just "4"
<enunes> which makes sense as it's plbu too large or something
<anarsoul> is it 0x400000?
<enunes> with sway yes
<anarsoul> btw, if you get LIMA_GP_IRQ_PLBU_OUT_OF_MEM the fix would be to increase tile heap size
<anarsoul> 0x400000 is LIMA_GP_IRQ_PTR_ARRAY_OUT_OF_BOUNDS and I have no idea what it means
<anarsoul> I wonder what array it refers to
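For reference, a small C sketch that tells the two GP error values in this discussion apart. The 0x400000 value for LIMA_GP_IRQ_PTR_ARRAY_OUT_OF_BOUNDS is stated above. The 0x4 value for LIMA_GP_IRQ_PLBU_OUT_OF_MEM is an assumption inferred from the "4" remark, and the helper name is made up for illustration:

```c
#include <stdint.h>

/* Value stated in the discussion above. */
#define LIMA_GP_IRQ_PTR_ARRAY_OUT_OF_BOUNDS 0x400000u
/* Assumption: inferred from the "4" error seen with the large plbu buffer. */
#define LIMA_GP_IRQ_PLBU_OUT_OF_MEM 0x4u

/* Hypothetical helper: name the error bit set in a GP IRQ status word. */
static const char *sketch_gp_irq_name(uint32_t status)
{
    if (status & LIMA_GP_IRQ_PTR_ARRAY_OUT_OF_BOUNDS)
        return "PTR_ARRAY_OUT_OF_BOUNDS";
    if (status & LIMA_GP_IRQ_PLBU_OUT_OF_MEM)
        return "PLBU_OUT_OF_MEM";
    return "unknown";
}
```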
<enunes> I think for plbu we allocate a dynamic buffer?
<enunes> in user space
<anarsoul> enunes: I'm pretty sure that PLBU_OUT_OF_MEM is used to signal that it ran out of heap. ARM kernel driver uses it to grow heap and resume the job
<anarsoul> and in lima we allocate 1mb for heap
<enunes> unless I misunderstood we use the dynarray stuff to assemble it and then move it to u_upload_alloc which is resizable
<anarsoul> oh, you're referring to PTR_ARRAY_OUT_OF_BOUNDS
<anarsoul> then see ARM kernel driver
<anarsoul> also you can easily verify if it ran out of heap, just increase heap size in lima
<anarsoul> change gp_tile_heap_size to 2mb
<enunes> I tried it now, even at 8mb the example triggers the gp error
<enunes> maybe it's a good stress, let me try more
<anarsoul> and it's 0x4?
<anarsoul> that's weird
<enunes> yeah at 64mb I don't trigger it anymore
<anarsoul> that's a lot
<anarsoul> guess we can put this test aside for now?
<anarsoul> since we have no way to predict plbu heap size
<enunes> sure, I was just running those demos to see if something interesting shows up
<anarsoul> unless you're interested in implementing heap bo in kernel driver :)
<anarsoul> IIRC panfrost folks already have it
<anarsoul> probably not merged yet
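The "grow heap and resume the job" behaviour attributed to the ARM kernel driver could look roughly like the C simulation below. This is only a sketch under stated assumptions: the job is modelled as a single "bytes of heap needed" number, the doubling policy and all names are hypothetical, and nothing here touches real hardware or the real driver:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simulated job: succeeds only if the heap is at least this large. */
struct sketch_job {
    size_t heap_needed;
};

/* Stand-in for running the job; false models an out-of-memory IRQ. */
static bool sketch_job_run(const struct sketch_job *job, size_t heap_size)
{
    return heap_size >= job->heap_needed;
}

/* Double the heap on each out-of-memory result until the job passes.
 * Returns the final heap size, or 0 if max_size would be exceeded.
 * *grows receives the number of times the heap was enlarged. */
static size_t sketch_run_with_growing_heap(const struct sketch_job *job,
                                           size_t initial_size,
                                           size_t max_size,
                                           unsigned *grows)
{
    size_t heap = initial_size;

    *grows = 0;
    if (heap == 0)
        return 0;
    while (!sketch_job_run(job, heap)) {
        if (heap > max_size / 2)
            return 0; /* doubling again would exceed the cap */
        heap *= 2;
        (*grows)++;
    }
    return heap;
}
```

Starting from the 1mb heap mentioned above, a job that needs somewhere between 32mb and 64mb would settle at 64mb after six doublings, which avoids having to predict the size up front.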
<enunes> I think the 0x400000 error is more interesting to track down first
<anarsoul> definitely
<anarsoul> it's a show stopper
<enunes> I just wish we had a simpler reproducer for it
<anarsoul> sway is pretty simple
<anarsoul> IIRC it doesn't use any fancy shaders
<enunes> indeed, I got an apitrace with a few frames from it and I can reproduce it
<anarsoul> I wonder if the fix will be another 1-liner :)
jrmuizel has joined #lima