ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!
<anarsoul|2> haha
<anarsoul|2> enunes: we're hitting exactly the same issue in gpir that you've seen in ppir regalloc
<anarsoul|2> so r6 is technically a dead write (not used anywhere else)
<anarsoul|2> so it doesn't conflict with anything
<enunes> anarsoul|2: hah nice, is it also because of breaking the 'vec4' operator into scalar writes?
<anarsoul|2> nope
<anarsoul|2> it's because we're avoiding movs in gpir when translating from nir
<enunes> that was the main source of them for me that weren't eliminated by nir dce
<anarsoul|2> see gpir_emit_alu()
<anarsoul|2> it's not dead in nir :)
<anarsoul|2> I guess I know how to fix it, we need a pass that eliminates dead writes in addition to the dce pass
<enunes> in gpir or nir?
<anarsoul|2> gpir
<anarsoul|2> yep, fixed now.
<anarsoul|2> will submit MR later today
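The dead-write elimination discussed above can be illustrated with a minimal sketch. It assumes a toy straight-line IR; the `Instr` class and `eliminate_dead_writes` function are hypothetical names for illustration and are not the gpir data structures or the pass that was actually submitted. A register write is treated as dead when nothing reads that register and the instruction has no side effect.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    # Hypothetical toy instruction, not a gpir node.
    op: str
    dest: Optional[str] = None      # register written, if any
    srcs: Tuple[str, ...] = ()      # registers read
    side_effect: bool = False       # e.g. a store, must always be kept


def eliminate_dead_writes(instrs):
    """Drop writes to registers that are never read, until nothing changes."""
    instrs = list(instrs)
    while True:
        read = {src for ins in instrs for src in ins.srcs}
        kept = [ins for ins in instrs
                if ins.side_effect or ins.dest is None or ins.dest in read]
        if len(kept) == len(instrs):
            return kept
        # Removing one dead write can make its sources dead too, so repeat.
        instrs = kept


program = [
    Instr("load", "r1"),
    Instr("mul", "r6", ("r1", "r1")),   # r6 is never read: a dead write
    Instr("add", "r2", ("r1", "r1")),
    Instr("store", None, ("r2",), side_effect=True),
]
print([ins.op for ins in eliminate_dead_writes(program)])  # ['load', 'add', 'store']
```

The check is flow-insensitive on purpose: it only catches registers that are not read anywhere, which is the r6 case described in the conversation.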
yuq825 has joined #lima
_whitelogger has joined #lima
yuq825 has quit [Ping timeout: 240 seconds]
yuq825 has joined #lima
buzzmarshall has quit [Remote host closed the connection]
yuq825 has quit [Ping timeout: 258 seconds]
<Marex> anarsoul|2: hey so uh, on zynqmp, I got lima going with latest mesa and linux 5.4, but if I run GALLIUM_HUD="cpu,fps" kmscube, I see some really weird image corruption
<Marex> anarsoul|2: without GALLIUM_HUD, it works fine
<Marex> something that's known ?
<anarsoul|2> Marex: any pictures?
<Marex> anarsoul|2: is there a way to dump drm framebuffer ?
<anarsoul|2> don't tell me your phone can't make pictures :)
<Marex> gotta remove privacytape first, hold on
<Marex> anarsoul|2: but really, isn't there anything which can dump current frame from DRM device ?
<anarsoul|2> Marex: sorry, no idea. I have X11 and wayland (weston, sway) working so I don't really need to dump DRM buffer
<anarsoul|2> Marex: yeah, I see the same issue here
<Marex> anarsoul|2: ok good
<anarsoul|2> btw gallium hud works well in wayland
<anarsoul|2> i.e. for glmark2-es2-wayland
<anarsoul|2> but not for -drm
yuq825 has joined #lima
<marex-cloud> anarsoul|2: different pixel format maybe?
<anarsoul|2> both should be 32-bit
<marex-cloud> Maybe something reports some 16bit format somewhere
<marex-cloud> WL sure can be convinced to use RGB565 instead of RGBA8888
megi has quit [Quit: WeeChat 2.7.1]
megi has joined #lima
yuq825 has quit [Remote host closed the connection]
Barada has joined #lima
chewitt has quit [Quit: Zzz..]
<anarsoul|2> marex-cloud: maybe
kaspter has quit [Quit: kaspter]
kaspter has joined #lima
<rellla> anarsoul: nice respin.
<anarsoul|2> thanks
<rellla> seems the remaining issues are mostly tests, where ints are involved in the shader ...
<anarsoul|2> I'll look into some short failure
<anarsoul|2> also there're still shaders where scheduler fails
<anarsoul|2> those that were causing OOM
<anarsoul|2> heh, our disassembler can't handle this shader compiled by blob: https://gist.github.com/anarsoul/bc07c4263d4ea00738a88bec2f3be597
<anarsoul|2> note mul.m0 ^20 ^7 ^13
<anarsoul|2> but we don't have ^7 and ^13
<anarsoul|2> ^7 is acc0 output of instr 001
<anarsoul|2> ^13 is acc1 output of instr 002
<anarsoul|2> ah, got it.
yann has quit [Ping timeout: 255 seconds]
<anarsoul|2> fixed
<anarsoul|2> (disassembler bug)
<anarsoul|2> also let's add less crazy ftrunc lowering...
dddddd has quit [Ping timeout: 258 seconds]
<rellla> with the open MRs i don't have ooms anymore.
<rellla> the ooms became fails, but i mentioned that already ;)
<anarsoul|2> rellla: yes, but it's just a workaround
<anarsoul|2> scheduler still fails, we just have a way to terminate it
<rellla> yeah, but at least the skip list has shrunk
<anarsoul|2> well
<anarsoul|2> better ftrunc() lowering fixed some of the tests :)
<anarsoul|2> if not all
<anarsoul|2> let's throw it into CI
<anarsoul|2> well, it gets a good number of unexpected passes
<anarsoul|2> as well as a good number of crashes
<anarsoul|2> hmm
<anarsoul|2> let's see :)
<anarsoul|2> UnexpectedPass: 97 Crash: 99
<anarsoul|2> it crashes in schedule_insert_ready_list()
<anarsoul|2> on list_addtail(&insert_node->list, insert_pos);
<anarsoul|2> and fixed.
<anarsoul|2> (also fixes 3 crashes from skip list)
<rellla> uahh
<anarsoul|2> :) [16395/16395] Pass: 15794 Fail: 0 Skip: 417 ExpectedFail: 87 UnexpectedPass: 97 Crash: 0 Timeout: 0 Missing: 0 Flake: 0 Duration: 3:42 Remaining: 0
<anarsoul|2> 97 more passes
yann has joined #lima
<rellla> so hooray, we are better than the blob now :)
<anarsoul|2> are we?
<rellla> wait, no - i think we have more skips/not supported ones...
<anarsoul|2> :)
<rellla> i will do a run with the same gles2-master.txt later with the blob...
<anarsoul|2> I'll look into ex-oom failures
<anarsoul|2> in gp compiler
<anarsoul|2> I guess next major feature that's missing is depth/stencil reload
<rellla> and probably we should look into viewport being saved and reloaded to memory within plbu
<anarsoul|2> we don't really need it with depth/stencil reload
<anarsoul|2> and it's not viewport, it's plbu state
<anarsoul|2> it makes code a lot more complex without obvious benefits
<anarsoul|2> they're likely using it in cases when they don't write depth/stencil buffer
<anarsoul|2> but we always do unless it's invalidated
<anarsoul|2> [16398/16398] Pass: 15894 Fail: 0 Skip: 417 ExpectedFail: 87 UnexpectedPass: 0 Crash: 0 Timeout: 0 Missing: 0 Flake: 0 Duration: 3:34 Remaining: 0
<anarsoul|2> anyway, it's pretty late
<anarsoul|2> good night
<rellla> :) see the blob results: Pass: 15893, so we have one more :)
<rellla> good night
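The arithmetic behind the two result lines and the blob comparison above can be checked with a short sketch. The summary strings are copied from the log; `parse_summary` and `CATEGORIES` are hypothetical helpers written for this illustration and are not part of the test runner. That the 3 extra passes correspond to the 3 crashes removed from the skip list is an assumption based on the earlier remark, not something the numbers prove.

```python
import re

BEFORE = ("[16395/16395] Pass: 15794 Fail: 0 Skip: 417 ExpectedFail: 87 "
          "UnexpectedPass: 97 Crash: 0 Timeout: 0 Missing: 0 Flake: 0 "
          "Duration: 3:42 Remaining: 0")
AFTER = ("[16398/16398] Pass: 15894 Fail: 0 Skip: 417 ExpectedFail: 87 "
         "UnexpectedPass: 0 Crash: 0 Timeout: 0 Missing: 0 Flake: 0 "
         "Duration: 3:34 Remaining: 0")
BLOB_PASS = 15893  # the blob result quoted in the log

CATEGORIES = ("Pass", "Fail", "Skip", "ExpectedFail", "UnexpectedPass",
              "Crash", "Timeout", "Missing", "Flake")


def parse_summary(line):
    """Return (total tests, {category: count}) for one summary line."""
    total = int(re.match(r"\[(\d+)/(\d+)\]", line).group(2))
    # The lookahead skips "Duration: 3:42", which is a time, not a count.
    found = dict(re.findall(r"(\w+): (\d+)(?![\d:])", line))
    return total, {name: int(found[name]) for name in CATEGORIES}


total_before, before = parse_summary(BEFORE)
total_after, after = parse_summary(AFTER)

# Every test lands in exactly one category in both runs.
assert sum(before.values()) == total_before
assert sum(after.values()) == total_after

gained = after["Pass"] - before["Pass"]
print(gained, gained - before["UnexpectedPass"], after["Pass"] - BLOB_PASS)
```

Under these numbers the second run gains 100 passes: the 97 former unexpected passes plus 3 tests that were not in the first run, and the final count is one pass ahead of the blob.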
<cwabbott> anarsoul|2: ugh, I see then... I would've thought that the nir out-of-ssa pass would never do something dumb like that
<cwabbott> it should only be using registers for things that are involved in phi-webs, which means that the definitions should be used by a phi (hence live out of the block)
<cwabbott> there could be a bug in there that's affecting quality or something
ecloud is now known as ecloud_wfh
_whitelogger has joined #lima
afaerber has quit [Quit: Leaving]
yann has quit [Read error: Connection reset by peer]
yann has joined #lima
afaerber has joined #lima
Barada has quit [Quit: Barada]
dddddd has joined #lima
afaerber has quit [Quit: Leaving]
<anarsoul|2> cwabbott: guess it tries to do something smart with loops?
<cwabbott> no, it shouldn't
<anarsoul|2> I can easily imagine phi having 3 sources
<cwabbott> phis in nir only have 3 sources with break or continue
<cwabbott> for an SSA dest to get converted to a register, it should have a use in a phi
<cwabbott> which means that it should be live-out of the basic block, because the phi-use is after the basic block
<cwabbott> and out-of-ssa doesn't change the liveness properties, or shouldn't
<cwabbott> so after out-of-ssa it should still be live out of the block
yann has quit [Ping timeout: 272 seconds]
<cwabbott> if it isn't live out, then out-of-ssa could've trivially left it alone
<cwabbott> I would dump the shader right before and after out-of-ssa to see if it's going wrong
<anarsoul|2> I can do that later if you want to take a peek, but I know too little about the nir out-of-ssa pass to fix it
<cwabbott> is there anything between out-of-ssa and gpir codegen?
<anarsoul|2> NIR_PASS_V(s, nir_remove_dead_variables, nir_var_function_temp);
<anarsoul|2> I guess we still need to keep the fix in gpir
<cwabbott> for now, at least
<cwabbott> a bug with this result wouldn't usually impact correctness, so I can see it getting missed before
<anarsoul|2> and it usually gets optimized with copy propagation in backend
<anarsoul|2> so yeah, you usually look whether your disassembly looks good rather than nir :)
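The liveness argument above (a value used by a phi is live out of the predecessor block it comes from) can be sketched on a toy control-flow graph. The dictionaries and the `phi_live_out` function are hypothetical and only model the reasoning; this is not NIR's liveness analysis or its out-of-ssa pass.

```python
def phi_live_out(block, succs, phis):
    """Values that must be live out of `block` because a successor phi reads them.

    succs: {block: [successor blocks]}
    phis:  {block: [ {predecessor: value} for each phi in that block ]}
    """
    live = set()
    for succ in succs.get(block, ()):
        for phi_srcs in phis.get(succ, ()):
            # A phi reads its source at the end of the matching predecessor.
            if block in phi_srcs:
                live.add(phi_srcs[block])
    return live


# if/else diamond: 'then' defines a, 'else' defines b, 'merge' has phi(a, b)
succs = {"entry": ["then", "else"], "then": ["merge"], "else": ["merge"]}
phis = {"merge": [{"then": "a", "else": "b"}]}

print(phi_live_out("then", succs, phis))   # {'a'}
print(phi_live_out("else", succs, phis))   # {'b'}
print(phi_live_out("entry", succs, phis))  # set()
```

If a value converted to a register turned out not to be in this set for its block, that would match the situation described as unexpected in the conversation.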
yann has joined #lima
afaerber has joined #lima
buzzmarshall has joined #lima
<anarsoul|2> rellla: you may want to cherry-pick the last commit from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4110 into the syscall tracker
<anarsoul|2> otherwise we won't see unary ops from acc in disassembly
<anarsoul|2> btw I'm not sure how to fix this infinite loop in gpir scheduler :(
<anarsoul|2> cwabbott: any suggestions on how to break a loop here? https://gist.github.com/anarsoul/613f107c81ce51d3a0bcf87691cc59ee
<anarsoul|2> looks like it can't schedule two movs (297 and 293) for ld_reg (98 and 94)
<anarsoul|2> and it keeps trying until it hits instruction limit
<anarsoul|2> I suspect the problem is that we have too many ld_reg
<anarsoul|2> shader itself is pretty simple, but we get complex nir because we don't support indexed stores
embed-3d has quit [Remote host closed the connection]
embed-3d has joined #lima
<Viciouss> I updated my kernel to 5.5 now to get rid of the scheduler crash, after that I found the debug switch in the drm code and enabled it, this output seems to be the cause for the hwc error message: [drm:drm_atomic_check_only] [PLANE:31:plane-0] invalid pixel format AB24 little-endian (0x34324241), modifier 0x0
<Viciouss> if I understand it correctly it means that what is written to the buffer is not what the drm driver is expecting, correct?
<anarsoul|2> AB24 is DRM_FORMAT_ABGR8888
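The mapping between the "AB24" name and the value 0x34324241 in the kernel message follows the DRM fourcc packing, where four ASCII characters are stored least significant byte first. A minimal sketch, with Python helpers that mirror the `fourcc_code()` macro from `drm_fourcc.h` (the Python function names themselves are only for illustration):

```python
def fourcc_code(a, b, c, d):
    """Pack four ASCII characters the way drm_fourcc.h's fourcc_code() does."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


def fourcc_name(code):
    """Turn a 32-bit fourcc value back into its four-character name."""
    return (code & 0xFFFFFFFF).to_bytes(4, "little").decode("ascii")


# DRM_FORMAT_ABGR8888 is defined as fourcc_code('A', 'B', '2', '4')
print(hex(fourcc_code("A", "B", "2", "4")))  # 0x34324241
print(fourcc_name(0x34324241))               # AB24
```

The error therefore says that the plane was asked to scan out an ABGR8888 buffer and does not list that format as supported with modifier 0x0.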