#lima on 2020-03-10 — irc logs at freenode.irclog.whitequark.org

2019-07-03 10:24 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!

00:01 <anarsoul|2> haha

00:01 <anarsoul|2> enunes: we're hitting exactly the same issue in gpir that you've seen in ppir regalloc

00:02 <anarsoul|2> https://gist.github.com/anarsoul/a1618731046dbb33775d2e1cb9c60dc1

00:02 <anarsoul|2> so r6 is technically a dead write (not used anywhere else)

00:03 <anarsoul|2> so it doesn't conflict with anything

00:04 <enunes> anarsoul|2: hah nice, is it also because of breaking the 'vec4' operator into scalar writes?

00:04 <anarsoul|2> nope

00:04 <anarsoul|2> it's because we're avoiding movs in gpir when translating from nir

00:05 <enunes> that was the main source of them for me that weren't eliminated by nir dce

00:05 <anarsoul|2> see gpir_emit_alu()

00:05 <anarsoul|2> it's not dead in nir :)

00:06 <anarsoul|2> I guess I know how to fix it, we need a pass that eliminates dead writes in addition to the dce pass

00:10 <enunes> in gpir or nir?

00:19 <anarsoul|2> gpir

00:28 <anarsoul|2> yep, fixed now.

00:29 <anarsoul|2> will submit MR later today
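
A minimal sketch of the idea discussed above, not the actual gpir pass: a backward walk over one basic block that drops register writes nobody reads afterwards. The `Instr` type, the op names and the register names are hypothetical and exist only for illustration; the real fix operates on gpir nodes and may handle cross-block liveness differently.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    dest: Optional[str]      # register written, or None for side-effect-only ops
    op: str                  # hypothetical opcode name
    srcs: Tuple[str, ...]    # registers read


def eliminate_dead_writes(block: List[Instr], live_out: Set[str]) -> List[Instr]:
    """Remove writes to registers that are never read later in the block
    and are not live out of it. Instructions without a dest are always kept."""
    live = set(live_out)
    kept = []
    for instr in reversed(block):
        if instr.dest is not None and instr.dest not in live:
            continue                      # dead write: nothing reads it afterwards
        if instr.dest is not None:
            live.discard(instr.dest)      # the write kills the earlier value
        live.update(instr.srcs)           # the sources become live above this point
        kept.append(instr)
    kept.reverse()
    return kept


block = [
    Instr("r0", "load_attr", ()),
    Instr("r6", "mul", ("r0", "r0")),     # r6 is written but never read
    Instr("r1", "add", ("r0", "r0")),
    Instr(None, "store_varying", ("r1",)),
]

print([i.op for i in eliminate_dead_writes(block, live_out=set())])
# → ['load_attr', 'add', 'store_varying']
```

If `r6` is passed in `live_out`, the `mul` is kept, which is why a pass like this needs liveness information from the successors rather than only a scan of the current block.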

00:31 yuq825 has joined #lima

01:59 _whitelogger has joined #lima

02:32 yuq825 has quit [Ping timeout: 240 seconds]

03:01 yuq825 has joined #lima

03:15 buzzmarshall has quit [Remote host closed the connection]

03:17 yuq825 has quit [Ping timeout: 258 seconds]

03:18 <Marex> anarsoul|2: hey so uh, on zynqmp, I got lima going with latest mesa and linux 5.4, but if I run GALLIUM_HUD="cpu,fps" kmscube, I see some really weird image corruption

03:18 <Marex> anarsoul|2: without GALLIUM_HUD, it works fine

03:18 <Marex> something that's known ?

03:19 <anarsoul|2> Marex: any pictures?

03:19 <Marex> anarsoul|2: is there a way to dump drm framebuffer ?

03:20 <anarsoul|2> don't tell me your phone can't make pictures :)

03:20 <Marex> gotta remove privacytape first, hold on

03:22 <Marex> anarsoul|2: but really, isn't there anything which can dump current frame from DRM device ?

03:23 <anarsoul|2> Marex: sorry, no idea. I have X11 and wayland (weston, sway) working so I don't really need to dump DRM buffer

03:48 <anarsoul|2> Marex: yeah, I see the same issue here

03:51 <Marex> anarsoul|2: ok good

03:54 <anarsoul|2> btw gallium hud works well in wayland

03:56 <anarsoul|2> i.e. for glmark2-es2-wayland

03:56 <anarsoul|2> but not for -drm

04:01 yuq825 has joined #lima

04:35 <marex-cloud> anarsoul|2: different pixel format maybe?

04:40 <anarsoul|2> both should be 32-bit

04:55 <marex-cloud> Maybe something reports some 16bit format somewhere

04:56 <marex-cloud> WL sure can be convinced to use RGB565 instead of RGBA8888

04:57 megi has quit [Quit: WeeChat 2.7.1]

04:58 megi has joined #lima

04:59 yuq825 has quit [Remote host closed the connection]

05:05 Barada has joined #lima

05:47 chewitt has quit [Quit: Zzz..]

06:54 <anarsoul|2> marex-cloud: maybe

07:20 kaspter has quit [Quit: kaspter]

07:21 kaspter has joined #lima

07:26 <rellla> anarsoul: nice respin.

07:27 <anarsoul|2> thanks

07:27 <rellla> seems the remaining issues are mostly tests, where ints are involved in the shader ...

07:27 <anarsoul|2> I'll look into some short failure

07:29 <anarsoul|2> also there're still shaders where scheduler fails

07:29 <anarsoul|2> those that were causing OOM

07:57 <anarsoul|2> heh, our disassembler can't handle this shader compiled by blob: https://gist.github.com/anarsoul/bc07c4263d4ea00738a88bec2f3be597

07:57 <anarsoul|2> I get https://gist.github.com/anarsoul/eec9955f7130f54d17f3e0eaaf063722

07:57 <anarsoul|2> note mul.m0 ^20 ^7 ^13

07:58 <anarsoul|2> but we don't have ^7 and ^13

08:01 <anarsoul|2> ^7 is acc0 output of instr 001

08:02 <anarsoul|2> ^13 is acc1 output of instr 002

08:07 <anarsoul|2> ah, got it.

08:12 yann has quit [Ping timeout: 255 seconds]

08:12 <anarsoul|2> fixed

08:12 <anarsoul|2> (disassembler bug)

08:23 <anarsoul|2> also let's add less crazy ftrunc lowering...

08:23 dddddd has quit [Ping timeout: 258 seconds]

08:23 <rellla> http://imkreisrum.de/deqp/lima_lima-improve-gp-disasm..8d0ec5b.master..9ff0ea3.head/

08:24 <rellla> is with the open MRs - i don't have ooms anymore.

08:24 <rellla> the ooms turned into fails, but i mentioned that already ;)

08:25 <rellla> iirc this is because of https://gitlab.freedesktop.org/mesa/mesa/-/commit/4d5a0ae22cf9ad893ddb10fca48e85e5dbf9c80c

08:25 <anarsoul|2> rellla: yes, but it's just a workaround

08:26 <anarsoul|2> scheduler still fails, we just have a way to terminate it

08:26 <rellla> yeah, but at least the skip list has shrunk

08:27 <anarsoul|2> well

08:27 <anarsoul|2> better ftrunc() lowering fixed some of the tests :)

08:29 <anarsoul|2> if not all

08:29 <anarsoul|2> let's throw it into CI

08:35 <anarsoul|2> rellla: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4126

08:43 <anarsoul|2> well, it gets a good number of unexpected passes

08:44 <anarsoul|2> as well as a good number of crashes

08:44 <anarsoul|2> hmm

08:44 <anarsoul|2> let's see :)

08:45 <anarsoul|2> UnexpectedPass: 97 Crash: 99

08:51 <anarsoul|2> it crashes in schedule_insert_ready_list()

08:51 <anarsoul|2> on list_addtail(&insert_node->list, insert_pos);

08:53 <anarsoul|2> and fixed.

08:57 <anarsoul|2> (also fixes 3 crashes from skip list)

08:58 <rellla> uahh

09:06 <anarsoul|2> :) [16395/16395] Pass: 15794 Fail: 0 Skip: 417 ExpectedFail: 87 UnexpectedPass: 97 Crash: 0 Timeout: 0 Missing: 0 Flake: 0 Duration: 3:42 Remaining: 0

09:06 <anarsoul|2> 97 more passes

09:07 yann has joined #lima

09:11 <rellla> so hooray, we are better than the blob now :)

09:13 <anarsoul|2> are we?

09:14 <rellla> wait, no - i think we have more skips/not supported ones...

09:14 <rellla> http://imkreisrum.de/deqp/deqp-all-tests_mali400-r7p0_on-allwinner-a20/results.xml

09:14 <anarsoul|2> :)

09:14 <rellla> i will do a run with the same gles2-master.txt later with the blob...

09:14 <anarsoul|2> I'll look into ex-oom failures

09:14 <anarsoul|2> in gp compiler

09:17 <anarsoul|2> I guess next major feature that's missing is depth/stencil reload

09:19 <rellla> and probably we should look into viewport being saved and reloaded to memory within plbu

09:19 <anarsoul|2> we don't really need it with depth/stencil reload

09:20 <anarsoul|2> and it's not viewport, it's plbu state

09:20 <anarsoul|2> it makes code a lot more complex without obvious benefits

09:21 <anarsoul|2> they're likely using it in cases when they don't write depth/stencil buffer

09:21 <anarsoul|2> but we always do unless it's invalidated

09:21 <anarsoul|2> [16398/16398] Pass: 15894 Fail: 0 Skip: 417 ExpectedFail: 87 UnexpectedPass: 0 Crash: 0 Timeout: 0 Missing: 0 Flake: 0 Duration: 3:34 Remaining: 0

09:22 <anarsoul|2> https://gitlab.freedesktop.org/anarsoul/mesa/-/jobs/1884437

09:23 <anarsoul|2> anyway, it's pretty late

09:24 <anarsoul|2> good night

09:24 <rellla> :) see the blob results: Pass: 15893, so we have one more :)

09:24 <rellla> good night
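
The two result lines quoted above (09:06 and 09:21) can be checked mechanically. A small sketch, assuming the summary format is exactly as pasted in this log; `parse_summary` and `STATUS_KEYS` are names made up for this example and are not part of any test runner.

```python
import re

# Status counters that add up to the total; Duration and Remaining are not counts.
STATUS_KEYS = ("Pass", "Fail", "Skip", "ExpectedFail", "UnexpectedPass",
               "Crash", "Timeout", "Missing", "Flake")


def parse_summary(line):
    """Return (total, counts) for one '[done/total] Key: N ...' summary line."""
    total = int(re.search(r"\[(\d+)/(\d+)\]", line).group(2))
    counts = {key: int(value)
              for key, value in re.findall(r"(\w+): (\d+)", line)
              if key in STATUS_KEYS}
    return total, counts


before = ("[16395/16395] Pass: 15794 Fail: 0 Skip: 417 ExpectedFail: 87 "
          "UnexpectedPass: 97 Crash: 0 Timeout: 0 Missing: 0 Flake: 0 "
          "Duration: 3:42 Remaining: 0")
after = ("[16398/16398] Pass: 15894 Fail: 0 Skip: 417 ExpectedFail: 87 "
         "UnexpectedPass: 0 Crash: 0 Timeout: 0 Missing: 0 Flake: 0 "
         "Duration: 3:34 Remaining: 0")

total_before, counts_before = parse_summary(before)
total_after, counts_after = parse_summary(after)

print(sum(counts_before.values()) == total_before)   # → True
print(sum(counts_after.values()) == total_after)     # → True
print(counts_after["Pass"] - counts_before["Pass"])  # → 100
```

The difference of 100 passes is consistent with the 97 unexpected passes being moved to expected passes plus the 3 tests mentioned at 08:57 as coming off the skip list, although the log itself does not spell that breakdown out.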

09:27 <cwabbott> anarsoul|2: ugh, I see then... I would've thought that the nir out-of-ssa pass would never do something dumb like that

09:29 <cwabbott> it should only be using registers for things that are involved in phi-webs, which means that the definitions should be used by a phi (hence live out of the block)

09:30 <cwabbott> there could be a bug in there that's affecting quality or something

10:01 ecloud is now known as ecloud_wfh

11:29 _whitelogger has joined #lima

12:08 afaerber has quit [Quit: Leaving]

13:13 yann has quit [Read error: Connection reset by peer]

13:29 yann has joined #lima

13:56 afaerber has joined #lima

14:08 Barada has quit [Quit: Barada]

14:12 dddddd has joined #lima

15:10 afaerber has quit [Quit: Leaving]

17:03 <anarsoul|2> cwabbott: guess it tries to do something smart with loops?

17:03 <cwabbott> no, it shouldn't

17:03 <anarsoul|2> I can easily imagine a phi having 3 sources

17:04 <cwabbott> phis in nir only have 3 sources with break or continue

17:05 <cwabbott> for an SSA dest to get converted to a register, it should have a use in a phi

17:06 <cwabbott> which means that it should be live-out of the basic block, because the phi-use is after the basic block

17:06 <cwabbott> and out-of-ssa doesn't change the liveness properties, or shouldn't

17:06 <cwabbott> so after out-of-ssa it should still be live out of the block

17:07 yann has quit [Ping timeout: 272 seconds]

17:07 <cwabbott> if it isn't live out, then out-of-ssa could've trivially left it alone

17:08 <cwabbott> I would dump the shader right before and after out-of-ssa to see if it's going wrong

17:10 <anarsoul|2> I can do that later if you want to take a peek, but I know too little about the nir out-of-ssa pass to fix it

17:12 <cwabbott> is there anything between out-of-ssa and gpir codegen?

17:12 <anarsoul|2> NIR_PASS_V(s, nir_remove_dead_variables, nir_var_function_temp);

17:13 <anarsoul|2> I guess we still need to keep the fix in gpir

17:14 <cwabbott> for now, at least

17:16 <cwabbott> a bug with this result wouldn't usually impact correctness, so I can see it getting missed before

17:17 <anarsoul|2> and it usually gets optimized with copy propagation in backend

17:18 <anarsoul|2> so yeah, you usually look whether your disassembly looks good rather than nir :)

17:40 <anarsoul|2> cwabbott: https://gist.github.com/anarsoul/1138c78c9e4e5ae6c35d2503b253e5ad
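
cwabbott's expectation above (only values that take part in a phi-web should end up in registers) can be phrased as a simple check. This is a toy sketch under stated assumptions, not NIR code: phis are given as `(dest, {predecessor_block: source_value})` pairs, and all value and block names are hypothetical.

```python
def phi_web_members(phis):
    """Collect every value that is a phi destination or a phi source."""
    members = set()
    for dest, sources in phis:
        members.add(dest)
        members.update(sources.values())
    return members


def unexpected_registers(converted, phis):
    """Values that were turned into registers although no phi touches them."""
    return sorted(set(converted) - phi_web_members(phis))


# Hypothetical shader: one phi merging ssa_4 (from block_1) and ssa_7 (from block_2).
phis = [("ssa_9", {"block_1": "ssa_4", "block_2": "ssa_7"})]
converted = ["ssa_4", "ssa_7", "ssa_9", "ssa_6"]

print(unexpected_registers(converted, phis))
# → ['ssa_6']
```

Running a check like this on the dumps taken right before and after out-of-ssa would show whether the dead register write already exists at that point or is introduced later.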

18:02 leidisaset has joined #lima

18:32 <leidisaset> work made a human from a monkey, frankly it is not something plaes have done during his mistaken career.

18:34 <leidisaset> you have so hilarious crooked outsiders in the business of authoring drivers that this is outstanding. Easy concepts are entirely not comprehended by such.

18:36 <leidisaset> plaes: in your mission to outerspace communication satellite formation or assembly, you used microchip microcontroller with Atmel ISA, which can expose virtual register files and have a jtag acceleration path.

18:37 <leidisaset> it naturally even the 8bit version has a carry out add instruction for multicycle

18:37 <leidisaset> addition with bigger datafields.

18:38 <leidisaset> if you were to play with this microcontroller or CPU accurately even with this one you can get vast performance out.

18:39 <leidisaset> microchip tech. is very big company it seems, and their hw should can be made to function very well too.

18:52 <leidisaset> I am kinda fed up of the situation that people complain about their computer issues to me only cause you do not know how to program.

18:54 <leidisaset> the checksum filesystem or virtual register files stored in 32bit variable as 1024 aligned values via the summee/summond/addend/addee works so that when you shift enough to the left

18:55 <leidisaset> you can calculate a new distance and subtract the shifting compensated value from the remainder.

18:56 <leidisaset> the bits that were not subtracted with using such masks, can be shifted in some amount of high bits or low bits.

19:06 <leidisaset> you are just filthy crooked blockers in my opinion having no clue whatsoever and PERIOD! It is terribly disgusting to see such outsiders in important positions.

19:08 leidisaset has quit [Quit: KVIrc 5.0.0 Aria http://www.kvirc.net/]

19:47 yann has joined #lima

19:55 afaerber has joined #lima

20:20 buzzmarshall has joined #lima

21:55 <anarsoul|2> rellla: you may want to cherry-pick last commit https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4110 into syscall tracker

21:56 <anarsoul|2> otherwise we won't see unary ops from acc in disassembly

22:49 <rellla> anarsoul|2: already done: https://gitlab.freedesktop.org/lima/mali-syscall-tracker/-/merge_requests/11/commits

23:06 <anarsoul|2> btw I'm not sure how to fix this infinite loop in gpir scheduler :(

23:19 <anarsoul|2> cwabbott: any suggestions on how to break a loop here? https://gist.github.com/anarsoul/613f107c81ce51d3a0bcf87691cc59ee

23:21 <anarsoul|2> looks like it can't schedule two movs (297 and 293) for ld_reg (98 and 94)

23:24 <anarsoul|2> and it keeps trying until it hits instruction limit

23:37 <anarsoul|2> I suspect the problem is that we have too many ld_reg

23:41 <anarsoul|2> shader itself is pretty simple, but we get complex nir because we don't support indexed stores

23:41 <anarsoul|2> https://gist.github.com/anarsoul/554eb7c795232465bdd1a6308f6cb6f0

23:42 embed-3d has quit [Remote host closed the connection]

23:43 embed-3d has joined #lima

23:55 <Viciouss> I updated my kernel to 5.5 now to get rid of the scheduler crash, after that I found the debug switch in the drm code and enabled it, this output seems to be the cause for the hwc error message: [drm:drm_atomic_check_only] [PLANE:31:plane-0] invalid pixel format AB24 little-endian (0x34324241), modifier 0x0

23:56 <Viciouss> if I understand it correctly it means that what is written to the buffer is not what the drm driver is expecting, correct?

23:56 <anarsoul|2> AB24 is DRM_FORMAT_ABGR8888
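
The mapping between the "AB24" name and the 0x34324241 value in the kernel message follows from how DRM fourcc codes are packed: four ASCII characters, first character in the lowest byte. A short sketch that mirrors the `fourcc_code()` macro from the kernel's drm_fourcc.h; `fourcc_name` is a helper made up for this example.

```python
def fourcc_code(a, b, c, d):
    """Pack four ASCII characters into a 32-bit code, first character lowest."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


def fourcc_name(code):
    """Hypothetical helper: turn a 32-bit fourcc value back into its 4-char name."""
    return (code & 0xFFFFFFFF).to_bytes(4, "little").decode("ascii")


DRM_FORMAT_ABGR8888 = fourcc_code("A", "B", "2", "4")

print(hex(DRM_FORMAT_ABGR8888))   # → 0x34324241
print(fourcc_name(0x34324241))    # → AB24
```

So the message says the plane was handed a buffer tagged as ABGR8888 with a linear modifier (0x0), and the plane does not list that format as supported.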