#lima on 2020-01-03 — irc logs at freenode.irclog.whitequark.org

2019-07-03 10:24 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!

00:10 <anarsoul> rellla: oh, I think I know what's the issue

00:12 <rellla> ?

00:12 <anarsoul> working on fix

00:12 <anarsoul> will be ready in a min

00:13 <anarsoul> https://gist.github.com/anarsoul/22ca92618c7ca835b01ef61bc41b5542

00:13 <anarsoul> I'll be quite surprised if it doesn't fix the issue :)

00:14 <anarsoul> we underallocate space for varyings, gl_pos and gl_pointsize for indexed draw

00:14 <anarsoul> so gl_pos overwrites varyings

00:14 <anarsoul> thus wrong color

00:15 <rellla> sounds very reasonable, but sadly i can only test it tomorrow :p

00:15 <anarsoul> :)

00:15 <anarsoul> I can give it a run via CI

00:16 <rellla> yeah, then we finally can jump over to clipping :)

00:17 <anarsoul> that should fix *a lot* of weird failures

00:30 <anarsoul> cwabbott_: any plans to resume work in https://gitlab.freedesktop.org/cwabbott0/mesa/tree/lima-gpir-branch-opt-v4 ?

00:36 <anarsoul> rellla: yeah, it passes now

00:54 <anarsoul> interestingly it didn't fix any other tests from my list :)

00:54 <anarsoul> https://gitlab.freedesktop.org/anarsoul/mesa/-/jobs/1261916

00:54 <anarsoul> I'll submit MR later tonight

01:32 yuq825 has joined #lima

01:39 <yuq825> hi guys, need your review for the kernel patch: https://patchwork.kernel.org/patch/11315037/

01:41 <anarsoul> yuq825: I've seen it but haven't gotten to it yet. Will do in a few days

01:41 <yuq825> OK, thanks

02:14 megi has quit [Ping timeout: 240 seconds]

02:58 dddddd has quit [Ping timeout: 258 seconds]

03:25 chewitt has joined #lima

03:36 hell__ has quit [Ping timeout: 250 seconds]

03:36 hellsenberg has joined #lima

05:19 Barada has joined #lima

05:25 Barada has quit [Quit: Barada]

05:35 Barada has joined #lima

05:37 <anarsoul> enunes: rellla: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3266

05:50 <anarsoul> yuq825: ^^ you should also take a look

05:50 <anarsoul> I'm not sure how we missed that

05:59 chewitt has quit [Quit: Zzz..]

06:13 <yuq825> oh, right, does the original way cause a crash or other problems?

06:14 <anarsoul> yes, dEQP-GLES2.functional.buffer.write.use.index_array.array fails

06:14 <anarsoul> if number of VS invocations is higher than info->count gl_pos overwrites varyings

06:16 <yuq825> I can't see how VS invocation will be bigger than info->count?

06:16 <anarsoul> because it's = (ctx->max_index - ctx->min_index + 1)

06:17 <anarsoul> that's what we pass to VS_CMD_DRAW()

06:21 <yuq825> ok, so it's caused by the command stream setting? As I see it, neither info->count nor max-min gives exactly the output space needed, but at least we should use the correct one

06:22 <anarsoul> it's supposed to be ctx->varyings_stride * num of VS invocations

06:27 <yuq825> then for index array [0, 1000, 2], we waste a lot of VS invocations and mem space for output

06:28 <anarsoul> yeah
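
Note: the sizing argument above can be checked with a little arithmetic. The sketch below is only a model of what the log describes, not the Mesa code: the names mirror info->count, ctx->min_index, ctx->max_index and ctx->varyings_stride from the discussion, and the 16-byte stride is a made-up example value.

```python
# Illustrative model of the sizing discussed above; not the actual Mesa code.
# The 16-byte stride is a hypothetical value chosen only for the example.

def vs_invocations(indices):
    """Vertices shaded for an indexed draw: the whole min..max index range."""
    return max(indices) - min(indices) + 1

def varyings_size(stride, invocations):
    """Bytes needed for VS output: one stride-sized slot per invocation."""
    return stride * invocations

indices = [0, 1000, 2]   # the single-triangle example from the log
stride = 16              # hypothetical varyings stride in bytes

count = len(indices)                   # what info->count would be here
invocations = vs_invocations(indices)  # what is passed to the VS draw command

undersized = varyings_size(stride, count)      # buffer sized by info->count
required = varyings_size(stride, invocations)  # buffer sized by invocations

print(count, invocations, undersized, required)
```

With these example numbers the buffer sized from info->count is far smaller than what the VS writes, which is the kind of overrun described above as gl_pos overwriting varyings.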

06:28 <anarsoul> I guess that's why blob optimizes it :)

06:29 <anarsoul> see VS CMD STREAM here: http://imkreisrum.de/deqp/index_array.array_mali/mali.dump.0000

06:29 <anarsoul> rellla: ^^ and that's probably the answer for your question why blob splits VS_CMD_DRAW into several

06:46 <yuq825> we're lucky if we can divide a list of indices, but what about a single triangle with [0, 1000, 2]? can blob still divide it or does it just do 1001 VS invocations?
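
Note: a rough way to see why splitting a draw could help, and why it cannot help for a single triangle. This is a hypothetical model that cuts the index list into fixed-size pieces; the log does not say how the blob actually chooses its split points, so the chunking rule here is an assumption.

```python
# Hypothetical cost model: each draw shades its whole min..max index range.
# The fixed-size chunking is an assumption, not the blob's known behaviour.

def vs_invocations(indices):
    """Vertices shaded for one indexed draw."""
    return max(indices) - min(indices) + 1

def split_cost(indices, indices_per_draw):
    """Total VS invocations if the index list is cut into fixed-size draws."""
    chunks = [indices[i:i + indices_per_draw]
              for i in range(0, len(indices), indices_per_draw)]
    return sum(vs_invocations(chunk) for chunk in chunks)

two_triangles = [0, 1, 2, 1000, 1001, 1002]
single_draw = vs_invocations(two_triangles)   # one draw over the whole list
per_triangle = split_cost(two_triangles, 3)   # one draw per triangle

one_triangle = [0, 1000, 2]
unsplittable = split_cost(one_triangle, 3)    # a lone triangle stays one draw

print(single_draw, per_triangle, unsplittable)
```

In this model two far-apart triangles get much cheaper when drawn separately, while the lone [0, 1000, 2] triangle still covers the full index range.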

06:51 yuq825 has quit [Ping timeout: 268 seconds]

06:54 yuq825 has joined #lima

07:02 Barada has quit [Quit: Barada]

07:02 <anarsoul> yuq825: I don't know the answer for that

07:02 <anarsoul> but looks like utgard architecture is not optimized for indexed draws

07:03 <anarsoul> and in worst case we have to shade every vertex

07:06 <yuq825> could you do a blob dump for this case to confirm, maybe there is some hidden command stream?

07:06 <anarsoul> also I'm not sure how common it is to have vertex buffer larger than index

07:06 <anarsoul> yuq825: I have pretty limited access to the hardware for next few weeks

07:06 <anarsoul> I can add it to my TODO though

07:07 <anarsoul> I doubt there's hidden command though

07:07 <yuq825> fine, I can give it a try later today

07:08 <yuq825> me too, just for confirmation

07:34 yuq825 has quit [Ping timeout: 265 seconds]

07:48 yuq825 has joined #lima

07:58 yuq825 has quit [Ping timeout: 240 seconds]

08:12 yuq825 has joined #lima

09:08 Barada has joined #lima

10:03 abordado has joined #lima

10:17 Barada has quit [Quit: Barada]

10:19 abordado has quit [Remote host closed the connection]

10:19 abordado has joined #lima

10:22 abordado has quit [Client Quit]

10:55 yuq825 has quit [Remote host closed the connection]

11:06 megi has joined #lima

11:33 hellsenberg is now known as hell__

12:27 abordado has joined #lima

12:54 drod has joined #lima

14:00 dddddd has joined #lima

14:01 robertfoss has quit [Ping timeout: 265 seconds]

14:06 robertfoss has joined #lima

15:02 buzzmarshall has joined #lima

17:06 buzzmarshall has quit [Remote host closed the connection]

17:28 chewitt has joined #lima

18:00 chewitt has quit [Quit: Zzz..]

18:59 <enunes> anarsoul: btw, in the last days I've been looking at the ppir regalloc to see if I can make it work more efficiently so it doesn't fail with the remaining glamor shaders from shader-db

18:59 <anarsoul> well

18:59 <enunes> I implemented a way to consider register pressure in spilling, that improved a few cases but still didn't solve regalloc for those

18:59 <anarsoul> enunes: we need to implement optimization passes for ppir

19:00 <anarsoul> e.g. we need copy propagation and dce

19:00 <anarsoul> copy propagation should improve reg pressure

19:01 <anarsoul> also we may want to reuse the LCRA algo that is used in panfrost for RA

19:02 <enunes> I'll look into these options, I mentioned mostly to ensure you're not working on something similar

19:02 <anarsoul> not atm

19:02 <anarsoul> I'm focusing on command stream

19:03 <anarsoul> speaking of which

19:03 <enunes> ok, sounds good

19:04 <anarsoul> I ran q3a with https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2283 and it didn't cause any GPU errors

19:05 <anarsoul> so please review it when you get some time

19:05 <anarsoul> cursor movements in X11 should be smooth with this MR :)

19:07 <anarsoul> I have a strong suspicion that, due to the underallocation fixed in 3266, the GPU overwrote part of the command and/or PLBU stream with varyings and/or gl_pos, and that caused random GPU errors

19:22 adjtm_ has joined #lima

19:24 adjtm has quit [Ping timeout: 240 seconds]

19:43 afaerber has quit [Ping timeout: 250 seconds]

19:48 afaerber has joined #lima

20:05 <rellla> anarsoul: can you confirm, that the second attributes desc address here http://imkreisrum.de/deqp/index_array.array_mali/mali.dump.0001

20:06 afaerber has quit [Quit: Leaving]

20:07 <rellla> sry forget it

20:09 <rellla> anarsoul: it's that one: http://imkreisrum.de/deqp/index_array.array_mali/mali.dump.0002

20:09 <rellla> second /* 0x10090468 (0x00000068) */0x10095fc0 0x20040000/* ATTRIBUTES_ADDRESS: address: 0x10095fc0, size: 2 */

20:10 <rellla> checking 0x10095fc0 gives me 0x3e8e155d, 0x3e9861f5, 0x3e34ebe8, 0x3ec89941, /* 0x00001BC0 */

20:10 <rellla> which looks strange to me. can you confirm that?

20:12 <rellla> /* 0x00001BA0 */ looks like the right address

20:18 afaerber has joined #lima

20:19 <rellla> this seems to be strange only in the draw_arrays dumps - each second one of http://imkreisrum.de/deqp/index_array.array_mali/

20:20 <rellla> for draw_elements it seems to be right. let me upload the new dumps with vary and attr desc parsed in a few minutes...

20:47 <rellla> online now-

20:48 <rellla> @all: if you want me to upload deqp mali dumps, let me know. i have all locally, sadly they are 84GB in size, so uploading seems a bit difficult :p

21:09 drod has quit [Excess Flood]

21:11 <anarsoul> ouch

21:12 <anarsoul> rellla: note command difference

21:12 <anarsoul> oh, nevermind

21:17 <anarsoul> rellla: yeah, it's weird.

21:22 <anarsoul> rellla: could be a bug in blob? :)

21:30 <anarsoul> rellla: so look at semaphores

21:30 <anarsoul> PLBU waits for VS to complete shading here: /* 0x10092490 (0x00000090) */0x00010001 0x60000000/* ARRAYS_SEMAPHORE_END */

21:30 <anarsoul> VS signals completion here: /* 0x10090448 (0x00000048) */0x00018000 0x50000000/* SEMAPHORE_END: index_draw enabled */

21:31 <anarsoul> so technically whatever "/* 0x10090478 (0x00000078) */0x1b000001 0x00000000/* DRAW: num: 27, index_draw: true */" does is not even rasterized?

21:31 <anarsoul> maybe kernel driver just swallows GPU error?

21:34 <rellla> so everything after /* 0x10090448 (0x00000048) */0x00018000 0x50000000/* SEMAPHORE_END: index_draw enabled */ is probably bogus and not needed?

21:35 <anarsoul> actually

21:35 <anarsoul> .frame.vs_commands_end = 0x10090450

21:36 <anarsoul> I'm not sure why you're parsing it further :)
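
Note: the point about .frame.vs_commands_end can be sketched as a bounds check. The addresses and command names below are the ones quoted in the log; the 8-byte command size (two 32-bit words, as each dump line shows) and the helper itself are assumptions for illustration, not code from mali-syscall-tracker.

```python
# Sketch of bounding a command-stream parse by its end address.
# CMD_SIZE and commands_in_stream are illustrative assumptions.

CMD_SIZE = 8  # assumption: each VS command is two 32-bit words

def commands_in_stream(commands, end_address):
    """Keep only commands that lie entirely before end_address."""
    return [(addr, name) for addr, name in commands
            if addr + CMD_SIZE <= end_address]

# Addresses and names as quoted from the dump in the log.
dump = [
    (0x10090448, "SEMAPHORE_END"),
    (0x10090468, "ATTRIBUTES_ADDRESS"),
    (0x10090478, "DRAW"),
]
vs_commands_end = 0x10090450

valid = commands_in_stream(dump, vs_commands_end)
for addr, name in valid:
    print(hex(addr), name)
```

Under that assumption SEMAPHORE_END at 0x10090448 is the last command inside the stream, and the ATTRIBUTES_ADDRESS and DRAW entries that looked strange lie past the end.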

21:36 <rellla> f*** it's me :p

21:37 <rellla> have to think about it again, but maybe my comment here https://gitlab.freedesktop.org/lima/mali-syscall-tracker/blob/master/main.c#L1251 shows result :p

21:41 <anarsoul> :)

21:41 <anarsoul> it's easier with PLBU where it has explicit end frame cmd

22:04 <anarsoul> rellla: can you review https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2283 ?

22:04 <anarsoul> it'd be nice to merge it before 20.0 is branched out

22:15 BenG83 has joined #lima

22:37 BenG83 has quit [Ping timeout: 258 seconds]

22:58 <rellla> anarsoul: ok.

22:58 <rellla> i think i fixed the cmds end issue: https://gitlab.freedesktop.org/rellla/mali-syscall-tracker/commits/parse_vary_attr

22:59 <anarsoul> great

22:59 <anarsoul> post an MR? :)

22:59 <rellla> i haven't tested the case, when the cmd stream is continued at another address. ideas should trigger that, but i haven't set this up

22:59 <rellla> done

22:59 <rellla> i will do some dumps tomorrow.

23:00 <rellla> MR see https://gitlab.freedesktop.org/lima/mali-syscall-tracker/merge_requests/3

23:31 <rellla> anybody here that has glmark2 working with blob (would save me from setting it up :p)

23:31 <rellla> ?

23:32 <anarsoul> nope