ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!
megi has quit [Ping timeout: 268 seconds]
camus has joined #lima
kaspter has quit [Ping timeout: 268 seconds]
camus is now known as kaspter
<anarsoul> enunes: technically it's not a regression, it just exposes a bug in gpir
dddddd has quit [Remote host closed the connection]
Barada has joined #lima
Barada has quit [Quit: Barada]
Barada has joined #lima
Barada has quit [Quit: Barada]
kaspter has quit [Quit: kaspter]
kaspter has joined #lima
tlwoerner has quit [Ping timeout: 240 seconds]
tlwoerner has joined #lima
BenG83 has joined #lima
Barada has joined #lima
Barada has quit [Quit: Barada]
dddddd has joined #lima
Rondom has quit [Remote host closed the connection]
<rellla> enunes: it's dEQP-GLES2.functional.shaders.loops.while_constant_iterations.nested_sequence_vertex which triggers the following kernel error:
<rellla> Dec 30 12:31:53 opipc2 kernel: [953644.067375] [drm:lima_sched_timedout_job [lima]] *ERROR* lima job timeout
<rellla> Dec 30 12:31:53 opipc2 kernel: [953644.074304] lima 1e80000.gpu: gp task error int_state=0 status=8a
<rellla> all following tests fail...
<enunes> rellla: hmm ok, what I wonder next is which vs setting of that patch exposes the bug, if it's the const buffer size ones or loop unrolling
<enunes> and since it's likely not an introduced bug but probably exposing an existing bug, if we should revert that particular setting or not
<rellla> enunes: i can try to play with the values and see which one it is
<rellla> enunes: https://pastebin.com/raw/GuQhJvwb that one fixes at least this test again. better said, this is the reason, why the bug is exposed :p
<rellla> so if we are sure, that it's a VS bug, we should keep the patch and adapt the skips and fails lists. the better solution would be to fix the bug now :)
<enunes> hmm yeah thats bad, for that one I think it's even more likely it's an existing bug
<enunes> if you have time, could you try to rebase 00:51 [ _whitelogger ] [ cp- ] [ embed-3d ] [ lvrp16 ] [ Net147 ] [ sphalerite ]
<enunes> oops, sorry
<rellla> and use it together with that patch?
<enunes> yes, on master as-is
<rellla> i tried it some time ago but ran into different fails.
<rellla> i wonder if i should take cwabbott's https://gitlab.freedesktop.org/cwabbott0/mesa/tree/lima-gpir-branch-opt-v4 instead?
<enunes> if it's a newer iteration, makes sense
<rellla> i'm not sure how "stable" it is. i will give it a try in about 1h
<enunes> I'm mostly interested about that particular test that regressed with the new cap
<rellla> anarsoul: in case i didn't tell you already, here is the shader dump you asked me for: http://imkreisrum.de/deqp/index_array.array_lima/
<rellla> enunes: yeah, i will only test that one for now
armessia has joined #lima
<rellla> enunes: the test passes with lima-gpir-branch-opt-v4 on top of current master
megi has joined #lima
<rellla> i will run a full test suite now.
<enunes> rellla: well that is helpful, thanks a lot for testing it
<rellla> enunes: i think, we should keep the caps patch as is for now and i will sort out my skips and fails list
<enunes> so we have to decide, I suppose we can just skip that test for now, any opinion anarsoul ?
megi has quit [Ping timeout: 265 seconds]
<rellla> enunes: i'll give up on the full run with cwabbott's patches. i get too many crashes to sort out, so i think i'll wait until it's finished and go ahead with master
<enunes> fair enough
megi has joined #lima
<rellla> enunes: i noticed, that dEQP-GLES2.functional.shaders.loops.while_dynamic_iterations.nested_sequence_vertex was already on my skips list, because it also crashed the driver. without your patches :)
<rellla> so both *.while_[dynamic|constant]_iterations.nested_sequence_vertex tests crash the driver.
danqo has quit [Remote host closed the connection]
<MoeIcenowy> rellla: what kernel are you on?
<MoeIcenowy> we should prevent the driver from being crashed
<MoeIcenowy> it's part of userspace-kernelspace isolation
<anarsoul> enunes: yeah, I think we can skip it for now
<anarsoul> unless you want to dive into gpir compiler :)
<anarsoul> rellla: there's nothing suspicious in disassembly
<anarsoul> so it must be something with command stream
<rellla> MoeIcenowy: Linux opipc2 5.4.0-11681-g63de37476ebd #13 SMP PREEMPT Wed Dec 4 10:59:26 CET 2019 aarch64 GNU/Linux
<rellla> iirc i have all relevant lima commits included
<anarsoul> rellla: so color is taken from varying, sounds like we have some issue with varying setup
<anarsoul> maybe varying pointer requires some extra alignment?
<anarsoul> basically my commit shifted varyings by gl_Position size (that's 16 * number of vertices)
<anarsoul> it's still aligned to 64 bytes though
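The layout described above can be sketched roughly as follows. This is only an illustration, not the Mesa code: `align_up`, `GL_POSITION_SIZE`, `VARYING_ALIGN` and `varyings_offset` are hypothetical names, and the sketch assumes gl_Position is a vec4 of fp32 (16 bytes per vertex) stored first, with the remaining varyings starting at the next 64-byte boundary.

```python
GL_POSITION_SIZE = 16  # assumed bytes per vertex: vec4 of 32-bit floats
VARYING_ALIGN = 64     # alignment mentioned in the discussion


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def varyings_offset(num_vertices):
    """Hypothetical offset of the first varying after the gl_Position block."""
    return align_up(GL_POSITION_SIZE * num_vertices, VARYING_ALIGN)


if __name__ == "__main__":
    # 129 vertices: 16 * 129 = 2064 bytes, which is not a multiple of 64,
    # so the offset only stays aligned if it is rounded up.
    print(varyings_offset(129))
```

The point of the sketch is that a raw shift of 16 * number of vertices is not automatically a multiple of 64, so the alignment has to be applied after the shift.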
<rellla> anarsoul: at least the second value of the varying besides the address is different: 0x0000400f for mali, 0x00008002 in lima. so a few bits of our lima_update_varying differ in the end
<anarsoul> rellla: likely blob uses different varying format
<anarsoul> I assume it's fp16
<anarsoul> i.e. 0x8002 is vec4 fp32, 0x400f vec4 fp16
<rellla> ok
<anarsoul> value for varying is taken from attribute
<anarsoul> so it either reads attributes incorrectly
<anarsoul> or writes varyings incorrectly
<rellla> apart from the varying, doesn't the viewport smell suspicious, too?
<anarsoul> not really
<rellla> anarsoul: i did a new dump with current master http://imkreisrum.de/deqp/index_array.array_lima.new/ . i don't know, why some things are different now, the second value of the attributes' info for example...
<anarsoul> which frame should I look into?
<rellla> it's dEQP-GLES2.functional.buffer.write.use.index_array.array with current mesa master 824bd0830e8
<anarsoul> 0x10095440, 0x00006002, 0x10095a80, 0x00004001 in blob
<anarsoul> 0x00496040, 0x00004001, 0x00496830, 0x00006002 in lima
<anarsoul> likely it just reordered attributes
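A small sketch of the comparison behind the "reordered attributes" guess, using only the four word pairs quoted above. The second word of each pair is treated as an opaque descriptor; its bit layout is not decoded here, and the helper names are made up for illustration.

```python
from collections import Counter

# (address, descriptor) pairs quoted in the discussion
BLOB_ATTRS = [(0x10095440, 0x00006002), (0x10095A80, 0x00004001)]
LIMA_ATTRS = [(0x00496040, 0x00004001), (0x00496830, 0x00006002)]


def same_descriptors(a, b):
    """True if both lists hold the same descriptor words, in any order."""
    return Counter(d for _, d in a) == Counter(d for _, d in b)


def same_order(a, b):
    """True if the descriptor words also appear in the same order."""
    return [d for _, d in a] == [d for _, d in b]


if __name__ == "__main__":
    print(same_descriptors(BLOB_ATTRS, LIMA_ATTRS))
    print(same_order(BLOB_ATTRS, LIMA_ATTRS))
```

Under this reading the two dumps carry the same descriptors and differ only in order (and in buffer addresses), which is consistent with a plain reordering.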
warpme_ has quit [Quit: Connection closed for inactivity]
<rellla> i still don't get, why mali has 3 DRAWs in VS CMD, with num 24, 183 and 27 and one /* DRAW_ARRAYS: count: 129, start: 0, mode: 3 (0x3) */ in the plbu afterwards
<anarsoul> likely wallpapering?
<rellla> shouldn't plbu and vs draw nums correspond to each other?
<anarsoul> I'm more interested in why there's /* DRAW: num: 254, index_draw: true */ in http://imkreisrum.de/deqp/index_array.array_lima.new/lima.dump.0001
<anarsoul> rellla: not if you don't want to rasterize it?
<anarsoul> or for wallpapering we actually have just PLBU job since we don't need to shade the vertices
<rellla> the 254 is suspicious, isn't it?
<anarsoul> e.g. we produce gl_Position and varyings buffers in software, then just feed it to PLBU and PP
<anarsoul> rellla: yeah, I'm not sure where it's coming from
<rellla> it's (129 - 2) * 2 :)
<anarsoul> where it's coming from?
<rellla> let me check the parser though ...
<anarsoul> so it's "ctx->max_index - ctx->min_index + 1"
<rellla> yes, if my parser decodes it correctly.
armessia has quit [Quit: Leaving]
<anarsoul> what if you just use info->count as num?
<anarsoul> this part looks really fishy to me :)
<rellla> i think, parser is wrong, too :p let me check
<rellla> shouldn't matter for small values though
<anarsoul> nah, actually max_index - min_index + 1 is correct
<anarsoul> I'm not sure whether we should have the same in "PLBU_CMD_DRAW_ELEMENTS" though
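For reference, the "max_index - min_index + 1" count can be sketched like this. It is a minimal illustration, not the driver code: `vertex_range` is a hypothetical helper and the index list is made-up data chosen to show that the number of vertices to shade can differ from the number of indices in the draw.

```python
def vertex_range(indices):
    """Return (min_index, max_index, num) for an indexed draw.

    num is the count of vertices spanned by the indices, i.e.
    max_index - min_index + 1, which may exceed len(indices).
    """
    if not indices:
        raise ValueError("empty index buffer")
    lo, hi = min(indices), max(indices)
    return lo, hi, hi - lo + 1


if __name__ == "__main__":
    # Made-up index buffer: 4 indices that touch vertices 0..253
    indices = [0, 2, 1, 253]
    print(len(indices), vertex_range(indices))
```

With such data, using the index count as the vertex count would leave the vertices at the high end of the range unshaded, which is why the two values are not interchangeable in general.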
<rellla> \o/ passed let me check the images...
<anarsoul> ouch
<anarsoul> there's more frames now
<rellla> yes, but blue one. no lines anymore :)
<anarsoul> heh
<anarsoul> I wonder why it passes :\
<rellla> i think, that because of the way deqp creates and compares with the reference image. anyway, it's obviously wrong.
<anarsoul> it's yellow to white gradient, (255, 255, 127) to (255, 255, 255)
<anarsoul> you know what, it looks like some vertices are not shaded
<rellla> one more:
<rellla> the command line output. why do we have a NULL pointer in the first glDrawElements?
<anarsoul> other pointers are also NULL (just with some offset)
<anarsoul> maybe mesa is smart and just specifies offset?
<rellla> maybe
<anarsoul> 0x81 is 129
<anarsoul> rellla: "/* 0x00497578 (0x00000038) */0xfe000001 0x00000000/* DRAW: num: 254, index_draw: true */" doesn't look correct to me
<anarsoul> it's supposed to be 129
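A rough sketch of how the dumped word maps to the decoded comment. The bit positions are an assumption inferred only from the single dump line above (0xfe000001 decoding to num 254, index_draw true): the top byte of the first word is taken as the low 8 bits of the count and bit 0 as the index_draw flag. Any higher count bits are not handled, and the function names are hypothetical.

```python
def decode_vs_draw(word0):
    """Decode the fields visible in the first word of the dumped DRAW command.

    Assumption: bits 31..24 carry the low byte of the vertex count and
    bit 0 is the index_draw flag. Higher count bits are ignored here.
    """
    num_low = (word0 >> 24) & 0xFF
    index_draw = bool(word0 & 1)
    return num_low, index_draw


def encode_vs_draw_low(num, index_draw):
    """Inverse of decode_vs_draw for counts that fit in 8 bits (same assumption)."""
    if not 0 <= num <= 0xFF:
        raise ValueError("sketch only handles counts below 256")
    return (num << 24) | int(bool(index_draw))


if __name__ == "__main__":
    print(decode_vs_draw(0xFE000001))
    print(hex(encode_vs_draw_low(129, True)))
```

Under that assumption, a count of 129 (0x81) would show up as 0x81 in the top byte of the first word instead of 0xfe.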
<rellla> so lets make it 129
<anarsoul> like hardcode it?
<anarsoul> sure, why not :)
<rellla> :) doesn't work
<anarsoul> remove previous hack
<anarsoul> can you dump ctx->max_index and ctx->min_index?
<rellla> the lines that are drawn seem right, but there are too few of them
<rellla> this is with hardcoded 129
<anarsoul> can you change it to 253?
<anarsoul> also please dump ctx->max_index and ctx->min_index
<rellla> lines seem to be ok now, but color isn't
<anarsoul> picture is wrong, but it passes?
<anarsoul> wtf?
<anarsoul> can you dump varyings?
<anarsoul> IIRC there was some code to do that
<anarsoul> yeah, at least gl_pos is supposed to be here
<anarsoul> rellla: can you also dump info->index_bias?
<anarsoul> (that should be faster than dumping varyings)
<rellla> :)
<rellla> bias is 0 all the time (with 253 hardcoded)
<anarsoul> drop hardcoding 253 since it doesn't help
<rellla> persistent bug :) anyway, i have to stop for today, bedtime...
<rellla> if you have any good ideas, the rest of the day, let me know :)
<anarsoul> OK, good night :)
<anarsoul> rellla: I have strong suspicion that it has something to do with attributes
<anarsoul> 1) we have to have aligned addresses for them
<anarsoul> 2) double check all the calculations