ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at and - Contact ARM for binary driver support!
<anarsoul|c> Yeah, but let's leave it for later
jrmuizel has quit [Remote host closed the connection]
yuq825 has joined #lima
dddddd has quit [Remote host closed the connection]
forkbomb has quit [Remote host closed the connection]
forkbomb has joined #lima
Barada has joined #lima
<anarsoul> enunes: we can do something similar to what I did to constants, i.e. clone uniforms on every usage
<anarsoul> but we can leave it for later
<anarsoul> let's land control flow support first, I'd really like to avoid introducing any fancy lowerings since they likely will break control flow if it's not here yet
Elpaulo has quit [Read error: Connection reset by peer]
Elpaulo has joined #lima
Barada has quit [Quit: Barada]
Barada has joined #lima
sphalerite_ has joined #lima
sphalerite has quit [Quit: WeeChat 2.4]
sphalerite_ is now known as sphalerite
_whitelogger has joined #lima
<rellla> anarsoul: enunes: i finally got my H5 and H3 set up. so i have now setups for all different Mali4** running: A10 -Mali400, H3 -Mali400-MP2, H5 -Mali450 ...
<rellla> so if you like me to do tests on different platforms, please ping me
<rellla> ... now starting a piglit run on current master on all 3 ...
yuq825 has quit [Remote host closed the connection]
cwabbott has quit [Quit: cwabbott]
cwabbott has joined #lima
dddddd has joined #lima
<rellla> can i run piglit headless?
adjtm_ has quit [Ping timeout: 248 seconds]
<enunes> rellla: hmm yeah with gbm backend
<enunes> I run it on a board that doesn't have any display output connector
<rellla> enunes: right, now it runs. we have some permission issues with the default /dev/dri/* devices.
<enunes> rellla: hmm yeah I have a local modprobe.conf file with dependencies to load lima after sun4i-drm
<rellla> do i need to set some other permissions somewhere, as sudo runs fine, whereas running it as a normal user fails all tests?
<enunes> I run it as root on the test systems, but maybe you need to add your user to groups 'video' or 'render' or something like that, based on groups in /dev/dri
<rellla> card0 is root:video, renderD128 is root:render, user is member of video and render...
<rellla> strange ...
megi has joined #lima
<rellla> enunes: doing a re-login after adding the user to the groups does the trick :p
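[editor's note] The re-login is needed because a process inherits its supplementary groups from the login session, so a change to the group database is not visible to shells that are already running. A minimal sketch of how a test runner could check this for itself, using the POSIX calls getegid() and getgroups(); the helper names process_has_group and gid_in_list are made up for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: linear search for gid in a list of n group IDs. */
static int gid_in_list(gid_t gid, const gid_t *list, int n)
{
    assert(n == 0 || list != NULL);
    for (int i = 0; i < n; i++) {
        if (list[i] == gid)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: returns 1 if the calling process has gid as its
 * effective group or as a supplementary group, 0 if not, -1 on error. */
int process_has_group(gid_t gid)
{
    /* POSIX leaves it unspecified whether getgroups() reports the
     * effective group, so check that one separately. */
    if (gid == getegid())
        return 1;

    int n = getgroups(0, NULL); /* size 0: only return the count */
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;

    gid_t *list = malloc((size_t)n * sizeof(*list));
    if (list == NULL)
        return -1;

    n = getgroups(n, list);
    int found = (n < 0) ? -1 : gid_in_list(gid, list, n);
    free(list);
    return found;
}
```

The gid to test would come from stat() on the device node (for example the group that owns renderD128); that lookup is left out to keep the sketch short.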
ecloud has quit [Ping timeout: 245 seconds]
ecloud has joined #lima
yuq825 has joined #lima
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
yuq825 has quit [Remote host closed the connection]
ninolein has joined #lima
yuq825 has joined #lima
Barada has quit [Quit: Barada]
jrmuizel has joined #lima
jrmuizel has quit [Ping timeout: 272 seconds]
forkbomb has quit [Quit: In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.]
forkbomb has joined #lima
jrmuizel has joined #lima
jrmuizel has quit [Ping timeout: 244 seconds]
<rellla> heyo, i've done some new piglit runs on master and with anarsoul's cf branch, see here:
<rellla> i gave them a go on 400 and 450, only regression on 450 from dmesg-fail to fail, which shouldn't be related to cf
<rellla> s/only/only one/
adjtm has joined #lima
<rellla> seems there isn't much left to do on the ppir side except what was mentioned above. nir_ssa_undef_instr is also missing - i'll have a look at that
<enunes> rellla: dmesg-fail might happen without
<enunes> there might be unstable tests without it due to
<enunes> but the patchset needs another respin
<rellla> sure. i also encounter issues, where the default piglit accuracy of 0.01 makes some tests fail
<enunes> well at least they should be consistently failing or passing
<enunes> comparing 400 and 450 on that should be interesting result
<cwabbott> rellla: the accuracy thing is expected, it's because pp uses half-floats but desktop GL assumes that you have full 32-bit floats, exposing legacy GL support is kinda a hack in the first place
<cwabbott> *desktop GL support
<cwabbott> there's a way to only advertise support for mediump (half-precision) in GLES which the blob uses, but it's probably not hooked up in mesa
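[editor's note] To put a number on the precision point: IEEE half precision keeps 10 explicit mantissa bits, so a single rounding can be off by up to a relative 2^-11 (roughly 0.0005 for values near 1.0). A tolerance of 0.01 therefore corresponds to only about twenty worst-case roundings. The sketch below rounds a float to a half-width mantissa so the error can be inspected; it assumes normal, finite inputs and ignores the half-float exponent range, and it is not a model of what the PP hardware actually does. Both function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round a finite float to 10 explicit mantissa bits (the mantissa width of
 * IEEE half precision). Exponent range, denormals and overflow are ignored. */
float round_to_half_mantissa(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof(bits));
    assert(((bits >> 23) & 0xFFu) != 0xFFu); /* reject inf and NaN */
    bits += 0x1000u;  /* add half of the 13 dropped bits: round to nearest */
    bits &= ~0x1FFFu; /* clear the 13 low mantissa bits */
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* Absolute error introduced by one rounding of x. */
float half_rounding_error(float x)
{
    float d = x - round_to_half_mantissa(x);
    return d < 0.0f ? -d : d;
}
```

For an input x the returned error is at most |x| / 2048, which is the 2^-11 relative bound mentioned above.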
jrmuizel has joined #lima
<rellla> cwabbott, would it be an option, to temporarily hack the accuracy to 0.02 or even 0.03 in piglit to pass the tests?
<cwabbott> that's not something that could be upstreamed in piglit
gcl has joined #lima
<rellla> not for upstream, just local. just for now.
jrmuizel has quit [Remote host closed the connection]
<cwabbott> I suppose, although maybe the better thing long-term would be to advertise only mediump support, and then only look at gles2 results
<cwabbott> there are always going to be a bunch of known failures with desktop gl
<rellla> ok.
<rellla> cwabbott: where is the best place to put a nir pass, that is shared between panfrost and lime? compiler/nir ?
<rellla> s/lime/lima/
<cwabbott> no idea, although you shouldn't need that pass at all
<cwabbott> when going out of SSA, undef is just a register that is read and never written
jrmuizel has joined #lima
<cwabbott> again, just turn each undef into a register that is never written
<anarsoul> cwabbott: it's cheaper to turn it into const
<rellla> ok
<rellla> iirc, using the nir pass fixed it.
<cwabbott> anarsoul: I don't think so, that will take up an extra const
<anarsoul> consts can be inserted to any instruction and they require no regs
<cwabbott> but too many consts will split up the bundle
<cwabbott> I guess you'll have to mark the register as undef so that you won't make it interfere with anything
<anarsoul> cwabbott: that'll be more code for no benefit
<cwabbott> anarsoul: it is a slight benefit
jrmuizel has quit [Remote host closed the connection]
<cwabbott> if you do it after out-of-ssa, it should only matter for "bad" code like this test though
<cwabbott> but in i965 there was a significant benefit from making ssa_undef handling even more relaxed, so this "bad" code is unfortunately quite common
<anarsoul> cwabbott: fair enough
yuq825 has quit [Remote host closed the connection]
jrmuizel has joined #lima
jrmuizel has quit [Ping timeout: 272 seconds]
jrmuizel has joined #lima
jrmuizel has quit [Ping timeout: 268 seconds]
<anarsoul> enunes: I split pp cf commit into several smaller commits, so it's easier to review now
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
<anarsoul> rellla: we also need support for different sampler types
<rellla> anarsoul: yes, as i meant - except what you and enunes already discussed yesterday...
<anarsoul> oh, OK
<anarsoul> (no one is working on sampler types btw)
<anarsoul> it'd be really nice to get support for samplerCube
<anarsoul> but it requires some RE of command stream since we don't know descriptor format for cube textures
<anarsoul> (probably it's similar to 2D with few flags set here and there)
<rellla> i have done that quick hack to support undef with a lowering to const like others do it...
<anarsoul> rellla: I think cwabbott is right and we just need to make it a register that doesn't conflict with anything else
<rellla> needs refactoring to share the code, but i'm not sure anymore if i should think about the reg solution
<anarsoul> it shouldn't be difficult, just introduce new reg flag (undef) and if it's set never mark this reg as interfering with other regs in regalloc
<anarsoul> then just create a dummy op for undef with ppir_dest that contains a reg that has this flag set
<anarsoul> I mean ppir_node with ppir_op_dummy
<rellla> hm, ok. and lowering to that reg happens during ppir lowering?
<anarsoul> don't add this node to node_list
<anarsoul> rellla: no lowering is necessary
<anarsoul> well, you may want to introduce a lowering pass that removes dummy nodes from nodes list (but don't free it - we need its ppir_dest)
<anarsoul> rellla: and please work on top of my branch :)
<rellla> no, not lower. just create the reg node within ppir_emit_ssa_undef?
<rellla> i will look into it. and yes, i'll take your branch :)
<anarsoul> rellla: yes, ppir_node_create_reg() with op = ppir_op_dummy
<rellla> i think, i'm able to do this...
<anarsoul> rellla: then add lowering pass that calls list_del(&node->list); for nodes with op = ppir_op_dummy
<anarsoul> rellla: also add undef flag to ppir_reg and set it in emit_ssa_undef
<anarsoul> rellla: and then set interference = false if this flag is set in ppir_regalloc_prog_try()
<rellla> ok, thanks. good mini-howto :p
<anarsoul> as result regalloc will pick any reg for it
<anarsoul> and it won't increase reg pressure nor use const nodes
<anarsoul> and result will be undef
<anarsoul> (however it doesn't mean that it won't be equal to 5.0, so test may fail)
<anarsoul> :D
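[editor's note] The steps anarsoul lists can be sketched in isolation as follows. This is not the ppir code: every type and function here (sketch_node, sketch_reg, sketch_remove_dummy_nodes and so on) is a hypothetical stand-in, and the list is a plain sentinel-based doubly linked list rather than mesa's list helpers. The point it shows is that a dummy node is unlinked from the instruction list but not freed, so the register it owns, flagged undef, stays reachable for its users.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_op { SKETCH_OP_MOV, SKETCH_OP_DUMMY };

struct sketch_reg {
    int index;
    bool undef; /* set when the register only carries an undef value */
};

struct sketch_node {
    enum sketch_op op;
    struct sketch_reg *dest;
    struct sketch_node *prev, *next;
};

/* The list head is a sentinel node that points at itself when empty. */
void sketch_list_init(struct sketch_node *head)
{
    head->prev = head->next = head;
}

void sketch_list_add_tail(struct sketch_node *head, struct sketch_node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink a node; the node itself is NOT freed. */
static void sketch_list_del(struct sketch_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Lowering-style pass: drop dummy nodes from the list, keep them alive.
 * Returns the number of nodes that were unlinked. */
int sketch_remove_dummy_nodes(struct sketch_node *head)
{
    int removed = 0;
    struct sketch_node *n = head->next;
    while (n != head) {
        struct sketch_node *next = n->next; /* save before unlinking */
        if (n->op == SKETCH_OP_DUMMY) {
            assert(n->dest != NULL);
            sketch_list_del(n);
            removed++;
        }
        n = next;
    }
    return removed;
}
```

After the pass no instruction is emitted for the dummy node, while any node that reads the undef value can still reach the flagged register through the dummy's dest.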
<rellla> in practice, this shader shouldn't appear out there anyway, except we meet "bad" code!?
<anarsoul> rellla: yeah, undef shouldn't appear unless there's a bug in shader
<anarsoul> I can be wrong here though but I don't know any scenarios when it can appear otherwise
<rellla> what is this else for? isn't the whole if-then-else obsolete atm?
<rellla> sry, wrong branch.
<anarsoul> basically two regs interfere if their live ranges intersect
<anarsoul> IIRC I verified this code, feel free to double check it
<rellla> sry for the noise. my misreading.
<anarsoul> this 'else' hits if reg1->live_in == reg2->live_in
<rellla> got it
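[editor's note] A minimal sketch of the interference test as described in this exchange: two registers interfere when their live ranges intersect, the final else covers equal live_in values, and a register flagged undef never interferes. The field names live_in and live_out are taken from the conversation; the struct, the function name, and the exact comparisons are assumptions for illustration and are not copied from ppir_regalloc_prog_try().

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_live_reg {
    int live_in;  /* index of the instruction that first writes the reg */
    int live_out; /* index of the instruction that last reads the reg */
    bool undef;   /* hypothetical flag: value is undefined anyway */
};

bool sketch_regs_interfere(const struct sketch_live_reg *a,
                           const struct sketch_live_reg *b)
{
    assert(a->live_in <= a->live_out);
    assert(b->live_in <= b->live_out);

    /* An undef register may share a hardware register with anything. */
    if (a->undef || b->undef)
        return false;

    if (a->live_in < b->live_in) {
        /* a starts first: conflict only if a is still live after b starts */
        return a->live_out > b->live_in;
    } else if (a->live_in > b->live_in) {
        /* b starts first: the mirrored check */
        return b->live_out > a->live_in;
    } else {
        /* both start at the same instruction: always a conflict */
        return true;
    }
}
```

With this shape a register that dies at the same instruction where another one starts does not interfere with it, which is one possible convention; a stricter allocator could use >= instead.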
<anarsoul> enunes: I briefly looked through ideas-lamp-lit shader and looks like fusing condition into branch won't help it
<anarsoul> it uses something like "vec1 1 ssa_181 = feq ssa_178, ssa_180; if ssa_181 { ...}"
<anarsoul> well, maybe not the best example
<anarsoul> "vec1 1 ssa_46 = iand ssa_39, ssa_45; if ssa_46 { ... }"
<anarsoul> branch condition can be only a combination of less, equal, more
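[editor's note] One way to picture "a combination of less, equal, more" is a three-bit mask that selects which comparison outcomes take the branch. The encoding below is purely hypothetical (bit values, names, and operand types are invented for the sketch and say nothing about the real instruction format). It shows why a boolean produced by iand cannot be folded into the branch as a comparison: the branch would still need the value in a register and would test it against zero, here expressed as less-or-more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical condition bits; any subset may be combined. */
enum sketch_cond {
    SKETCH_COND_LT = 1 << 0,
    SKETCH_COND_EQ = 1 << 1,
    SKETCH_COND_GT = 1 << 2
};

/* Returns true if a branch with condition mask cond is taken for (a, b). */
bool sketch_branch_taken(unsigned cond, float a, float b)
{
    assert((cond & ~7u) == 0); /* only the three bits above are valid */
    return ((cond & SKETCH_COND_LT) && a < b) ||
           ((cond & SKETCH_COND_EQ) && a == b) ||
           ((cond & SKETCH_COND_GT) && a > b);
}
```

Under these assumptions "if ssa_46" would become a branch with mask LT|GT comparing ssa_46 against 0, whereas a condition such as feq could map directly to mask EQ on its two sources.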