ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at and - Contact ARM for binary driver support!
calcprogrammer1 has joined #lima
mrueg has quit [Quit: - Chat comfortably. Anywhere.]
mrueg has joined #lima
jrmuizel has joined #lima
_whitelogger has joined #lima
jrmuizel has quit [Remote host closed the connection]
dddddd has quit [Remote host closed the connection]
chewitt has joined #lima
Barada has joined #lima
chewitt has quit [Quit: Zzz..]
chewitt has joined #lima
guillaume_g has joined #lima
chewitt has quit [Remote host closed the connection]
libv_ has joined #lima
libv has quit [Ping timeout: 244 seconds]
libv_ has quit [Ping timeout: 245 seconds]
libv has joined #lima
gaylima has joined #lima
gaylima has quit [Quit: AndroIRC - Android IRC Client ( )]
jorik_ has joined #lima
jonkerj has quit [Ping timeout: 246 seconds]
dddddd has joined #lima
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
Barada has quit [Quit: Barada]
jrmuizel has joined #lima
jorik_ has quit [Read error: Connection reset by peer]
jonkerj has joined #lima
guillaume_g has quit [Quit: Konversation terminated!]
<rellla> anarsoul, enunes: i'm digging through some piglit tests and noticed that glsl-fs-fragcoord-zw-ortho, for example, fails due to wrong depth. is a correct depth implementation missing, or am i missing sth?
<enunes> rellla: it also fails for me, I don't know if something is missing exactly. looks like initial depth support is from , other than that I'd check dumps with the blob
<anarsoul> rellla: maybe it misses some knob?
<anarsoul> rellla: IIRC polygon offset isn't implemented but I doubt it should affect depth value
<anarsoul> rellla: can you show test result?
griffinp- has joined #lima
griffinp has quit [Read error: Connection reset by peer]
<rellla> anarsoul: should be and the other fragcoord tests
<anarsoul> rellla: I don't see 1.0/w in disassembly
<rellla> wait
<rellla> right, see nr 2 or 3, the linked one seems to be the one where i disabled it temporarily
<anarsoul> what's nr 2 or 3?
<anarsoul> I wonder how it probes pixel depth
<anarsoul> glReadPixels() for depth component. Heh
<anarsoul> rellla: there's an optimization to avoid writing out depth buffer in lima (since we don't really need it)
<anarsoul> rellla: see last line of lima_pack_pp_frame_reg()
<rellla> ok, i'll try it. what does "since we don't really need it" mean?
<anarsoul> rellla: it'll be a waste of memory bandwidth if it's never read. Mali4x0 uses on-chip tile buffer for depth to do depth test
<rellla> so i just need to enable it to pass piglit test?
<anarsoul> yes
<anarsoul> we can add debug flag to force it for piglit
<rellla> is there another way to test depth within piglit except glReadPixels then?
<anarsoul> I guess no
<anarsoul> rellla: does it pass with this patch?
<rellla> i will try it
<rellla> pass 3|3
<rellla> :)
<anarsoul> then add a flag to force depth buffer writeout?
<anarsoul> something like LIMA_FORCE_DEPTH_WRITEOUT?
<anarsoul> enabling it by default isn't an option, it significantly affects performance even for glmark2
<rellla> would this be an option for mainline? then i would send a MR...
<anarsoul> I think so
<anarsoul> see how other flags are handled in lima_screen.c
<anarsoul> just add another flag for LIMA_DEBUG
<rellla> i will send a mr tomorrow
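
The flag idea discussed above can be sketched in standalone C. This is only an illustration under stated assumptions, not the code in lima_screen.c: every `sketch_`/`SKETCH_` name, the "forcedepth" option string, and the bit values are hypothetical. It shows a comma-separated debug string being turned into a bitmask, and depth writeout staying off unless the flag asks for it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; names and values are illustrative only. */
#define SKETCH_DEBUG_GP              (1u << 0)
#define SKETCH_DEBUG_PP              (1u << 1)
#define SKETCH_DEBUG_FORCE_DEPTH_OUT (1u << 2)

struct sketch_debug_option {
   const char *name;
   uint32_t    flag;
};

/* Hypothetical option table mapping names to flag bits. */
static const struct sketch_debug_option sketch_debug_options[] = {
   { "gp",         SKETCH_DEBUG_GP },
   { "pp",         SKETCH_DEBUG_PP },
   { "forcedepth", SKETCH_DEBUG_FORCE_DEPTH_OUT },
};

/* Parse a comma-separated list such as "gp,forcedepth" into a bitmask.
 * Unknown names are ignored; NULL or empty input yields 0. */
static uint32_t sketch_parse_debug_flags(const char *str)
{
   const size_t count =
      sizeof(sketch_debug_options) / sizeof(sketch_debug_options[0]);
   uint32_t flags = 0;

   if (!str)
      return 0;

   while (*str) {
      /* Length of the current token, up to the next comma or the end. */
      size_t len = strcspn(str, ",");

      for (size_t i = 0; i < count; i++) {
         const char *name = sketch_debug_options[i].name;
         if (strlen(name) == len && strncmp(str, name, len) == 0)
            flags |= sketch_debug_options[i].flag;
      }

      str += len;
      if (*str == ',')
         str++;
   }
   return flags;
}

/* Depth writeout stays disabled by default (it costs memory bandwidth)
 * and is enabled only when the debug flag forces it. */
static bool sketch_should_write_depth(uint32_t debug_flags)
{
   return (debug_flags & SKETCH_DEBUG_FORCE_DEPTH_OUT) != 0;
}
```

A caller would typically pass in the result of getenv("LIMA_DEBUG"); keeping the default at "no writeout" matches the concern that enabling it unconditionally hurts performance.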
<rellla> did i ask you already about ,
<anarsoul> sounds good
<rellla> my precision issues with the derivs?
<rellla> i implemented fddy and fddx and i think they are correct, i just can't pass the tests
<anarsoul> I saw it but I have no idea how it's supposed to work
<anarsoul> try the same shader with offline compiler
<anarsoul> and (if possible) try the same test with blob
<rellla> the offline compiler result is here and i can't find any difference in the binary.
<rellla> and blob? i have no blob here :p
<rellla> except the difference of two separate instructions of course
<rellla> i think i will also do a mr for it and see what others think about it
<rellla> in addition i added a PIPE_CAP in order to skip nir_lower_wpos_ytransform for fddy and gl_FragCoord, as i couldn't see the offline compiler doing it
<rellla> iirc it also doesn't make piglit tests fail...
<rellla> sry for the typos btw
<anarsoul> rellla: something else could be different
<rellla> i think i have to check first whether the blob passes the test and then check what the difference is
<rellla> ...running piglit with control-flow branch now ...
jrmuizel has quit [Remote host closed the connection]
<anarsoul> rellla: I wonder how many regressions there are :)
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
<anarsoul> enunes: rellla: I guess we need to introduce a control-flow-aware helper that adds movs properly
<anarsoul> we're adding movs in several places and currently it's a mess
<enunes> anarsoul: I'm a bit out of the loop, what is the awareness required in movs?
<anarsoul> enunes: we don't add deps for a node if they're in different blocks
<anarsoul> it's totally fine since they will be evaluated in correct order anyway
<enunes> not only movs then, but all nodes that we create while lowering?
<anarsoul> yeah
<anarsoul> but I believe they're mostly movs
<anarsoul> constants are fine, we can just clone them. They're free anyway (up to 2 vec4 consts for any instruction)
<enunes> so it is not enough if the branch/jump node has a dep to the next block?
<anarsoul> see lower_select()
<anarsoul> ppir_lower_select()
<anarsoul> ppir_node_foreach_pred {} will do nothing if select condition is in another block
<anarsoul> and then it'll hit assert
<anarsoul> enunes: it's not really about branches
<anarsoul> ssa can be defined in 1st block but used in 2nd
<anarsoul> ppir_node_add_dep() doesn't add any deps if pred and succ are from different blocks
<enunes> I wonder why, I see the condition in ppir_node_add_dep but why can't we just have deps between the different blocks?
<anarsoul> enunes: because block ordering guarantees that it will be evaluated before it's used
<anarsoul> also if you add inter-block deps the compiler doesn't respect block boundaries and branch targets become invalid
<anarsoul> especially if there are any nir registers involved
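
The cross-block behaviour described above can be illustrated with a small standalone C sketch. It is not the ppir code: the `sketch_` structures and function are hypothetical simplifications, assuming only what the discussion states, namely that a dependency edge is recorded when both nodes are in the same block and silently skipped otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_DEPS 8

/* Hypothetical, simplified stand-ins for a block and a node. */
struct sketch_block {
   int index;   /* position of the block in program order */
};

struct sketch_node {
   struct sketch_block *block;
   struct sketch_node  *preds[SKETCH_MAX_DEPS];
   size_t               num_preds;
};

/* Record that succ depends on pred. Returns true if an edge was added. */
static bool sketch_node_add_dep(struct sketch_node *succ,
                                struct sketch_node *pred)
{
   assert(succ && pred);

   /* Nodes in different blocks get no edge: block ordering already
    * guarantees the value is computed before it is used. */
   if (succ->block != pred->block)
      return false;

   /* Do not record the same predecessor twice. */
   for (size_t i = 0; i < succ->num_preds; i++) {
      if (succ->preds[i] == pred)
         return false;
   }

   assert(succ->num_preds < SKETCH_MAX_DEPS);
   succ->preds[succ->num_preds++] = pred;
   return true;
}
```

With this shape, a lowering pass that walks a node's predecessors finds nothing when the value it needs was defined in an earlier block, which is the situation that makes a loop like ppir_node_foreach_pred do no work and later trips an assert. A control-flow-aware helper would have to handle that case explicitly.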