<rellla>
for now i have no clue how to make the 2 tests pass.
<rellla>
would be cool if someone with a blob setup could check whether these 2 tests pass with the blob!?
Wizzup has quit [Ping timeout: 272 seconds]
<enunes>
rellla: that piglit summary doesn't have the patch that introduces the movs, right?
<enunes>
I don't see any movs inserted in glsl-derivs-abs-sign
guillaume_g has left #lima ["Konversation terminated!"]
<rellla>
glsl-derivs is with the movs, glsl-derivs_wo is without
<rellla>
there are 2 tests in the summary
<enunes>
ah ok
<rellla>
corresponding with HEAD and HEAD^1
Wizzup has joined #lima
<enunes>
rellla: can't spot an obvious bug. two things that catch my attention: why is it using .xy when we are only dealing with scalars (num_components should be 1 and write_mask 0x01?), and in the last dfdy the negative source switches from src 0 to src 1 as it is re-negated in the mov. does that work, or is it mandatory to have the negative in dfdy?
<enunes>
looking at glsl-derivs-abs-sign
<enunes>
mandatory I mean, first src negative and second one positive
<enunes>
if the second source is the one that must be negative then I think it is expected though
<rellla>
changing the neg modifier makes all tests fail.
<rellla>
regarding the write_mask, i did many tests and ended up with this version. we can't hardcode it to 1 and 0x01, can we?
<enunes>
rellla: we shouldn't have to, and I think it's unlikely to be the reason it's failing, but it got me curious as to why it ended up with 2 components or that write mask
Wizzup has quit [Ping timeout: 272 seconds]
<rellla>
i struggled with swizzles and write_mask and therefore added the commented ppir_debug's to see what's happening...
<enunes>
for example that means that dfdx is arriving there with alu->src[0].ssa->num_components == 2? I wonder if that's correct
<rellla>
wait a minute...
Wizzup has joined #lima
<rellla>
testing it with debug enabled, but yes, num_components should be 2 because of the vec2 varying load.v $1.xy 0.xy
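[note] The relation between num_components and write_mask discussed above can be sketched as follows. This is a minimal illustration that assumes the common one-bit-per-channel encoding (x is bit 0, w is bit 3); the helper names are hypothetical and are not part of Mesa or ppir.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write mask covering the first num_components
 * channels, assuming bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w. */
static uint8_t sketch_full_write_mask(int num_components)
{
    assert(num_components >= 1 && num_components <= 4);
    return (uint8_t)((1u << num_components) - 1u);
}

/* Hypothetical helper: spell a write mask as channel letters.
 * buf must hold at least 5 bytes (up to "xyzw" plus the terminator). */
static void sketch_mask_to_string(uint8_t mask, char buf[5])
{
    static const char names[4] = { 'x', 'y', 'z', 'w' };
    int n = 0;

    for (int i = 0; i < 4; i++) {
        if (mask & (1u << i))
            buf[n++] = names[i];
    }
    buf[n] = '\0';
}
```

Under that assumption a scalar gets num_components 1 and mask 0x01 (.x), while a vec2 value such as the varying load above gets num_components 2 and mask 0x03 (.xy).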
<cwabbott>
the optimal thing to do kinda depends, but if you want to do something simple, you can replicate the swizzle and writemask in the move, then set the derivative swizzle to the identity
<cwabbott>
and then the temporary register has the same number of components as the destination register
<rellla>
imho the implementation can be optimized, but the remaining difference in the values seems to me to be another issue...
<enunes>
it's not only for optimization, I would try to hardcode those to 1 to see if it makes any difference if it's a quick test
yuq825 has quit [Remote host closed the connection]
jbrown has quit [Remote host closed the connection]