<anarsoul>
cwabbott: looks like the standalone compiler produces a totally different result from the driver in some cases
<anarsoul>
e.g. I don't see the array reg (decl_reg vec4 32 r0[8]) in the driver when running glsl-fs-vec4-indexing-temp-src-in-loop
<anarsoul>
but it's here in standalone compiler
<anarsoul>
any ideas why?
<anarsoul>
i.e. the NIR representation of the GLSL shader is different
<anarsoul>
enunes: regressions that I saw yesterday came from your vectorization changes
<anarsoul>
e.g. glsl-fs-vec4-indexing-temp-src-in-loop now lowers fcsel ssa_32.xxxx, ssa_17, ssa_16 into 4 fcsels
<enunes>
yeah that's expected
<anarsoul>
yeah, but now it fails in regalloc
<enunes>
so the same problem we have to fix for ideas?
<anarsoul>
are you planning to address it? it should be easy to detect in the lower_to_scalar pass, just add an option to do that
<anarsoul>
unlikely
<anarsoul>
looks like there are no selects in this shader
<enunes>
I see it more as a feature and maybe it would fix the particular problem, but for some reason we are ending up with a regalloc problem that is very difficult to solve in general
<enunes>
I can work on it after the spilling improvements though
<enunes>
the vec fcsel with scalar condition
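A minimal sketch of the case being discussed, assuming a toy instruction model rather than Mesa's actual NIR API: a vec4 select whose condition is one scalar replicated by a swizzle (such as `ssa_32.xxxx`) gets split into four scalar selects, and an option on the scalarizing pass could leave it alone. The names `Src`, `Select`, `has_scalar_condition` and `keep_scalar_cond_selects` are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List

IDENTITY = "xyzw"  # swizzle used when a source has none


@dataclass
class Src:
    name: str          # SSA value name, e.g. "ssa_32"
    swizzle: str = ""  # e.g. "xxxx"; empty means identity


@dataclass
class Select:
    cond: Src
    a: Src
    b: Src
    num_components: int


def has_scalar_condition(sel: Select) -> bool:
    """True when every lane of the condition reads the same component."""
    sw = sel.cond.swizzle
    return len(sw) > 0 and len(set(sw)) == 1


def component(src: Src, i: int) -> Src:
    """Return the source restricted to lane i."""
    sw = src.swizzle or IDENTITY
    return Src(src.name, sw[i])


def lower_to_scalar(sel: Select,
                    keep_scalar_cond_selects: bool = False) -> List[Select]:
    """Split a vector select into per-lane selects.

    With keep_scalar_cond_selects set (a hypothetical option), a select
    whose condition is a replicated scalar is kept as one vector op.
    """
    if keep_scalar_cond_selects and has_scalar_condition(sel):
        return [sel]
    return [
        Select(component(sel.cond, i), component(sel.a, i),
               component(sel.b, i), 1)
        for i in range(sel.num_components)
    ]
```

With `Select(Src("ssa_32", "xxxx"), Src("ssa_17"), Src("ssa_16"), 4)` the default path yields four scalar selects that all read `ssa_32.x`, while the optional path returns the original instruction unchanged.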
<anarsoul>
maybe it's not relevant atm
<anarsoul>
I'm not sure how to fix spilling though
<enunes>
I don't think spilling is the problem
<anarsoul>
in ideas?
<enunes>
yes, with the latest changes we basically remove all instruction creation, and it still doesn't work
<enunes>
this is even without my spilling improvements, although it seems that spilling is much less required with this change
<anarsoul>
looks like it hurt some shaders
<anarsoul>
hm
<enunes>
pretty hard to have 0 hurts
<anarsoul>
true
<enunes>
I guess I can take my changes out of WIP now
<anarsoul>
btw, can you share your shader-db shaders somewhere? maybe start a git repo for them?
<enunes>
I can, but they would get outdated quickly, it's pretty easy to generate the list
<enunes>
basically run piglit once with MESA_SHADER_CAPTURE_PATH=$PWD/somewhere
<anarsoul>
how do you filter out those that fail?
<enunes>
they only fail in shader-db if it crashes; if it fails at runtime, shader-db doesn't know about it
<enunes>
I just move them from the captured shaders repo to a 'disabled' repo if they crash
<enunes>
and it's not a regression I introduced
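The capture-and-triage routine described above could be scripted roughly as follows. This is a sketch under assumptions: `capture_env` and `move_crashing` are hypothetical helper names, the check that decides whether a shader crashes is left to the caller, and only the `MESA_SHADER_CAPTURE_PATH` variable comes from the discussion.

```python
import os
import shutil
from pathlib import Path
from typing import Callable, Dict, List


def capture_env(capture_dir: Path) -> Dict[str, str]:
    """Copy the current environment and point Mesa's shader capture at capture_dir."""
    env = dict(os.environ)
    env["MESA_SHADER_CAPTURE_PATH"] = str(capture_dir)
    return env


def move_crashing(captured: Path, disabled: Path,
                  crashes: Callable[[Path], bool]) -> List[str]:
    """Move every captured shader for which crashes() is true into disabled.

    crashes() is supplied by the caller, e.g. a wrapper that runs the
    compiler on one file and reports whether it crashed.
    """
    disabled.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() snapshots the listing before any file is moved.
    for shader in sorted(captured.iterdir()):
        if shader.is_file() and crashes(shader):
            shutil.move(str(shader), str(disabled / shader.name))
            moved.append(shader.name)
    return moved
```

The environment from `capture_env` would be passed to whatever command runs piglit (for instance through `subprocess.run(..., env=...)`), and `move_crashing` would be rerun whenever the captured list is regenerated.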
<anarsoul>
enunes: I think we should also favor spilling SSAs over regs
<enunes>
yeah I thought about that but it would probably provide very marginal gain again
<anarsoul>
yet it should improve shaders with a lot of regs
<anarsoul>
it's beneficial to spill one vec4 ssa, it essentially turns it into a uniform
<enunes>
I can give it a try
<anarsoul>
which can be inserted anywhere
<enunes>
though at this point it's very empirical to come up with costs
<anarsoul>
yeah
<enunes>
say I multiply regs by 1.5 for example, and then it decides to spill an ssa with fewer components and more uses, then it hurts some of the cases
<anarsoul>
yeah
<anarsoul>
I think we should write it down somewhere
<anarsoul>
so it can be analyzed better
<anarsoul>
probably a comment in regalloc.c with an explanation of what affects spilling cost?
<enunes>
honestly so far it's all empirical, I try multiple values and pick whatever seems to be better
<enunes>
but once we start getting to 3+ variables, not even sure how to define them
<anarsoul>
I understand, but it would be nice to have something like:
<enunes>
maybe what we can have is a deciding benchmark, like the glamor shader
<anarsoul>
spilling cost: vec1: 4, vec2: 2, vec3: 3, vec4: 1. Modifiers: used in instructions with uniform slot taken: 1.1x, used in instructions with store slot taken: 1.1x, register: 1.1x
<enunes>
and optimize for that even if it hurts some unimportant piglit shaders
<anarsoul>
do we spill in glamor shaders?
<enunes>
right now the base cost is actually 1/(num_components), I tried to make that (5.0 - num_components) and that hurt more shaders than helped
<anarsoul>
it used to be 4.0f/num_components
<enunes>
oh yeah sorry 4.0
<enunes>
I thought we do? I still didn't include glamor in my list actually
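To make the cost discussion concrete, here is a small sketch of the kind of model being talked about. It is not the code in regalloc.c: the base cost 4.0/num_components is the formula quoted in the chat (which gives about 1.33 for a vec3), the 1.1x modifiers are the illustrative values from the proposed comment, and the function names, parameters, and the assumption that the lowest-cost node is spilled first are all hypothetical.

```python
from typing import List, Tuple


def spill_cost(num_components: int,
               is_reg: bool = False,
               uniform_slot_taken: bool = False,
               store_slot_taken: bool = False,
               reg_factor: float = 1.1,
               slot_factor: float = 1.1) -> float:
    """Base cost 4.0/num_components, scaled by illustrative modifiers."""
    if not 1 <= num_components <= 4:
        raise ValueError("num_components must be between 1 and 4")
    cost = 4.0 / num_components
    if is_reg:
        cost *= reg_factor        # favor spilling SSA values over regs
    if uniform_slot_taken:
        cost *= slot_factor       # spilled value would compete for the uniform slot
    if store_slot_taken:
        cost *= slot_factor       # spilled value would compete for the store slot
    return cost


def pick_spill_candidate(nodes: List[Tuple[str, int, bool]],
                         reg_factor: float = 1.1) -> str:
    """Return the name of the cheapest node; nodes are (name, components, is_reg).

    Assumes the allocator spills the lowest-cost node first.
    """
    return min(
        nodes,
        key=lambda n: spill_cost(n[1], is_reg=n[2], reg_factor=reg_factor),
    )[0]
```

With candidates `("reg_vec4", 4, True)` and `("ssa_vec3", 3, False)`, a register factor of 1.1 still picks the vec4 register (1.1 against about 1.33), while a factor of 1.5 flips the choice to the vec3 SSA value (1.5 against about 1.33). That is the kind of shift described above, where a heavier register multiplier ends up spilling an SSA value with fewer components.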