#videocore on 2021-04-30 — irc logs at freenode.irclog.whitequark.org

2021-02-26 10:57 nroberts changed the topic of #videocore to: Raspberry Pi Mesa drivers discussion - Logs http://freenode.irclog.whitequark.org/videocore

02:47 jcea has quit [Ping timeout: 276 seconds]

05:26 itoral has joined #videocore

08:29 apinheiro has joined #videocore

10:32 <itoral> apinheiro: it seems that v3d was getting loop unrolling from mesa_st, but for vulkan we need to add that if we want

10:32 <itoral> and I guess we do

10:33 <itoral> I'll check if that makes any difference

10:39 <apinheiro[m]> itoral: ok

11:12 <itoral> apinheiro: it seems to make a difference in some cases

11:12 <itoral> Shooter is very slightly improved

11:13 <itoral> the Bloom demo from sascha willems is up almost 20% :)

11:13 <itoral> from 41fps to 50fps

11:13 <itoral> well, that is more than 20% actually

11:14 <apinheiro[m]> oh nice (the bloom one)

11:16 jcea has joined #videocore

11:25 itoral has quit [Remote host closed the connection]

11:25 itoral has joined #videocore

11:35 <itoral> mmm... sometimes it causes perf to regress though, I guess I should try to figure out why

11:40 itoral has quit [Remote host closed the connection]

11:40 itoral has joined #videocore

11:42 itoral has quit [Remote host closed the connection]

11:42 itoral has joined #videocore

11:45 <itoral> well, unrolling can increase register pressure and we end up spilling, that's why

11:46 <itoral> I guess I could add another compiler strategy for this... or just try to play with the max unrolls or something like that

11:49 <itoral> he, with the offscreen demo from sascha:

11:49 <itoral> no unroll: 16fps

11:49 <itoral> max unroll 32 iterations: 9 fps (tons of spilling)

11:49 <itoral> max unroll 16 iterations: 33 fps (no spilling)

11:50 <itoral> so we get +50% or -50% performance depending on whether we unroll without causing spills

11:50 <itoral> apinheiro ^
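
[editor's note] The frame rates quoted above can be turned into relative changes with a small calculation. This is only an arithmetic sketch over the numbers itoral reported in the log; the helper name `percent_change` is made up for illustration. It suggests the "+50% or -50%" figure is a rough shorthand, since the gain without spilling is larger than the loss with spilling.

```python
def percent_change(before_fps, after_fps):
    """Relative change in frame rate, in percent of the starting value."""
    return (after_fps - before_fps) / before_fps * 100.0


# Frame rates as quoted in the log (no other measurements are assumed).
bloom = percent_change(41, 50)         # Bloom demo, unrolling enabled
offscreen_32 = percent_change(16, 9)   # offscreen demo, max unroll 32 (spilling)
offscreen_16 = percent_change(16, 33)  # offscreen demo, max unroll 16 (no spilling)

for label, value in [
    ("bloom", bloom),
    ("offscreen, max unroll 32", offscreen_32),
    ("offscreen, max unroll 16", offscreen_16),
]:
    print(f"{label}: {value:+.1f}%")
```
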

11:52 <apinheiro[m]> itoral: well, I guess that we could also start to add heuristics for the strategies

11:53 <apinheiro[m]> so far we assume that the first ones would have better performance

11:53 <apinheiro[m]> well, or something more simple

11:53 <apinheiro[m]> add more "discard reasons"

11:53 <itoral> that's what we are already doing

11:53 <apinheiro[m]> right now the discard-strategy reason is "no register allocation"

11:53 <itoral> we just need to decide if we want to add another one for this

11:54 <apinheiro[m]> we could try to add also "this strategy is spilling"

11:54 <apinheiro[m]> <itoral "we just need to decide if we wan"> well, yes, but what I mean is that if we add another strategy

11:54 <itoral> no

11:54 <apinheiro[m]> something like "no rolls"

11:54 <itoral> we avoid spills with strategies

11:54 <apinheiro[m]> perhaps we want to discard the "with rolls" strategies if we have spilling

11:54 <itoral> at least with most of them

11:55 <itoral> right, that was my point

11:55 <itoral> about adding a strategy

11:55 <itoral> we never allow spills with any strategy at the moment

11:55 <itoral> or at least that is what I recall

11:55 <itoral> but maybe it is not obvious

11:56 <itoral> because that is tied to the fact that we only allow spilling when we drop to 2 threads

11:56 <itoral> and all our strategies disallow that

11:57 <apinheiro[m]> oh, sorry then I misremembered

11:57 <apinheiro[m]> so then we only allow spilling on the last strategy?

11:57 <itoral> well, more precisely "all optimizations enabled" strategy won't spill

11:57 <itoral> in any case

11:58 <itoral> but the others do allow spilling

11:58 <itoral> because they are not supposed to increase register pressure significantly

11:59 <itoral> anyway, I have to do some testing and think what we want to do for loops

12:00 <itoral> the problem with adding more and more strategies is that we can end up with really long compilation times

12:00 <itoral> but I guess in this case it would only affect shaders with unrolled loops, which are not plentiful

12:00 <itoral> so maybe it is not that big of an issue

12:00 <itoral> if we do it right

12:03 <itoral> it seems that 16 iterations avoids all the problems while keeping the performance gains in all Sascha demos

12:04 <apinheiro[m]> <itoral "the problem with adding more and"> well, but just in the worst case, right?

12:04 <apinheiro[m]> in any case, it would be interesting to check

12:04 <itoral> so maybe we should go with that and add a fallback strategy just in case that only kicks in if the shader had any unrolled loops

12:04 <apinheiro[m]> perhaps doing a shader-db with some extra info

12:04 <apinheiro[m]> of how many shaders need to rely on the worst strategy

12:04 <itoral> yes, only in the worst case

12:04 <itoral> but those cases do exist

12:05 <itoral> it is not about how many

12:05 <itoral> if it is just one

12:05 <itoral> and it happens in a gameplay loop

12:05 <itoral> it is very noticeable

12:05 <itoral> you already know that :)

12:05 <itoral> I guess it is not something too critical, but it is not irrelevant either

12:08 <apinheiro[m]> yes, and although we could blame vulkan users if they don't use caches, for opengl users it is more tricky
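
[editor's note] The idea discussed above (try compile strategies in order, discard one that fails register allocation or spills when spilling is not allowed, and only try a "no unrolling" fallback if an earlier attempt actually unrolled loops) could be sketched roughly as follows. This is a toy model, not the v3d compiler: `Strategy`, `CompileResult`, `compile_with_fallback`, `fake_compile`, the strategy names, and the spill counts are all hypothetical. Only the 16-iteration unroll limit comes from the log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    max_unroll: int                 # 0 disables loop unrolling
    allow_spills: bool
    only_if_unrolled: bool = False  # skip unless an earlier attempt unrolled loops


@dataclass(frozen=True)
class CompileResult:
    allocated: bool      # did register allocation succeed?
    spills: int          # number of spills the attempt needed
    unrolled_loops: int  # loops unrolled by this attempt


# Hypothetical ordering: fastest expected code first, most permissive last.
STRATEGIES = [
    Strategy("default", max_unroll=16, allow_spills=False),
    Strategy("no unrolling", max_unroll=0, allow_spills=False, only_if_unrolled=True),
    Strategy("last resort", max_unroll=0, allow_spills=True),
]


def compile_with_fallback(compile_fn, strategies=STRATEGIES):
    """Return (strategy, result, attempted names) for the first usable attempt."""
    attempts = []
    saw_unrolling = False
    for strategy in strategies:
        # The extra fallback costs compile time, so it is only tried for
        # shaders where unrolling could have caused the failure.
        if strategy.only_if_unrolled and not saw_unrolling:
            continue
        result = compile_fn(strategy)
        attempts.append(strategy.name)
        saw_unrolling = saw_unrolling or result.unrolled_loops > 0
        if not result.allocated:
            continue  # discard reason: no register allocation
        if result.spills > 0 and not strategy.allow_spills:
            continue  # discard reason: this strategy is spilling
        return strategy, result, attempts
    raise RuntimeError("no strategy produced a usable shader")


def fake_compile(strategy):
    """Stand-in compiler: unrolling this pretend shader makes it spill."""
    if strategy.max_unroll > 0:
        return CompileResult(allocated=True, spills=12, unrolled_loops=1)
    return CompileResult(allocated=True, spills=0, unrolled_loops=0)


chosen, result, attempts = compile_with_fallback(fake_compile)
print(chosen.name, attempts)
```

In this model a shader with no loops never pays for the extra attempt, which matches the point in the log that the added compile time would only affect shaders with unrolled loops.
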

12:18 itoral has quit [Read error: Connection reset by peer]

12:19 itoral has joined #videocore

12:41 itoral has quit [Quit: Leaving]

13:58 luckyxxl has joined #videocore

16:27 luckyxxl has quit [Quit: bye]

18:04 apinheiro has quit [*.net *.split]

18:07 apinheiro has joined #videocore

18:07 nroberts has quit [Ping timeout: 240 seconds]

18:09 nroberts has joined #videocore

19:09 txenoo has quit [Quit: Leaving]

22:20 jcea has quit [Quit: jcea]

22:20 jcea1 has joined #videocore

22:22 jcea1 is now known as jcea

23:10 apinheiro has quit [Ping timeout: 252 seconds]