stikonas_ has quit [Remote host closed the connection]
NeuroScr has quit [Quit: NeuroScr]
stikonas has joined #panfrost
vstehle has quit [Ping timeout: 244 seconds]
stikonas has quit [Remote host closed the connection]
<urjaman>
blöp.
<alyssa>
Blooooop!
<alyssa>
I wonder how hard it would be to RA after scheduling
<alyssa>
It'd require a fair bit of rewrite but since I'm rewriting the RA anyway now's a good time
<alyssa>
The core logic is the same
<alyssa>
The benefit is access to bundle information, the drawback is having to deal with bundles :p
<alyssa>
Goal is being able to use pipeline registers
<alyssa>
Turns out scheduling doesn't need RA information at all... you want a scheduler minimizing register *pressure*, that has nothing to do with the RA itself
<alyssa>
So this shouldn't be too hard, actually
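A minimal sketch of the point above, that a scheduler can minimize register pressure without consulting RA. This is not Panfrost's scheduler: it is a toy greedy list scheduler over straight-line SSA, it ignores Midgard bundling entirely, and the `(dest, srcs)` instruction shape and the name `schedule_min_pressure` are assumptions made for illustration.

```python
def schedule_min_pressure(instrs):
    """Greedy list scheduling over SSA instructions given as (dest, srcs) pairs.

    Returns the chosen order as a list of indices into `instrs`.
    """
    defined_by = {dest: i for i, (dest, _) in enumerate(instrs)}
    # How many not-yet-scheduled instructions still read each value.
    remaining_uses = {}
    for _, srcs in instrs:
        for s in set(srcs):
            remaining_uses[s] = remaining_uses.get(s, 0) + 1

    computed = set()
    pending = set(range(len(instrs)))
    order = []

    def delta(i):
        # +1 for the new destination, -1 for each source whose last reader is i.
        _, srcs = instrs[i]
        return 1 - sum(1 for s in set(srcs) if remaining_uses[s] == 1)

    while pending:
        # An instruction is ready once every source defined in this block exists.
        ready = [i for i in pending
                 if all(s in computed or s not in defined_by for s in instrs[i][1])]
        if not ready:
            raise ValueError("dependency cycle: input is not valid SSA")
        # Pick the smallest pressure change; break ties by original position.
        best = min(ready, key=lambda i: (delta(i), i))
        dest, srcs = instrs[best]
        for s in set(srcs):
            remaining_uses[s] -= 1
        computed.add(dest)
        pending.remove(best)
        order.append(best)
    return order


# Hypothetical five-instruction block: d = f(a, b), e = g(c, d).
program = [("a", ()), ("b", ()), ("c", ()), ("d", ("a", "b")), ("e", ("c", "d"))]
order = schedule_min_pressure(program)
```

On this input the heuristic schedules `d` before `c`, so `a` and `b` die before `c` becomes live. Nothing in the function refers to physical registers, which is the property the message relies on.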
<HdkR>
It's true
* alyssa
got it
<alyssa>
The RA-after-sched part, I mean. Haven't started pipeline registers
<HdkR>
:D
jernej has joined #panfrost
<alyssa>
Shoot, wait, this'll break CF
<alyssa>
Slightly more effort needed
vstehle has joined #panfrost
<alyssa>
Ok, uh, actual IR improvements will be needed first.
stikonas has joined #panfrost
stikonas_ has joined #panfrost
stikonas has quit [Ping timeout: 264 seconds]
pH5 has joined #panfrost
stikonas_ has quit [Remote host closed the connection]
MoeIcenowy has quit [Quit: ZNC 1.6.5+deb1+deb9u1 - http://znc.in]
MoeIcenowy has joined #panfrost
_whitelogger has joined #panfrost
afaerber has joined #panfrost
anarsoul|c has quit [Quit: Connection closed for inactivity]
MoeIcenowy has quit [Quit: ZNC 1.6.5+deb1+deb9u1 - http://znc.in]
MoeIcenowy has joined #panfrost
afaerber has quit [Quit: Leaving]
unoccupied has joined #panfrost
<tomeu>
alyssa: finally got the wallpaper working, by adding the jobs at the end of panfrost_invalidate_frame
<tomeu>
I guess not all needed state was being correctly saved and restored around the blitter job
<tomeu>
now I need to remove the jobs if we get a clear
<tomeu>
but weston is looking really good :)
<bbrezillon>
tomeu: OOC, what were the problem you fixed?
<bbrezillon>
*problems
<tomeu>
bbrezillon: the only important bit is that the wallpaper jobs are added before any others, and that they are added in a moment where they aren't going to cause unwanted state changes elsewhere
<tomeu>
*bits :)
<bbrezillon>
so, order of creation matters
<bbrezillon>
and ordering things at link time is not enough
<bbrezillon>
and invalidate_frame is called when you swap EGL buffers, right?
<tomeu>
right after we have submitted a frame
<tomeu>
it prepares everything for the next one
<bbrezillon>
ok
<bbrezillon>
I guess you don't know yet what exact step we were messing up when adding the wallpaper job at draw time
<bbrezillon>
s/step/state/
<tomeu>
no, but I think we were saving only gallium state and we also had something panfrost-specific to take care of
chewitt has joined #panfrost
herbmilleriw has joined #panfrost
hlmjr has joined #panfrost
gcl_ has joined #panfrost
herbmilleriw has quit [*.net *.split]
mateo` has quit [*.net *.split]
gcl has quit [*.net *.split]
TheKit has quit [*.net *.split]
Lyude has quit [*.net *.split]
hanetzer has quit [*.net *.split]
urjaman has quit [*.net *.split]
bbrezillon has quit [*.net *.split]
urjaman has joined #panfrost
TheKit has joined #panfrost
mateo` has joined #panfrost
bbrezillon has joined #panfrost
Lyude has joined #panfrost
hlmjr is now known as herbmilleriw
<alyssa>
tomeu: Delicious!
<alyssa>
HdkR: I realized RA after scheduling doesn't work either, since then spilling doesn't work
<alyssa>
The trick (I slept on it, and yes, had bizarre dreams) is to unify the pre-schedule and post-schedule MIR, so that scheduling and RA are both "trial" operations
<alyssa>
So then in the good case we can schedule then RA, but if RA fails, we put RA in "spill ok" mode and run it again until it converges, then schedule again, then peephole create pipeline registers
<alyssa>
In any case, creating pipeline regs after scheduling is logically its own pass separate from RA. It has to be done before real RA if you want good results (hence why RA is after scheduling at all), but conceptually
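The "trial" flow described in these messages can be sketched as a small driver loop. This is only one reading of the description, not the compiler's actual code: the four pass callables, the `max_rounds` limit, and the toy IR (a dict carrying a single pressure number) are all hypothetical stand-ins. Because the pre-schedule and post-schedule IR are assumed to be unified, the same `try_ra` is called on both.

```python
def compile_trial(ir, schedule, try_ra, spill, make_pipeline_regs, max_rounds=8):
    """Schedule, try RA, fall back to spilling on failure, then reschedule."""
    steps = []
    bundles = schedule(ir)
    steps.append("schedule")
    steps.append("ra")
    if not try_ra(bundles):
        # "Spill ok" mode: spill and retry until RA converges.
        for _ in range(max_rounds):
            ir = spill(ir)
            steps.append("spill")
            steps.append("ra")
            if try_ra(ir):
                break
        else:
            raise RuntimeError("RA did not converge")
        # Spill code changed the program, so schedule again.
        bundles = schedule(ir)
        steps.append("schedule")
    # Pipeline registers are created by a separate peephole pass.
    bundles = make_pipeline_regs(bundles)
    steps.append("pipeline")
    return bundles, steps


# Toy stand-ins so the driver can be exercised; none of these model real passes.
def toy_schedule(ir):
    return dict(ir, scheduled=True)

def toy_try_ra(ir):
    return ir["pressure"] <= 4

def toy_spill(ir):
    return dict(ir, pressure=ir["pressure"] - 1)

def toy_pipeline(bundles):
    return dict(bundles, pipelined=True)

good, good_steps = compile_trial({"pressure": 3}, toy_schedule, toy_try_ra,
                                 toy_spill, toy_pipeline)
bad, bad_steps = compile_trial({"pressure": 6}, toy_schedule, toy_try_ra,
                               toy_spill, toy_pipeline)
```

In the good case the step list is just schedule, RA, pipeline peephole. In the spilling case the spill/RA pair repeats until it fits, followed by one more schedule.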
<cwabbott>
alyssa: actually, I think the trick is to treat each bundle as a single instruction in RA so that you'll never have to split up an instruction
<cwabbott>
doing that would probably be a Bad Idea (tm) anyways
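A minimal sketch of what treating each bundle as a single instruction could look like for liveness: live ranges are measured in bundle indices, so any spill code would go between bundles and never inside one. The data layout (a bundle as a list of `(dest, srcs)` pairs), the function names, and the straight-line, single-definition assumption are all hypothetical. Ranges that merely touch at the same bundle are treated as interfering, which is a conservative simplification.

```python
def bundle_live_ranges(bundles):
    """Map each value to (first_bundle, last_bundle) where it is defined or read."""
    ranges = {}
    for i, bundle in enumerate(bundles):
        for dest, srcs in bundle:
            for v in (dest, *srcs):
                lo, hi = ranges.get(v, (i, i))
                ranges[v] = (min(lo, i), max(hi, i))
    return ranges


def interferes(a, b):
    """Closed-interval overlap, measured in whole bundles."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical three-bundle program.
bundles = [
    [("a", ()), ("b", ())],   # bundle 0 defines a and b
    [("c", ("a", "b"))],      # bundle 1 reads a and b, defines c
    [("d", ("c",))],          # bundle 2 reads c, defines d
]
ranges = bundle_live_ranges(bundles)
```

Here `a` and `c` are reported as interfering even though `a` dies in the bundle that defines `c`. A finer model could distinguish reads from writes within a bundle, at the cost of looking inside it.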
<alyssa>
cwabbott: Hmm?
<cwabbott>
I was responding to "I realized RA after scheduling doesn't work either, since then spilling doesn't work"
<urjaman>
weird, i had already thought that that might have problems if you need to spill and insert code to deal with that
<cwabbott>
I don't see why it wouldn't work
<alyssa>
Since then it effects the schedule (bundling)?
<alyssa>
*affects
<alyssa>
Although, actually, I guess it doesn't
<alyssa>
Since spilling on mdg is just ld/st and maybe extra moves
<urjaman>
i mean yes it can work but like ... i'm surprised that i'm not 100% stupid :P
<alyssa>
But if you're already going to throw in the towel on performance (spilling), maybe who cares that you add a bunch of extra bundles that "could" have been scheduled away
<cwabbott>
actually you can do arbitrary swizzles/writemasks with load/store instructions, and you don't need a separate instruction to load an offset either
<cwabbott>
which means that you never create anything that needs to be re-bundled after spilling
<alyssa>
Not even a "move work register to offset register (r26/27)"
<alyssa>
?
<cwabbott>
no, i don't think so iirc
* alyssa
checks notes
<alyssa>
Although you're usually right ;P
<cwabbott>
well, it has been a while :p
<cwabbott>
I just remember thinking this was the way to go
<alyssa>
I do see the blob (in a shader that spills) having a single bundle with nothing but "fmov r26, [whatever]" in it
<alyssa>
Although -- ackles, here the blob has one such r26 move in it, but in the same bundle as some random ALU stuff, ...
anarsoul|c has joined #panfrost
<cwabbott>
alyssa: from some quick experiments with compute shaders, it seems the temp store instruction has some kind of offset embedded in the instruction
<Lyude>
is this all midgard btw?
<cwabbott>
yes, all midgard
<Lyude>
ahh, just curious :)
<cwabbott>
when I do something like vec4 temp; temp[2] = ... it doesn't put the 2 offset anywhere
<cwabbott>
although I have to use the old ShaderProgramDisassembler
<cwabbott>
which is probably horribly out of date by now
<cwabbott>
so it doesn't print enough for me to see the offset embedded in the instruction, but it seems it's there
<cwabbott>
with 60 bits per instruction, I'd hope there'd be one!
<cwabbott>
so anyways, the strategy of doing RA after scheduling should work