alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
<robmur01> Either way I suspect "designed to support OpenGL 4.2" is probably more like "designed to support DX11 and somebody got carried away cross-referencing standards while writing tables of texture formats" :P
<alyssa> robmur01: :P
* alyssa has been figuring out how to schedule Bifrost.
bnieuwenhuizen_ has joined #panfrost
bnieuwenhuizen_ is now known as bnieuwenhuizen
adjtm has quit [Remote host closed the connection]
adjtm has joined #panfrost
<alyssa> ..Okay, Bifrost has ceased to make any sense to my brain
<HdkR> Oh no
<alyssa> there is no dependency between these two clauses
<alyssa> but there is an implicit RAW dep
<alyssa> what gives
<alyssa> also, looks like bundles with STORE in it can't read anything from the reg file. so that's fun.
<alyssa> Perhaps there is a distinction between hi- and low-latency deps
<alyssa> or loads vs stores
<alyssa> maybe there is implicitly no concurrency across stores
<HdkR> high latency could potentially cause a full stall and is known to be done once the bundle completes?
<HdkR> So most memory loads?
<alyssa> HdkR: In that case there's no reason to track dependencies at all.
<alyssa> The arch explicitly allows you to do
<alyssa> { load } { load } { something with both loads }
<alyssa> with deps (), (), (0, 1)
<alyssa> and then the hw can execute the loads in either order (so "effectively" concurrent)
<HdkR> I feel like that is sane
<alyssa> I do too
<alyssa> But I'm trying to understand how a load dep is conceptually different from a store dep
<alyssa> HdkR: Ohh, derp, okay, yeah.
<alyssa> I've been thinking about clauses all wrong.
<HdkR> :D
<alyssa> AFAICT, clauses can't execute out-of-order or in parallel or anything
<alyssa> Rather, you can think of taking all the clauses in source order and just running through them linearly
<alyssa> So conceptually the hardware just runs that with completely 0 latency, just instruction after instruction in linear order.
<HdkR> That's my understanding as long as they don't do any voodoo hardware scheduling behind your back. Which would be invisible to the isa encoding anyway
<alyssa> Dependencies just act as barriers to ensure previous instructions terminated (so it's safe to read their values)
<alyssa> Right, the implication is that
<alyssa> { R0 = foo, R1 = bar, STORE R0 } { STORE R1 }
<alyssa> ^ those clauses don't need any dependencies, since R1 is guaranteed to be written even without a dep flag
<alyssa> the dep flag would force the first store to complete, which is not necessary. So you can still get the stores in parallel.
<alyssa> So this simplifies a lot, tbh.
<alyssa> I was nervous RA would have to deal with concurrency issues but it sounds like that's not the case at all, concurrency can *only* happen with the hi-latency ops, there's no clause-level concurrency of ALUs
<alyssa> to think I read "Dependency tracking" in Connor's notes like four times and still had that misconception, sigh
<HdkR> Right, so as long as that assignment of R1 wasn't a long latency op `{R1 = Load}, {Store R1}` sort of thing, then you theoretically don't need a dep between the clauses is what you're positing
<HdkR> Sounds correct
<alyssa> Right, yeah
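The model discussed above (clauses run in linear order, and a dependency is only needed to wait for an earlier high-latency op before reading its result) can be sketched roughly as follows. This is a minimal illustration under that assumption, not Panfrost's scheduler: the `Op` type, the `clause_deps` helper, and the register names are all hypothetical, and only read-after-write hazards on registers are tracked.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Op:
    """Hypothetical op: one destination register, some sources."""
    name: str
    dest: Optional[str] = None
    srcs: Tuple[str, ...] = ()
    high_latency: bool = False  # e.g. loads and stores, per the discussion


def clause_deps(clauses):
    """For each clause, return the earlier clause indices it must wait on.

    Assumption from the chat: ALU results are always visible to later
    clauses, so only registers written by high-latency ops need a dep.
    """
    pending = {}  # register -> index of clause with an in-flight write
    result = []
    for index, clause in enumerate(clauses):
        # Collect every earlier clause whose in-flight result we read.
        waits = {pending[src] for op in clause for src in op.srcs
                 if src in pending}
        # After waiting on a clause, its writes are no longer in flight.
        for reg in [r for r, c in pending.items() if c in waits]:
            del pending[reg]
        # Record this clause's own high-latency writes.
        for op in clause:
            if op.high_latency and op.dest is not None:
                pending[op.dest] = index
        result.append(tuple(sorted(waits)))
    return result


# { load } { load } { something with both loads }
loads = [
    [Op("load", "R0", (), True)],
    [Op("load", "R1", (), True)],
    [Op("add", "R2", ("R0", "R1"))],
]

# { R0 = foo, R1 = bar, STORE R0 } { STORE R1 }
stores = [
    [Op("foo", "R0"), Op("bar", "R1"), Op("store", None, ("R0",), True)],
    [Op("store", None, ("R1",), True)],
]

# { R1 = Load } { Store R1 }
load_store = [
    [Op("load", "R1", (), True)],
    [Op("store", None, ("R1",), True)],
]
```

With these inputs, the load example yields deps of (), (), (0, 1); the two-store example needs no deps at all; and the load-then-store example makes the second clause wait on the first.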
nerdboy has quit [Ping timeout: 260 seconds]
stikonas has quit [Remote host closed the connection]
<alyssa> Regardless, scheduler is definitely the one laying out clauses.
<alyssa> And in one stage, not a two-stage inter-clause/intra-clause schedule
<alyssa> Which can be.. accommodated, with some refactoring
<alyssa> And solves my branching woes nicely :)
<alyssa> So packing ends up being dumb =D
<HdkR> woo
<alyssa> Hopefully I can get branching going tomorrow then
vstehle has quit [Ping timeout: 246 seconds]
icecream95 has quit [Ping timeout: 246 seconds]
icecream95 has joined #panfrost
<HdkR> Lyude: btw, do you know if Dell's latest Thunderbolt dock has any glaring issues under Linux? https://www.dell.com/en-us/shop/dell-thunderbolt-dock-wd19tb/apd/210-arik/pc-accessories
<HdkR> (Also find it strange that they claim up to four displays can be run off the thing)
<Lyude> HdkR: it shouldn't, I'll fix any you find
<Lyude> (that's one of the ones I support for my job)
<HdkR> Awesome. I just glued it to my working desk. My Laptop will be arriving in a couple days to play with it
<HdkR> Well, double side sticky taped, but about the same when it's VHB
Cruft has joined #panfrost
NeuroScr has quit [Quit: NeuroScr]
buzzmarshall has quit [Remote host closed the connection]
davidlt has joined #panfrost
vstehle has joined #panfrost
NeuroScr has joined #panfrost
Cruft has quit [Quit: Leaving]
robmur01_ has joined #panfrost
vstehle has quit [*.net *.split]
Lyude has quit [*.net *.split]
robmur01 has quit [*.net *.split]
xdarklight has quit [*.net *.split]
cowsay has quit [*.net *.split]
tlwoerner has quit [*.net *.split]
nlhowell has quit [*.net *.split]
robmur01_ is now known as robmur01
nlhowell has joined #panfrost
vstehle has joined #panfrost
tlwoerner has joined #panfrost
xdarklight has joined #panfrost
cowsay has joined #panfrost
Lyude has joined #panfrost
icecrea105 has joined #panfrost
icecream95 has quit [Ping timeout: 246 seconds]
icecrea105 has quit [Quit: leaving]
icecream95 has joined #panfrost
icecrea105 has joined #panfrost
icecream95 has quit [Ping timeout: 260 seconds]
icecrea105 has quit [Quit: leaving]
stikonas has joined #panfrost
raster has joined #panfrost
Prf_Jakob has quit [Ping timeout: 246 seconds]
embed-3d has quit [Ping timeout: 246 seconds]
embed-3d has joined #panfrost
Prf_Jakob has joined #panfrost
sphalerite has quit [Quit: rebooooooooot]
sphalerite has joined #panfrost
sphalerite has quit [Client Quit]
sphalerite has joined #panfrost
indy has quit [Ping timeout: 260 seconds]
indy has joined #panfrost
indy has quit [Ping timeout: 264 seconds]
indy has joined #panfrost
buzzmarshall has joined #panfrost
cwabbott has quit [Ping timeout: 272 seconds]
cwabbott_ has joined #panfrost
cwabbott_ is now known as cwabbott
nerdboy has joined #panfrost
adjtm has quit [Read error: Connection reset by peer]
adjtm has joined #panfrost
nerdboy has quit [Ping timeout: 246 seconds]
nerdboy has joined #panfrost
anarsoul has quit [Remote host closed the connection]
<alyssa> # unknown pos 0xe
<alyssa> cwabbott: ^ :D
raster has quit [Quit: Gettin' stinky!]
<alyssa> 6 constants, 6 instructions in a clause
<Lyude> o:
anarsoul has joined #panfrost
<Lyude> that's a first
<alyssa> so (6, 4) in the table, I guess?
nerdboy has quit [Ping timeout: 260 seconds]
<alyssa> or (6, 5) maybe
<alyssa> uh wait
<alyssa> const0....const9
davidlt has quit [Ping timeout: 256 seconds]
mixfix41 has quit [Ping timeout: 256 seconds]
gcl_ has quit [Ping timeout: 260 seconds]
<alyssa> (6, 4) I guess
gcl has joined #panfrost
* alyssa has confused herself
<alyssa> It's (6, 5)
<alyssa> So 6 ins clause has 7 64-bit constants, 6 in use here
<alyssa> Not sure why it refuses to use the 6th constant
<alyssa> with 7 instructions, I mean
<alyssa> wait what
<alyssa> Now it's imagining instructions that don't exist
<alyssa> 6 constants, 7 instructions is okay
<alyssa> via 0xd I guess
<alyssa> But it refuses to do 7 constants, 7 instructions
<alyssa> (or 7 constants 8 instructions)
<alyssa> So the missing combinations are:
<alyssa> 6 instructions, 6 constants
<alyssa> 7 instructions, 7 constants
<alyssa> 8 instructions, 6 constants
<alyssa> 8 instructions, 7 constants
<alyssa> 8 instructions, 8 constants
<alyssa> Presumably `pos = 0xf` is one of those
<alyssa> Oh, but the first one is the 0xe I have above
<alyssa> So one of the latter 4
<alyssa> But it's refusing to do 8 instructions, 6 constants too. So not sure what 0xf is.
<alyssa> It's happy to do 8, 5
<alyssa> Assuming 0xf doesn't exist, this actually satisfies a really nice invariant
<alyssa> constant_count + instruction_count <= 13
<alyssa> No clue why 13 :p
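The observed invariant can be written down as a small check. This is only a sketch of what the chat reports from experiments with the blob, assuming `pos = 0xf` does not exist; the function name `clause_fits` and the constant `MAX_INSTRUCTIONS_PLUS_CONSTANTS` are made up for illustration.

```python
# Empirical limit observed above; not taken from any hardware documentation.
MAX_INSTRUCTIONS_PLUS_CONSTANTS = 13


def clause_fits(instruction_count, constant_count):
    """Return True if the counts satisfy the observed invariant
    constant_count + instruction_count <= 13."""
    if instruction_count < 0 or constant_count < 0:
        raise ValueError("counts must be non-negative")
    return (instruction_count + constant_count
            <= MAX_INSTRUCTIONS_PLUS_CONSTANTS)


# Combinations the blob accepted, as (instructions, constants).
accepted = [(6, 6), (6, 7), (7, 6), (8, 5)]
# Combinations the blob refused.
refused = [(7, 7), (8, 6), (8, 7), (8, 8)]
```

Every pair in `accepted` passes the check and every pair in `refused` fails it, which matches the combinations listed in the discussion.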
<Lyude> alyssa: you're not on the scheduler yet are you? (curious what's got you playing around with multiple instructions per clause)
<alyssa> Lyude: Not exactly, but I'm paying serious thought to figuring out how I'll want to write it.
<alyssa> So I don't paint myself into corners now
<alyssa> Why now? I'd like to implement branching, which depends on calculating the sizes of clauses before packing. So that's related to scheduler data structures.
<Lyude> ah, cool
icecream95 has joined #panfrost
<alyssa> cwabbott: Re the above invariant - the limit then works out to be "no more than 8 quadwords per clause"
<alyssa> Which is very reasonable.
<alyssa> Per the packing algo: (8, 5), (7, 6), (6, 7) -- the borderline cases with sum of 13 -- work out to exactly 8 quadwords
<alyssa> Any more and you can't pack, any less and there's no more quadwords anyway. So 8.
Marex has quit [Ping timeout: 260 seconds]
Marex has joined #panfrost
<alyssa> hm what happens with 4 instructions
<alyssa> oh, format #3.3 has an end clause, ok
* alyssa can't get the blob to fuse any conditions into branches
<alyssa> with float, I mean
<alyssa> And when it fuses with int the disasm doesn't agree.
<alyssa> Will stick to unfused for now.
<alyssa> Oh wait, it does agree, it's just an inverted branch
gcl_ has joined #panfrost
gcl has quit [Ping timeout: 272 seconds]
rando25902 has joined #panfrost
<robmur01> :q
* robmur01 struggles to keep track of which of the 3 keyboards here is which...