alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
<anarsoul> you can lower them for now if you're not into REing :)
<alyssa> anarsoul: shrug
stikonas has quit [Remote host closed the connection]
vstehle has quit [Ping timeout: 250 seconds]
nerdboy has quit [Ping timeout: 258 seconds]
<tomeu> alyssa: any branches I should rebase?
<tomeu> s/rebase/rebase on/
davidlt has joined #panfrost
_whitelogger has joined #panfrost
warpme_ has quit [Quit: Connection closed for inactivity]
buzzmarshall has quit [Remote host closed the connection]
vstehle has joined #panfrost
nerdboy has joined #panfrost
nerdboy has quit [Ping timeout: 260 seconds]
raster has joined #panfrost
<icecream95> Of course ARM would invent a different way of tiling for AFBC, even though they already have a perfectly good method: https://gitlab.freedesktop.org/icecream95/afbc/-/blob/f7b060ea/tile.c#L31
stikonas has joined #panfrost
nhp_ has joined #panfrost
nhp has quit [Remote host closed the connection]
<narmstrong> icecream95: are you trying to generate AFBC display buffers ?
<narmstrong> oh you try to decode
<narmstrong> decode which afbc format ?
<icecream95> narmstrong: 'which afbc format?' You mean there is more than one format?
<narmstrong> icecream95: ah ah, yes
<narmstrong> AFBC_FORMAT_MOD_SPLIT, AFBC_FORMAT_MOD_SPARSE, AFBC_FORMAT_MOD_TILED are the main layout options
<icecream95> I've only been looking at single 16x16 superblocks at a time, so I haven't investigated how the superblocks are laid out yet.
icecream95 has quit [Ping timeout: 258 seconds]
warpme_ has joined #panfrost
robmur01_ has joined #panfrost
rhyskidd has quit [Remote host closed the connection]
rhyskidd has joined #panfrost
robmur01 has quit [Ping timeout: 258 seconds]
kaichi has quit [Remote host closed the connection]
buzzmarshall has joined #panfrost
<alyssa> You mean there is more than one format?
<alyssa> *sigh*
<alyssa> tomeu: I mean I've been working on purely compiler fixes but should be orthogonal to your stuff
<tomeu> alyssa: ok, I thought you would be hacking on the sampling operations in parallel, so we can test our stuff with kmscube or glmark2
<tomeu> give me an hour max to finish my emission code, then you can test it all yourself when you get to the compiler
<alyssa> tomeu: Sure :)
tomboy64 has quit [Ping timeout: 240 seconds]
<alyssa> OK, constant rule I see:
<alyssa> Suppose there is a single 60-bit constant (but not a second).
<alyssa> If the top 4-bits are < 8, copy them to the top 4-bits of the second 60-bit constant.
<alyssa> If they are not, the second 60-bit constant remains zero.
<alyssa> What happens with two such constants is TBD.
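[editor's sketch] The single-constant rule alyssa describes above could be written roughly as follows. This is a minimal illustration, not code from Panfrost or the blob: the helper name is hypothetical, and it assumes the 60-bit constant sits in the low bits of a uint64_t so that "top 4 bits" means bits 56..59, with "< 8" read as a comparison on that 4-bit field. The two-constant case is left out because the log says it is still TBD.

```c
#include <stdint.h>

/* Mask selecting the low 60 bits of a 64-bit word (assumed storage layout). */
#define CONST60_MASK ((UINT64_C(1) << 60) - 1)

/*
 * Hypothetical helper: given the only 60-bit constant in use, return the
 * value the otherwise unused second 60-bit constant would take under the
 * observed rule.
 */
static uint64_t second_const_for_single(uint64_t first)
{
    /* Top 4 bits of the 60-bit value, i.e. bits 56..59 (assumption). */
    uint64_t top4 = (first & CONST60_MASK) >> 56;

    if (top4 < 8)
        return top4 << 56; /* copy into the top 4 bits of the second constant */

    return 0; /* otherwise the second constant remains zero */
}
```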
<tomeu> alyssa: hmm, unsure of how we want to submit to the hw the texture descriptors in pre-allocated BOs, as the postfix has a pointer to an array of descriptors on bifrost
<alyssa> tomeu: Hmm?
<alyssa> Oh.
<tomeu> on midgard, we have an array of pointers, so we can put pointers to our descriptors in the trampoline array
<tomeu> but not on bifrost :/
<alyssa> Right, right, yeah
<alyssa> Not a problem!
<alyssa> Prepack it and put the packed one in CPU (cached) memory in the CSO
<alyssa> so the draw-time overhead is just a very tiny memcpy, with zero packing logic needed
<alyssa> (On midgard, we have essentially an 8-byte memcpy overhead for the pointers.)
<alyssa> If you want f a n c y, you can do tricks to track when textures get rebound and minimize even that (what bbrezillon had worked on a bit). But diminishing returns at some point
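[editor's sketch] The prepack-then-memcpy idea above might look roughly like this. It is a sketch under stated assumptions, not the actual Panfrost code: the struct, the function names, the 32-byte descriptor size and the field layout are all made up for illustration. The only point carried over from the log is that packing happens once when the CSO is created, and draw time just copies the packed bytes into the contiguous descriptor array that the Bifrost postfix points at.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical packed descriptor size; the real layout is not given in the log. */
#define TEX_DESC_SIZE 32

/* Hypothetical CSO: holds the descriptor already packed, in cached CPU memory. */
struct sampler_view_cso {
    uint8_t packed[TEX_DESC_SIZE];
};

/* CSO creation time: do all the packing work once (field layout is made up). */
static void pack_descriptor(struct sampler_view_cso *cso,
                            uint32_t width, uint32_t height, uint64_t gpu_addr)
{
    memset(cso->packed, 0, sizeof(cso->packed));
    memcpy(cso->packed + 0, &width, sizeof(width));
    memcpy(cso->packed + 4, &height, sizeof(height));
    memcpy(cso->packed + 8, &gpu_addr, sizeof(gpu_addr));
}

/* Draw time: no packing logic, only one small memcpy per bound texture. */
static void emit_textures(uint8_t *dst,
                          struct sampler_view_cso *const *views,
                          unsigned count)
{
    for (unsigned i = 0; i < count; ++i)
        memcpy(dst + (size_t)i * TEX_DESC_SIZE, views[i]->packed, TEX_DESC_SIZE);
}
```

Here dst stands in for the per-draw allocation in GPU-visible memory. On Midgard the per-texture copy would instead be an 8-byte pointer into the trampoline array, as noted above.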
<cwabbott> alyssa: that sounds like a different workaround for the upper 4 bits ordering thing
<tomeu> yeah, I for some reason remembered that the pointers to the texture descriptors are stable across draws, but just checked and they are allocating those alongside the other descriptors in the draw
<cwabbott> iirc the compiler that I was looking at just swapped them
<tomeu> alyssa: I think that, as long as the memcpy implementation is sane for uncached memory, the overhead shouldn't be that big
<alyssa> cwabbott: Indeed. But (from G52 blob and G31 hardware) -- if I don't do any workaround, it INSTR_INVALID_ENC's. If I swap according to either of your rules, it reads back random data.
<alyssa> So somehow G71 was *saner*...?
<alyssa> tomeu: :+1:
<cwabbott> wait, you're saying there's *different* constant-swapping madness going on?
<alyssa> Yes D:
<alyssa> isn't mali fun
<alyssa> how's adreno over there what with your sane h/w
<cwabbott> well, it's more just different
<cwabbott> the ISA is definitely better
<cwabbott> but they do everything via setting registers, which means you have to do a lot more yourself and there's less automagic optimizations
<alyssa> Fair
<alyssa> the reward for writing drivers for broken old hw is.... writing drivers for broken new hw!
<alyssa> cwabbott: Wait, it's even worse!
<alyssa> they're modifying even
<alyssa> wtf
<alyssa> I hope this is XOR and reversible at least
<alyssa> Er maybe artefact of my test case + const prop
<alyssa> Welp. With the 1-const-only workaround, -bbuffer works
<cwabbott> alyssa: I'd spend some more time hammering out how to handle two constants before moving on
<alyssa> OK.
<cwabbott> at least from my experience, it can really bite
<alyssa> I imagine so yes :(
<HdkR> madness lies within this hardware
cwabbott has quit [Ping timeout: 246 seconds]
cwabbott has joined #panfrost
tbueno has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
nerdboy has joined #panfrost
nerdboy has quit [Ping timeout: 240 seconds]
MastaG has quit [Ping timeout: 265 seconds]
MastaG has joined #panfrost
davidlt has quit [Ping timeout: 264 seconds]
raster has quit [Quit: Gettin' stinky!]
icecream95 has joined #panfrost
Ntemis has joined #panfrost
kaichi has joined #panfrost
raster has joined #panfrost
Ntemis has quit [Remote host closed the connection]
nerdboy has joined #panfrost
stikonas has quit [Remote host closed the connection]