stikonas has quit [Remote host closed the connection]
rellla has quit [Ping timeout: 256 seconds]
warpme_ has quit [Quit: Connection closed for inactivity]
rellla has joined #panfrost
davidlt has joined #panfrost
davidlt_ has joined #panfrost
davidlt has quit [Ping timeout: 268 seconds]
apol has joined #panfrost
apol has quit [Client Quit]
davidlt_ is now known as davidlt
<HdkR>
The Khadas Vim3L has a Linux image for it, right?
<HdkR>
or is it VIM3L
vstehle has quit [Ping timeout: 256 seconds]
bbrezillon has quit [Ping timeout: 240 seconds]
bbrezillon has joined #panfrost
icecream95 has joined #panfrost
nerdboy has quit [Ping timeout: 256 seconds]
<icecream95>
alyssa: ushort is all I've actually seen in practice for indices, so I don't think that is much of a problem.
<icecream95>
alyssa: "from the asm I'm not convinced": With -O3, the asm was just a single instruction for each of: load, min, max, store, compare and jump
mixfix41 has left #panfrost [#panfrost]
<icecream95>
(with NEON, so 8 elements at a time)
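For reference, a minimal sketch of the kind of loop being discussed: a min/max scan over ushort indices. The function name and signature are hypothetical and are not the actual Panfrost code. The assumption is that a compiler at -O3 with NEON enabled may auto-vectorize a loop of this shape into vector load/min/max instructions, which would match the "8 elements at a time" remark (eight 16-bit lanes in a 128-bit register), but that depends on the compiler and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only, not the Panfrost function.
 * Scans 16-bit indices and reports the smallest and largest value seen.
 * For count == 0 it reports min = UINT16_MAX and max = 0. */
static void
index_min_max_u16(const uint16_t *indices, size_t count,
                  uint16_t *out_min, uint16_t *out_max)
{
    uint16_t lo = UINT16_MAX;
    uint16_t hi = 0;

    for (size_t i = 0; i < count; i++) {
        uint16_t v = indices[i];

        /* Branch-free selects map naturally onto vector min/max. */
        lo = v < lo ? v : lo;
        hi = v > hi ? v : hi;
    }

    *out_min = lo;
    *out_max = hi;
}
```

Keeping the body to a plain load, min and max leaves the vectorization decision to the compiler rather than tying the code to one instruction set.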
<HdkR>
Wouldn't you want to unroll at least two steps to ensure pipeline saturation?
<HdkR>
Especially on big Cortex where it has two 128bit neon pipelines
<icecream95>
At least on Cortex-A17, loop unrolling actually hurt performance...
<HdkR>
Real Cortex-A17 or the rebranded Cortex-A12 before it died? :P
<icecream95>
(or at least didn't improve much compared to alignment. I was testing hand-written asm and didn't insert alignment directives...)
<icecream95>
RK3288, so A12
<HdkR>
I've never actually tried finding the pipeline layout of that CPU, so sad that it lowered perf
<HdkR>
Cortex-A57 and higher should get a perf increase
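A rough sketch of the two-step unrolling suggested here, written as scalar C with two independent accumulator pairs so that consecutive min/max operations do not depend on each other. All names are hypothetical. Whether this helps is an assumption to be measured per core: as noted in the discussion, unrolling did not help on the RK3288, while wider cores may benefit.

```c
#include <stddef.h>
#include <stdint.h>

static inline uint16_t min_u16(uint16_t a, uint16_t b) { return a < b ? a : b; }
static inline uint16_t max_u16(uint16_t a, uint16_t b) { return a > b ? a : b; }

/* Hypothetical illustration of unrolling by two with separate accumulators.
 * For count == 0 it reports min = UINT16_MAX and max = 0. */
static void
index_min_max_u16_x2(const uint16_t *indices, size_t count,
                     uint16_t *out_min, uint16_t *out_max)
{
    uint16_t lo0 = UINT16_MAX, lo1 = UINT16_MAX;
    uint16_t hi0 = 0, hi1 = 0;
    size_t i = 0;

    /* Two independent dependency chains per iteration. */
    for (; i + 2 <= count; i += 2) {
        lo0 = min_u16(lo0, indices[i]);
        hi0 = max_u16(hi0, indices[i]);
        lo1 = min_u16(lo1, indices[i + 1]);
        hi1 = max_u16(hi1, indices[i + 1]);
    }

    /* Tail element when count is odd. */
    if (i < count) {
        lo0 = min_u16(lo0, indices[i]);
        hi0 = max_u16(hi0, indices[i]);
    }

    /* Merge the two chains at the end. */
    *out_min = min_u16(lo0, lo1);
    *out_max = max_u16(hi0, hi1);
}
```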
buzzmarshall has quit [Remote host closed the connection]
davidlt has quit [Ping timeout: 258 seconds]
davidlt has joined #panfrost
Elpaulo has joined #panfrost
vstehle has joined #panfrost
QwertyChouskie has joined #panfrost
QwertyChouskie has quit [Ping timeout: 256 seconds]
QwertyChouskie has joined #panfrost
pH5 has joined #panfrost
QwertyChouskie has quit [Ping timeout: 256 seconds]
mias has joined #panfrost
karolherbst has quit [Quit: duh 🐧]
tgall_foo has quit [Read error: Connection reset by peer]
<robmur01_>
HdkR: as the rebranding announcements alluded to at the time, the pipeline improvements from A17 (née A12 r1p0) were backported to A12 r0p1 (as found in RK3288), so they really are functionally equivalent :)
<robmur01_>
the TRM says it has a loop buffer, so loops small enough to fit in it should probably perform better than ones unrolled until they are too big to fit
robmur01_ is now known as robmur01
<HdkR>
robmur01: Ah, that's actually nice to know, I never checked which revision the RK3288 was running
<HdkR>
I completely forgot about that loop buffer :D
kaspter has quit [Quit: kaspter]
<alyssa>
anarsoul: The idea (unconfirmed) was that by doing the load only once, the data stays in a faster cache than if you run the entire upload_index_buffer first. It is probably still cached somewhere by then, but I'd assume it gets bumped down to a slower hierarchy level just given the size and time-distance
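A minimal sketch of the "load only once" idea: copy the indices and track min/max in the same pass, so each element is read from the source a single time. This is an illustration under stated assumptions and not the real upload_index_buffer. The function name, the signature and the use of a plain destination pointer are all hypothetical, and the cache benefit itself is unconfirmed, as stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fused copy + min/max, for illustration only.
 * dst and src must not overlap.
 * For count == 0 it reports min = UINT16_MAX and max = 0. */
static void
copy_indices_min_max_u16(uint16_t *restrict dst,
                         const uint16_t *restrict src, size_t count,
                         uint16_t *out_min, uint16_t *out_max)
{
    uint16_t lo = UINT16_MAX;
    uint16_t hi = 0;

    for (size_t i = 0; i < count; i++) {
        uint16_t v = src[i];   /* single load per element */

        dst[i] = v;            /* the "upload" part */
        lo = v < lo ? v : lo;  /* the min/max part */
        hi = v > hi ? v : hi;
    }

    *out_min = lo;
    *out_max = hi;
}
```

The alternative of copying first and scanning afterwards touches every element twice, which is where the assumed cache penalty would come from for large index buffers.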
<alyssa>
I'll be one of the keynote speakers at LibrePlanet this year, y'all should tune into the livestream :)
<alyssa>
[/plug]
<alyssa>
[shame]
raster has joined #panfrost
gcl_ has joined #panfrost
gcl has quit [Ping timeout: 268 seconds]
<alyssa>
tomeu: I'm at 42 commits right now, should I sync with upstream you think? :p
<tomeu>
alyssa: I think so, otherwise you run the risk of becoming upstream :p
<alyssa>
:P
<alyssa>
Right now the IR I've laid out supports everything big I can think of (most ALU ops, different typesizes, branching...) with the notable exception of texturing
<alyssa>
but texturing is very isolated and will be added shortly so I'm not worried there