stikonas has quit [Remote host closed the connection]
warpme_ has quit [Quit: Connection closed for inactivity]
kaichi has joined #panfrost
nerdboy has quit [Ping timeout: 264 seconds]
Elpaulo has quit [Read error: Connection reset by peer]
Elpaulo has joined #panfrost
davidlt has joined #panfrost
<icecream95>
Maybe a mega-unrolled loop filling a whole 16x16 block per iteration wasn't such a good idea: 0.12 insn per cycle
<HdkR>
yea, unrolling isn't always a win
_whitelogger has joined #panfrost
<icecream95>
HdkR: I think it's more that the cache can't handle reading from 16 rows of the image at once
warpme_ has joined #panfrost
<icecream95>
HdkR: It runs 20% faster if I reduce the image width from 4096 to 4080
buzzmarshall has quit [Remote host closed the connection]
<HdkR>
Considering you can saturate the loadstore pipelines with like 2-4 paired loadstore ops, you don't need to unroll very far :P
Elpaulo has quit [Read error: Connection reset by peer]
Elpaulo has joined #panfrost
<chewitt>
@alyssa nice article on the collabora blog :)
<chewitt>
esp. the pretty pictures, which is more my level of comprehension
<daniels>
'ooh, cats!'
stikonas has joined #panfrost
<kaichi>
alyssa: further compliments to the article, quite extensive explanations there
<kaichi>
and big congrats on the google award
kaichi has quit [Quit: Leaving]
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
raster has joined #panfrost
NeuroScr has quit [Quit: NeuroScr]
Ntemis has joined #panfrost
<Ntemis>
@alyssa there is a grammar mistake in the blog "promise high-performance in theory, and in theory" should be "promise high-performance in practice, and in theory"
<Ntemis>
or better "in theory, and in practice"
<urjaman>
you're copying only a part of the compound sentence but like i parsed the thing in the blog just fine
<urjaman>
(promise high performance in theory) + (in theory (theory == practice))
<urjaman>
anyways what i'm trying to say is it's not an accident that it has "in theory, and in theory" as a part of the sentence, just that the "and in theory" refers to the following part
<urjaman>
Ntemis: ^^
<urjaman>
i did think it was funny tho :D
<Ntemis>
hum you are right
Ntemis has quit [Remote host closed the connection]
icecream95 has quit [Ping timeout: 250 seconds]
mixfix41 has quit [Quit: leaving]
raster has quit [Quit: Gettin' stinky!]
buzzmarshall has joined #panfrost
raster has joined #panfrost
<alyssa>
chewitt: thank you!
<alyssa>
karolherbst: thank you!
<karolherbst>
huh? what have I done :D
<urjaman>
have a nick starting with "ka" (and kaichi left)
<karolherbst>
ahhh
Green has joined #panfrost
nerdboy has joined #panfrost
<alyssa>
cwabbott: Any idea why SWZ.* opcodes exist?
<alyssa>
Functionally identical to SEL.* with the port reused, same scheduling, no more ports needed, ...
cwabbott has quit [Ping timeout: 272 seconds]
cwabbott has joined #panfrost
<alyssa>
What the frost?
<alyssa>
Looking at some OpenCL stuff to get a feel for int8, I see FMA extended opcode 206c
<alyssa>
which appears to do something like
<alyssa>
"Select 8-bit from src0, convert 32-bit src1/2/3 to 8-bit, and then do a select"
<HdkR>
huh
<alyssa>
2166 seems like that but first two 8-bit, second two 32->8
davidlt has quit [Ping timeout: 265 seconds]
<alyssa>
0168 seems to select .Y from 8-bit src0 and swizzle it down to .X
<alyssa>
actually since 4src 206c --> 2000
<HdkR>
It is not actually a bitfield extract? :P
<alyssa>
wait
<alyssa>
both are opcodes 2000
<alyssa>
HdkR: derp :)
<alyssa>
so it's just SEL.XXXX.v2i8
<alyssa>
er
<alyssa>
so it's just SEL.XXXX.v4i8
<alyssa>
(e02000 I mean)
<alyssa>
the next case implies we don't have SEL.XYXX since we need an explicit Y->X instruction
<alyssa>
e02400 is SEL.XZXX
<alyssa>
2600 is SEL.ZZXX
<alyssa>
2800 is SEL.XXZX
<alyssa>
3000 is SEL.XXXZ
<alyssa>
Okay okay so it's one big SEL op
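One way to read the five opcodes listed above is that a single bit per lane switches that lane from X to Z on top of a common 0x2000 base. The sketch below encodes that guess; the bit positions are inferred only from those five data points, and `decode_sel_swizzle` is a hypothetical name, not part of any existing tool.

```python
# Inferred from the five observed opcodes only; an assumption, not a documented encoding.
SEL_BASE = 0x2000
LANE_Z_BITS = (0x0200, 0x0400, 0x0800, 0x1000)  # lanes 0..3

def decode_sel_swizzle(opcode):
    """Map an observed SEL opcode to its per-lane swizzle string (X or Z per lane)."""
    low = opcode & 0xFFFF  # drop the leading 'e0' seen in e02000 / e02400
    if not low & SEL_BASE:
        raise ValueError(f"{opcode:#x} does not carry the assumed SEL base bit")
    return "".join("Z" if low & bit else "X" for bit in LANE_Z_BITS)

for op in (0xE02000, 0xE02400, 0x2600, 0x2800, 0x3000):
    print(f"{op:#08x} -> SEL.{decode_sel_swizzle(op)}")
```

If the guess holds, the remaining X/Z combinations would be covered by the same four bits, which fits the "one big SEL op" conclusion.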
Elpaulo has quit [Quit: Elpaulo]
adjtm_ has joined #panfrost
adjtm has quit [Ping timeout: 256 seconds]
adjtm_ has quit [Remote host closed the connection]
adjtm has joined #panfrost
cwabbott has quit [Ping timeout: 250 seconds]
<HdkR>
alyssa: So patches accepted for a Vulkan driver you say? :P
anarsoul|2 is now known as anarsoul
<Lyude>
HdkR: you mean for midgard?
<HdkR>
You're right Bifrost Vulkan would be more interesting :P
NeuroScr has joined #panfrost
<alyssa>
HdkR: if you write them I'll merge them ;P
<Lyude>
narmstrong: poke, you around? it's been a while but I was wondering if you could help me get a bootloader onto the odroid n2 to do efi booting, since I might actually be getting a little bit of time to work on panfrost every now and then (also need it so I can debug igt-gpu-tools builds for fedora on aarch64, helped me justify spending time to set this up)
<Lyude>
(if you're wondering, covid seems to be freeing some time up for me finally)
<Lyude>
alyssa: btw, is vulkan on midgard actually possible?
* Lyude
can't remember if she asked this already
<HdkR>
Later generation midgard can do Vulkan
<HdkR>
2nd gen+
<HdkR>
and it's only 1.0 there I believe. Not sure if it could do 1.1+
<alyssa>
Lyude: probably with hacks, at least on t760+ there are blobs
nlhowell has quit [Ping timeout: 256 seconds]
raster has quit [Read error: Connection reset by peer]