<thecycoone>
I'll try the kernel patch at lunch. Using distro-supplied kernels so far
<thecycoone>
(archlinuxarm's linux-aarch64)
tgall_foo has quit [Ping timeout: 265 seconds]
<robmur01>
alyssa: FWIW a "{key,value}[]" layout would probably be mildly more efficient than "{key[], value[]}", since the former should be able to compile to LDP/STP (LDRD/STRD) to minimise load/store overhead
tgall_foo has joined #panfrost
vstehle has quit [Ping timeout: 258 seconds]
vstehle has joined #panfrost
<alyssa>
robmur01: I thought it'd be the other way, since you can fit 2x the keys in a given cache line/etc, so for searching this should be better?
Xalius has joined #panfrost
<robmur01>
alyssa: I guess it depends mostly on your expected hit rate - if searching (and missing) is overwhelmingly more common than insertion/invalidation, then it may be more beneficial to optimise for that
<alyssa>
robmur01: Probably this is all micro-optimizing at this point ;)
<alyssa>
and all of the above is probably still cheaper than a full blown hash table
<robmur01>
but if you get the values for free when reading the keys, hit latency is zero (and the other operations can be effectively twice as efficient)
<alyssa>
(It's certainly easier to manage the memory footprint of)
<robmur01>
micro-optimising is the best kind of optimising :D
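For reference, the two layouts being compared can be sketched in C roughly as follows. This is a minimal illustration, not the actual Panfrost code: the 64-bit key and value types, the fixed `CAP`, the linear search, and all struct and function names are assumptions made up for the example. Whether the paired layout really compiles to LDP/STP depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP 64 /* hypothetical fixed capacity */

/* "{key,value}[]": each key sits next to its value, so a hit can
 * fetch both with one paired load (e.g. LDP on AArch64). */
struct pair { uint64_t key, value; };
struct table_aos { struct pair e[CAP]; size_t n; };

/* "{key[], value[]}": keys are packed densely, so a search touches
 * half as many cache lines; the value is read only on a hit. */
struct table_soa { uint64_t key[CAP]; uint64_t value[CAP]; size_t n; };

static inline void aos_insert(struct table_aos *t, uint64_t k, uint64_t v)
{
    assert(t->n < CAP);
    t->e[t->n].key = k;   /* adjacent stores: candidate for STP */
    t->e[t->n].value = v;
    t->n++;
}

static inline int aos_lookup(const struct table_aos *t, uint64_t k, uint64_t *out)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->e[i].key == k) {
            *out = t->e[i].value; /* same cache line as the key */
            return 1;
        }
    }
    return 0;
}

static inline void soa_insert(struct table_soa *t, uint64_t k, uint64_t v)
{
    assert(t->n < CAP);
    t->key[t->n] = k;   /* two separate arrays are written */
    t->value[t->n] = v;
    t->n++;
}

static inline int soa_lookup(const struct table_soa *t, uint64_t k, uint64_t *out)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->key[i] == k) {
            *out = t->value[i]; /* extra access, only paid on a hit */
            return 1;
        }
    }
    return 0;
}
```

Both variants return the same results; only the memory access pattern differs, which is why the better choice depends on the hit rate and would have to be measured on a real workload.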
gcl has joined #panfrost
yann has joined #panfrost
<robmur01>
maybe I'll give both versions a try and see if there's any observable difference - is there any particular workload that's good to exercise it?
* robmur01
isn't sure of the context of "Aquarium"
<alyssa>
bbrezillon: I recall you fixed a bug a while back (end of 2019) that manifested as vertices going to the origin instead of their current location (so random triangles stretching to the corner of the screen) in ExtremeTuxRacer and others
<alyssa>
Do you remember which patch fixed that / what the issue was?
<bbrezillon>
alyssa: not at all, actually I don't remember fixing something like that :)
<tomeu>
alyssa dreaming again of bug fixes...
<alyssa>
bbrezillon: it was definitely you! :D
<bbrezillon>
alyssa: then I fixed something without knowing :)
<robmur01>
yeah, I remember seeing that, and it clearing up at some point
<robmur01>
but not for any obvious reason
<alyssa>
robmur01: It was definitely bbrezillon :p
<robmur01>
IIRC I was momentarily tempted to bisect it out of pure curiosity
<robmur01>
but... real work :(
shadeslayer has joined #panfrost
<alyssa>
Alas.
<tomeu>
alyssa: has anything changed in how panwrap works? just preloading it doesn't generate any traces
<thecycoone>
bbrezillon: do you have that kernel patch as a raw patch instead of a pastebin?
<thecycoone>
(or a link to pull it out raw)
<thecycoone>
nm, got it
yann has quit [Ping timeout: 258 seconds]
davidlt has quit [Remote host closed the connection]
<thecycoone>
Compiling the kernel on kevin takes awhile eh....
<anarsoul>
thecycoone: try distcc?
<thecycoone>
But it's complicated :p I probably don't need nouveau and everything else the distro includes in their config either
<thecycoone>
Also, my desktop is a Clarkdale i3 from 2010. Not going to win any races there either. Need one of those 22 second threadrippers.
<alyssa>
HdkR: ^
<alyssa>
Note to self: the analogue to dirty tracking on Mali would be uploading the postfix descriptors at bind time rather than draw time.
pH5 has quit [Quit: -_-]
<HdkR>
Hm? Turns out that distcc from an ARM device to a threadripper is actually not great. Threadripper is just waiting for the ARM device to preprocess everything :P
<HdkR>
If you can get away with it, cross-compile instead
<HdkR>
or ccache also works great :)
<anarsoul>
HdkR: ccache + distcc works well for me
<HdkR>
I may just be spoiled by how fast 64 hardware threads can compile something
<HdkR>
Although cross-compiling the kernel is pretty straightforward compared to some projects