alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
<alyssa> daniels: fwiw, from the profile I see the tile-aligned store as a bottleneck still.
<alyssa> It turns out memory is slow.
* alyssa will look into an impl of the above as well just to scope it out for forst
<alyssa> *frost
stikonas has quit [Remote host closed the connection]
icecream95 has joined #panfrost
<icecream95> alyssa: perf isn't accurate to single instructions, especially for vector instructions.
<icecream95> For me, perf shows the branch as taking 60% of the time for min/max calculation
<HdkR> xray is also good for a different style of profiling :D
Space_Man has quit [Remote host closed the connection]
<icecream95> When I optimised min/max a couple of months ago, I saw the function drop to half what it was in profiles, so definitely not load bottlenecked
<HdkR> There's a branch in the min/max calculation?
<icecream95> Yes, to get back to the top of the loop.
<HdkR> Ah, I thought you meant for the selection, derp
<alyssa> icecream95: Alright, that's definitely good to know, thank you.
<alyssa> "When I optimised min/max a couple of months ago, I saw the function drop to half what it was in profiles, so definitely not load bottlenecked"
<alyssa> I don't believe these are mutually exclusive.
<alyssa> Well.
<alyssa> The vector load vs a scalar load *might* make a difference.
<alyssa> On Midgard it certainly would (and we would still say you're "load bottlenecked" even if that's not strictly totally true). I'm not familiar enough with CPU architectures to know how that interacts with NEON/etc
<alyssa> anarsoul: see gitlab re subdata :|
<icecream95> I removed half the vector instructions, and it got almost twice as fast, so obviously the load isn't a massive bottleneck
<alyssa> Interesting, thank you.
<alyssa> I imagine the Gallium-surgery option discussed the other day might be a win anyway, depending how it's done.
<icecream95> That said, the speed-up was only about 80% of what it should have been, so maybe the loading is a slight bottleneck
<alyssa> Alright
vstehle has quit [Ping timeout: 265 seconds]
<icecream95> STK with gles3 (but not "Advanced pipeline") gives me this on t760: https://gitlab.freedesktop.org/snippets/876
<HdkR> It lives!
nerdboy has quit [Ping timeout: 268 seconds]
nerdboy has joined #panfrost
<icecream95> /usr/lib/xscreensaver/goop causes XWayland to spend a lot of time in convert_ubyte swizzling textures
<icecream95> This function shouldn't even be called at all as there is perfectly functional swizzle support in hardware...
nerdboy has quit [Ping timeout: 265 seconds]
nerdboy has joined #panfrost
davidlt_ has joined #panfrost
<icecream95> It looks like the swizzling is being done in a call to glReadPixels
<HdkR> Makes sense, glReadPixels is a rude operation
Depau has quit [Quit: ZNC 1.7.5 - https://znc.in]
Depau has joined #panfrost
Depau has quit [Client Quit]
Depau has joined #panfrost
Depau has quit [Client Quit]
Depau has joined #panfrost
megi has quit [Ping timeout: 268 seconds]
vstehle has joined #panfrost
rhyskidd has joined #panfrost
buzzmarshall has quit [Remote host closed the connection]
xdarklight_ has joined #panfrost
xdarklight has quit [Ping timeout: 246 seconds]
guillaume_g has joined #panfrost
pH5 has joined #panfrost
adjtm_ has quit [Ping timeout: 260 seconds]
mixfix41 has joined #panfrost
mixfix41 has left #panfrost [#panfrost]
davidlt_ is now known as davidlt
mixfix41 has joined #panfrost
<icecream95> /usr/lib/xscreensaver/blitspin causes Xwayland to crash with panfrost_create_blend_state: Assertion `!blend->logicop_enable' failed.
icecream95 has quit [Ping timeout: 240 seconds]
stikonas has joined #panfrost
stikonas has quit [Ping timeout: 246 seconds]
NeuroScr has quit [Quit: NeuroScr]
karolherbst has quit [Ping timeout: 272 seconds]
Space_Man has joined #panfrost
karolherbst has joined #panfrost
Space_Man has quit [Remote host closed the connection]
raster has joined #panfrost
megi has joined #panfrost
<alyssa> logic ops are indeed not implemented
robmur01_ is now known as robmur01
<robmur01> "perf isn't accurate" - isn't really true; it's as accurate as the hardware allows
<robmur01> however you gots to know a bit about the hardware to interpret annotate output properly
<robmur01> the CPU can only take an interrupt at an instruction boundary, so if an instruction is simply slow to execute, (e.g. IDIV), the interrupt will only be taken once it finishes, thus the PC at that point is indicating the *next* instruction
<robmur01> (and if an out-of-order CPU can retire subsequent independent instructions in the same cycle, it may be 'smeared' even further ahead)
<robmur01> similarly, if branches take time to resolve (due to misprediction etc.), it's typically the branch *target* that shows up as a hotspot
<robmur01> for loads/stores things may get more subtle depending on whether the CPU does things like fetch cache lines before or after actually committing the instruction to execution
<robmur01> but on the other hand if you're waiting on instruction fetch (I$ miss), then chances are the hotspot *does* show the exact instruction you're waiting for
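[editor's note] robmur01's point about slow instructions can be illustrated with a toy model. The sketch below is not how perf or any real CPU works internally; it only assumes that a sampling interrupt arriving while an instruction executes is taken at the next instruction boundary, so the sample is charged to the following instruction. The instruction names and cycle counts are made up for illustration.

```python
from collections import Counter

def sample_with_skid(program):
    """Toy profiler: one sample per cycle over a single loop iteration.

    `program` is a list of (name, cycles) pairs forming a loop body.
    Each sample is attributed to the instruction *after* the one that
    was executing, modelling an interrupt that is only taken once the
    current instruction has finished.
    """
    hits = Counter()
    for index, (_name, cycles) in enumerate(program):
        # The loop wraps around, so the last instruction skids to the first.
        reported = program[(index + 1) % len(program)][0]
        hits[reported] += cycles
    return hits

# Hypothetical loop body: the divide is the genuinely slow instruction.
program = [("ldr", 1), ("sdiv", 12), ("add", 1), ("b.ne", 1)]
hits = sample_with_skid(program)
for name, count in hits.most_common():
    print(name, count)
```

In this toy run 12 of the 15 samples land on `add`, even though `sdiv` is where the time is actually spent, which is the effect to keep in mind when reading annotate output.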
<raster> robmur01: time we go back to in-order and nuke all that pesky branch prediction mumbo... it messes with our perf traces.. :)
<raster> who needs clever & fast cpus...
jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]
jernej has joined #panfrost
<tomeu> robmur01: that's good to know, thanks
<tomeu> alyssa: I'm going to work next on a way to reliably share performance metrics, probably reusing tracie
<tomeu> because I find it hard to know how our changes affect performance
<robmur01> and of course on RK3399, unless you futz with taskset or CPU hotplug, instruction-level results will be some mish-mash of two completely different microarchitectures :)
<robmur01> in many cases A53 is actually 'faster' than A72 by cycle count, but the big cores win through parallelism and clocking higher
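[editor's note] A minimal sketch of the taskset approach robmur01 mentions, building a command line that keeps a profiled workload on one core type. The CPU numbering is an assumption (Cortex-A53 as CPUs 0-3 and Cortex-A72 as CPUs 4-5 is the usual RK3399 layout, but check /proc/cpuinfo on the actual board), and `./bench` is a hypothetical workload name.

```python
# Assumed RK3399 core numbering; verify on the target board.
LITTLE_CORES = {0, 1, 2, 3}  # Cortex-A53
BIG_CORES = {4, 5}           # Cortex-A72

def pinned_perf_command(cpus, workload):
    """Return an argv list that runs `perf record` on `workload`,
    restricted to `cpus` via `taskset -c`."""
    cpu_list = ",".join(str(cpu) for cpu in sorted(cpus))
    return ["taskset", "-c", cpu_list, "perf", "record", "--"] + list(workload)

# Example: profile only on the big cores.
command = pinned_perf_command(BIG_CORES, ["./bench"])
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run`. Profiling each cluster separately avoids mixing samples from two microarchitectures in one report.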
kaspter has joined #panfrost
<warpme_> dears: I just want to say a HUGE THX for the fantastic work you've done with mesa/panfrost! We (the mythtv team), with mesa g8f5a252d3, achieved hours of continuous smooth playback decoded with the amlogic stateful v4l2 hw decoder, cooperating with v4l2_m2m ffmpeg and rendering with zero-copy DRM-PRIME by myth to a mesa3d EGL texture. CPU load is just 3..5% for HD playback. Fantastic work I would say, keeping in mind the whole chain of components:
<warpme_> HW video decoder, kernel v4l2 vdec driver, ffmpeg v4l2_m2m API to v4l2 vdec driver, myth player, DRM-PRIME zero-copy, 3D mesa panfrost rendering to screen. Fantastic work!
<tomeu> nice :)
<warpme_> ....and my amlogic sm1 (bifrost G31) is just politely awaiting the same portion of love from you ;-p
Space_Man has joined #panfrost
yann has joined #panfrost
alpernebbi has joined #panfrost
kaspter has quit [Remote host closed the connection]
kaspter has joined #panfrost
yann has quit [Ping timeout: 272 seconds]
buzzmarshall has joined #panfrost
alpernebbi has quit [Quit: alpernebbi]
rcf has quit [Quit: WeeChat 2.7]
rcf has joined #panfrost
alpernebbi has joined #panfrost
tgall_foo has joined #panfrost
karolherbst has quit [Ping timeout: 268 seconds]
<alyssa> warpme_: Woot!
karolherbst has joined #panfrost
mixfix41 has quit [Remote host closed the connection]
yann has joined #panfrost
yann has quit [Ping timeout: 268 seconds]
TheKit has joined #panfrost
shadeslayer has quit [Quit: Ping timeout (120 seconds)]
anarsoul|2 has joined #panfrost
thefloweringash has quit [Ping timeout: 246 seconds]
flacks_ has quit [Ping timeout: 246 seconds]
EmilKarlson has quit [Ping timeout: 252 seconds]
anarsoul has quit [Ping timeout: 240 seconds]
nhp[m] has quit [Ping timeout: 246 seconds]
Space_Man has quit [Ping timeout: 265 seconds]
Ke has quit [Ping timeout: 260 seconds]
Space_Man has joined #panfrost
thefloweringash has joined #panfrost
pH5 has quit [Quit: bye]
alpernebbi has quit [Quit: alpernebbi]
nhp[m] has joined #panfrost
nlhowell has joined #panfrost
<nlhowell> is X11 acceleration something that exists somewhere? i don't see support in libdrm
<alyssa> nlhowell: Implicitly, yeah :)
<alyssa> Older GPUs needed to have special X11 drivers written for them in addition to OpenGL drivers, since they had special 2D support in addition to the main 3D support
<alyssa> Nowadays, especially on arm, there's not much useful 2D support, so you just do everything through OpenGL
<alyssa> For X11 that's done through `glamor`, which is already included in X
<alyssa> So by default X with Panfrost should be accelerated through glamor (which is going through OpenGL internally)
<nlhowell> ah! so glamor support is probably what I am missing
<nlhowell> for some reason, I thought it was deprecated a while back
<alyssa> X11 is deprecated, yes :)
<alyssa> ;)
<nlhowell> lol
<nlhowell> yes, i have a wayland compositor, but haven't found a terminal emulator to my liking
<nlhowell> on X I use rxvt-unicode
<nlhowell> though i am not enamoured of it, either
<nlhowell> no, it seems i have glamor support :/
<alyssa> alright, what's the issue then?
<alyssa> just slow? :|
<nlhowell> yes, and glxinfo shows a software renderer
<nlhowell> mesa-loader complains about vkms not being found in /usr/lib/dri/
* alyssa blinks
<nlhowell> and indeed, i have lots of dri .so's, but no vkms
<alyssa> what SoC?
<nlhowell> rk3288
<nlhowell> kernel log looks fine
<nlhowell> "[drm] Initialized panfrost 1.0.0 20180908 for ffa30000.gpu on minor 3"
pH5 has joined #panfrost
<nlhowell> is the last "panfrost" message i see
<nlhowell> it is a c201
<alyssa> Kernel sounds fine, then..
<alyssa> The real accomplishment was getting linux installed at all on a c201
* alyssa shivers
<nlhowell> lol
<nlhowell> yes, it has been a saga
<nlhowell> i first installed on it in 2015
<nlhowell> i am still running libreboot from back then :(
<nlhowell> it is only about two months ago i managed to get mainline running
<nlhowell> I was inspired to try again after seeing PrawnOS
<alyssa> Nice :)
<alyssa> IME, the most important thing required to install linux on a chromebook is having a Chromebook running Linux.
<nlhowell> lol
<alyssa> You laugh but I am completely serious :D
<nlhowell> actually, first try basically completely succeeded
<nlhowell> i followed instructions from the debian c201 install guide, and was pleasantly surprised (except for mainline kernel support)
<nlhowell> re: seriousness, oh, I agree! how many times have i accidentally wiped it, and found no rescue disks with cgpt
* alyssa has blocked today off to reinstalling one of her machines
<urjaman> I'm too tired to debug your C201, but just make sure you're actually using the mesa you built (assuming you built from git/master)
<alyssa> progress: very little.
<urjaman> (/usr/local/lib in ld.so.conf, ldconfig ...)
<nlhowell> urjaman: no worries, thanks for the advice :)
<alyssa> # !dd
<alyssa> ^ installing on hard mode.
<nlhowell> oy :(
<nlhowell> ok, verbose X log says "refusing to try glamor on llvmpipe"
<nlhowell> after that "glamor initialisation failed"
<urjaman> yeah ... as i said. get a mesa that does the panfrost thing and have X use it
<nlhowell> before this it tries and fails to load the "fbdev" and the "vkms" drivers
<nlhowell> urjaman: yes, the only copy of mesa i have has panfrost enabled
<urjaman> clearly doesn't as it picks llvmpipe
<nlhowell> but before that i have "glamor X acceleration enabled on panfrost"
<nlhowell> and then "glamor initialised"
<nlhowell> *then* the errors
<nlhowell> which is funny
* urjaman is confused
<nlhowell> that makes two of us :)
<nlhowell> but i have learned a bit more
<nlhowell> so i can continue debugging on my own for a bit
<nlhowell> urjaman and alyssa, thanks to both of you!
* alyssa is double confused
<urjaman> so i was only float confused?
<nlhowell> oh, interesting additional fact: i have a /usr/lib/dri/panfrost_dri.so
<nlhowell> i wonder if it is actually using it
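[editor's note] One way to check nlhowell's question is to look at the "OpenGL renderer string" line that glxinfo prints. The sketch below only parses text that has already been captured; the sample strings are illustrative rather than copied from this system, and the list of software renderer names is an assumption covering Mesa's llvmpipe and softpipe.

```python
import re

# Assumed names of Mesa software rasterizers as they appear in the renderer string.
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe")

def renderer_string(glxinfo_output):
    """Extract the renderer name from captured glxinfo output, or None."""
    match = re.search(r"^OpenGL renderer string:\s*(.+)$",
                      glxinfo_output, re.MULTILINE)
    return match.group(1).strip() if match else None

def is_software_rendered(glxinfo_output):
    """True if the renderer string names a known software rasterizer."""
    name = renderer_string(glxinfo_output)
    if name is None:
        raise ValueError("no 'OpenGL renderer string' line found")
    return any(sw in name.lower() for sw in SOFTWARE_RENDERERS)

# Illustrative samples, not real output from the machine discussed here.
sample_sw = "OpenGL vendor string: Mesa\nOpenGL renderer string: llvmpipe (LLVM 9.0, 128 bits)\n"
sample_hw = "OpenGL vendor string: Panfrost\nOpenGL renderer string: Mali T760 (Panfrost)\n"
print(is_software_rendered(sample_sw), is_software_rendered(sample_hw))
```

If the check reports a software renderer even though panfrost_dri.so is installed, the loader is most likely not picking that file up, which matches urjaman's advice to confirm which Mesa build X is actually using.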
MastaG has quit [Quit: Ping timeout (120 seconds)]
MastaG has joined #panfrost
stikonas has joined #panfrost
Ke has joined #panfrost
raster has quit [Quit: Gettin' stinky!]
<alyssa> okay I got a shell
guillaume_g has quit [Quit: Konversation terminated!]
nerdboy has quit [Ping timeout: 268 seconds]
davidlt_ has joined #panfrost
davidlt has quit [Ping timeout: 265 seconds]
<alyssa> Woohoo! I officially booted system 1!
<alyssa> ....now for system 2, the much harder step :V
davidlt_ has quit [Ping timeout: 272 seconds]
<robmur01> how long until System V? :P
thefloweringash has quit [Quit: killed]
nhp[m] has quit [Quit: killed]
Ke has quit [Quit: killed]
raster has joined #panfrost
<alyssa> :P
nerdboy has joined #panfrost
flacks_ has joined #panfrost
<alyssa> "The device you inserted does not contain ChromeOS"
<alyssa> Uh, yeah, thanks depthcharge, I didn't realize my Debian image didn't contain ChromeOS
<anarsoul|2> :)
anarsoul|2 is now known as anarsoul
<alyssa> It boots!
<alyssa> Only took *glances at watch* 6 hours
<HdkR> Boot to the head sort of boot?
* HdkR puts the joke in the boomer jar
jernej has quit [Remote host closed the connection]
mixfix41 has joined #panfrost
jernej has joined #panfrost
EmilKarlson has joined #panfrost
thefloweringash has joined #panfrost
nhp[m] has joined #panfrost
yann has joined #panfrost
yann has quit [Ping timeout: 240 seconds]
mifritscher has quit [Quit: Quit]
mifritscher has joined #panfrost
NeuroScr has joined #panfrost
pH5 has quit [Quit: -_-]
raster has quit [Quit: Gettin' stinky!]