alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
<alyssa> italove: looking good on the disasm, I think if you clean up the series (squashing so everything is bisectable, I mean -- no need to write expansive commit messages or anything) we're pretty close to landable :)
<alyssa> chewitt: snrkt
Danct12_ has quit [Quit: Quitting - Huong Tram IRC Client 1.54]
Danct12 has joined #panfrost
popolon has quit [Quit: WeeChat 3.0.1]
vstehle has quit [Ping timeout: 245 seconds]
stikonas has quit [Remote host closed the connection]
atler is now known as Guest91365
Guest91365 has quit [Killed (card.freenode.net (Nickname regained by services))]
atler has joined #panfrost
camus has joined #panfrost
kaspter has quit [Ping timeout: 256 seconds]
camus is now known as kaspter
<macc24_> alyssa: did bifrost optimizations make it into 21.0 release?
<alyssa> no
_whitelogger has joined #panfrost
archetech has quit [Quit: Konversation terminated!]
mixfix41 has quit [Ping timeout: 264 seconds]
davidlt has joined #panfrost
kaspter has quit [Ping timeout: 264 seconds]
kaspter has joined #panfrost
vstehle has joined #panfrost
cowsay has quit [Quit: No Ping reply in 180 seconds.]
cowsay has joined #panfrost
mani_s_ is now known as mani_s
guillaume_g has joined #panfrost
hexdump0815 has joined #panfrost
camus has joined #panfrost
kaspter has quit [Ping timeout: 265 seconds]
camus is now known as kaspter
MastaG has joined #panfrost
raster has joined #panfrost
<HdkR> Does Panfrost support threaded context yet? :)
catfella has quit [Remote host closed the connection]
stikonas has joined #panfrost
stikonas_ has joined #panfrost
camus has joined #panfrost
kaspter has quit [Ping timeout: 265 seconds]
camus is now known as kaspter
stikonas has quit [Ping timeout: 272 seconds]
camus has joined #panfrost
kaspter has quit [Ping timeout: 276 seconds]
camus is now known as kaspter
kherbst has quit [Ping timeout: 246 seconds]
stikonas_ is now known as stikonas
karolherbst has joined #panfrost
hexdump0815 has quit [Ping timeout: 240 seconds]
davidlt_ has joined #panfrost
davidlt has quit [Remote host closed the connection]
kaspter has quit [Ping timeout: 276 seconds]
camus has joined #panfrost
camus is now known as kaspter
<italove> alyssa: ok :)
Danct12 has quit [Quit: Quitting - Huong Tram IRC Client 1.54]
kaspter has quit [Ping timeout: 276 seconds]
kaspter has joined #panfrost
<alyssa> HdkR: not yet
<alyssa> though now that iris/fd do it shouldn't be so bad
<bbrezillon> stepri01: may I ask you a few questions about the NO_IMPLICIT/NO_FENCE case?
<stepri01> bbrezillon: sure
<bbrezillon> I guess the idea is to limit the number of fences to wait on to a single fence, so the scheduler gets to schedule our job as soon as this fence is signaled
<bbrezillon> is that correct?
<stepri01> there's no need to have to limit the number of fences as such. It's more about letting user space control the fences so you don't have unnecessary fences
<stepri01> Ultimately user space (usually) has a good idea what the actual dependencies are between jobs, so letting it encode that rather than trying to deduce it from the implicit fences can be beneficial
<stepri01> e.g. you don't need to have implicit fences on things user space knows are effectively immutable - so we can save time by not processing those fences
<stepri01> equally there are complex situations such as sub-buffer accesses which user space can optimise by fencing appropriately, whereas the kernel doesn't know how they might or might not conflict
<stepri01> the blob/kbase mostly use the explicit fencing approach for the above reasons, only using implicit fencing when necessary because it's an imported buffer
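[Editor's sketch] The difference stepri01 describes above can be illustrated with a toy model in C. This is only a sketch under stated assumptions: `toy_buffer` and both functions are hypothetical names made up for illustration, not Panfrost or kbase uAPI, and "processing a fence" is reduced to counting it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are Panfrost or kbase uAPI. */
struct toy_buffer {
    bool has_pending_fence; /* an earlier job still has a fence on this buffer */
    bool immutable;         /* user space knows nothing will write it again */
};

/* Implicit fencing: the kernel cannot tell which buffers really matter,
 * so it processes the fence of every buffer the job references. */
size_t implicit_fences_to_process(const struct toy_buffer *bufs, size_t n)
{
    assert(bufs != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (bufs[i].has_pending_fence)
            count++;
    return count;
}

/* Explicit fencing: user space passes only the dependencies it knows are
 * real, so fences on effectively immutable buffers are skipped. */
size_t explicit_fences_to_process(const struct toy_buffer *bufs, size_t n)
{
    assert(bufs != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (bufs[i].has_pending_fence && !bufs[i].immutable)
            count++;
    return count;
}
```

For a job that references one render target with a pending fence and one immutable texture that also carries a fence, the implicit path counts two fences and the explicit path counts one. Whether that saving is measurable in practice is not something this sketch can show.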
<bbrezillon> ok, the sub-buffer case I had it, but the front-buffer update one you mentioned in your reply I don't see what it is
<stepri01> so in the normal double (or more) buffering case it makes sense for the display driver to hold a (shared) lock on the buffer that's being scanned out. That allows you to schedule a buffer swap and immediately send the kernel GPU work which would render to what was (and for a while still will be) the front buffer
<stepri01> The GPU work will block until the display driver releases the lock when flipping to a back buffer, unblocking the GPU and allowing the rendering to happen straight away
<stepri01> Clearly this falls down if for whatever reason you then want to actually render to the displayed buffer. Either you need to have a way of reconfiguring the display driver not to hold the lock (i.e. fence) or you need to convince the GPU driver to ignore the fence
<stepri01> Usually you can get away with using a shared access on the GPU (even though you are actually writing), but I seem to remember there are corner cases even with that
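[Editor's sketch] The shared-lock behaviour stepri01 describes in the messages above can be modelled in a few lines of C. This is a toy under stated assumptions: `toy_resv`, `toy_access` and `toy_job_must_wait` are hypothetical names, not kernel or Panfrost interfaces, and the model assumes the display really does hold a shared reservation while scanning out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a buffer reservation; not a kernel API. */
enum toy_access { TOY_SHARED, TOY_EXCLUSIVE };

struct toy_resv {
    bool exclusive_pending;  /* a writer has not finished yet */
    unsigned shared_pending; /* shared holders, e.g. a display scanning out */
};

/* Returns true if a new job requesting `access` has to block.
 * Shared access waits only for an outstanding writer; exclusive
 * access also waits for every shared holder to release. */
bool toy_job_must_wait(const struct toy_resv *r, enum toy_access access)
{
    assert(r != NULL);
    if (r->exclusive_pending)
        return true;
    return access == TOY_EXCLUSIVE && r->shared_pending > 0;
}
```

With the display holding one shared reservation on the front buffer, an exclusive (write) job blocks until the flip releases it, while a job that declares shared access proceeds immediately. That is the "get away with using a shared access" trick for front-buffer rendering, at the cost of the job not being truthful about writing.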
<bbrezillon> ok, but how does NO_IMPLICIT simplify/optimize this case? I mean, I'd expect it to work similarly with the implicit fences: the GPU job will be blocked until the display controller signals the front-buffer fence
<bbrezillon> the only difference being the number of fences to test
<stepri01> I think there's two things. First there is overhead juggling the unnecessary fences in the kernel - whether that's measurable I don't know.
<stepri01> Secondly you need to be able to use shared fences (a problem with the current Panfrost kernel) and you need to ensure that any other drivers you are working with also support shared fences
davidlt_ is now known as davidlt
<bbrezillon> stepri01: ok, after looking at the KMS API more closely I get why GPU drivers take a sync_file FD and not a syncobj: that's what atomic plane updates return (passing a syncobj would require importing the sync_file first)
Elpaulo has quit [Quit: Elpaulo]
<bbrezillon> and the NO_IMPLICIT mechanism makes more sense now. Thanks for the detailed explanation
<stepri01> no problem :)
<raster> oh noes! i got a guru meditation
<alyssa> ?
<raster> that seems to be a result of...
<raster> [ 7126.575518] Internal error: Oops: 96000006 [#1] SMP
<raster> :)
<alyssa> "you wedged the GPU and the kernel is too broken to fix it"
<raster> and i had such good uptime.. :| about 2h! :)
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
<daniels> stepri01: KMS synchronises against exclusive fences before making the framebuffer current, but that's the last involvement it has; as soon as it's synchronised against all fences placed before the commit was made, it doesn't do anything else related to fencing, including holding a shared reservation
<stepri01> daniels: To be fair I'm more familiar with how Android (used to) do these things, I'm not so familiar with KMS. There are also cases like a video encoder reading the buffer for use cases like casting to a remote screen.
<daniels> yep, the video encoder will take a shared fence, but I'd argue that the number of people doing active frontbuffer rendering (X11, XR) and simultaneous streaming from that frontbuffer are ... like none?
<stepri01> I think it can be done on Android (cast while using the phone as a VR headset), but it was a while ago when I was involved in such discussions. Like you say it's pretty rare
<daniels> racing the encoder against scanout seems pretty brave, but what do I know :P
<stepri01> Yeah - I can't remember the details these days. The main thing is that you need to have a design that can at least support both independently. And it's much better if the GPU doesn't need to change too much based on exact use case
<macc24_> HdkR: how would you test if threaded context actually works?
stepri01 has quit [Quit: leaving]
Depau has quit [Quit: ZNC 1.8.2 - https://znc.in]
Depau has joined #panfrost
popolon has joined #panfrost
warpme_ has quit [Quit: Connection closed for inactivity]
kaspter has quit [Ping timeout: 276 seconds]
kaspter has joined #panfrost
warpme_ has joined #panfrost
robmur01 has quit [Quit: Leaving]
davidlt has quit [Ping timeout: 264 seconds]
<icecream95> macc24_: I rebased my threaded_context branch onto master and it still seems to mostly work: https://gitlab.freedesktop.org/icecream95/mesa/-/commits/tc-rebase
* macc24_ yoinks the code
guillaume_g has quit [Quit: Konversation terminated!]
<alyssa> "mostly"
<icecream95> Although I'm sure STK was broken with the old branch, I haven't yet found anything that doesn't work this time
<alyssa> If you aren't breaking STK are you really writing a driver?
<macc24_> alyssa: STK didn't break in last month
<macc24_> iirc
<icecream95> That reminds me to bisect the latest STK regression
<macc24_> oh nvm
<alyssa> lol
<alyssa> I really like Gallium
<icecream95> The regression doesn't reproduce on master, so looks like I did break STK after all
<icecream95> macc24_: Note that my branch disables AFBC and the minmax cache so is likely to just slow things down at the moment
<macc24_> icecream95: is there any gain from afbc on duet?
<icecream95> macc24: Test for yourself (PAN_MESA_DEBUG=noafbc to disable AFBC)
<macc24_> icecream95: fun fact: you can usually press tab to autocomplete nick on irc in most clients
<macc24_> anyway, i
<macc24_> i'll test noafbc on odroid go advance when i'm done with getting analog stick to work
<HdkR> macc24_: Check for a bunch of new threads from mesa :P
<macc24_> HdkR: for (int i = 0; i < 5; i++) fork();
<HdkR> hah
kaspter has quit [Ping timeout: 276 seconds]
kaspter has joined #panfrost