#panfrost on 2018-10-29 — irc logs at freenode.irclog.whitequark.org

2018-10-28 05:31 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - Discord Discard

00:25 <HdkR> Lyude: How much do you know about GPU hotplugging? :P

00:26 <alyssa> That's a thing? O_O

00:26 <HdkR> Of course. eGPU is a significant use case now

00:26 <urjaman> yes

00:26 <HdkR> bnieuwen1uizen: Speaking of which, how does mesa + radeon handle this? :P

00:26 <alyssa> that's a little unsettling

00:27 <HdkR> Fun fact, ARM devices could use Thunderbolt + eGPU to have the same issue

00:27 <alyssa> That's it done with graphics

00:27 <HdkR> Just need 4x PCIe lanes + buying a chip more expensive than the SoC it is attached to

00:28 <bnieuwen1uizen> HdkR: handle what?

00:28 <HdkR> bnieuwen1uizen: Surprise hotplug of GPUs

00:28 <HdkR> I have literally zero idea how X/Wayland reacts, but as long as the driver is sane then...eh?

00:28 <bnieuwen1uizen> well, the kernel driver exposes them, next time an app lists all devices it will show up

00:29 <urjaman> based on very little info (some comments on an lwn post i read a while ago) my guess is: not very well :P

00:29 <bnieuwen1uizen> as far as how it handles unplugs, no clue

00:29 <HdkR> https://twitter.com/whitequark/status/1056465535477710856 Someone linked me to this this morning which is why I'm curious about it

00:29 <bnieuwen1uizen> also no comment on whether the kernel driver is buggy for this case :P

00:29 <HdkR> :D

00:30 <bnieuwen1uizen> I think if you'll just fail any ioctl we eventually get the message and will return a DEVICE_LOST error*

00:30 <bnieuwen1uizen> *once the kernel driver is bug-free enough for us to get motivation to write the handling :P

00:31 <HdkR> Let's rig up a CI machine that just pulls out and plugs in thunderbolt cables for maximum silliness

00:31 <bnieuwen1uizen> which is actually around now, since gpu reset has just been declared stable enough to enable it for some

00:31 <HdkR> Neat

00:31 <bnieuwen1uizen> of course you'll still lose all VRAM so it is not quite transparent, but not unlike a hotplug in that regard?

00:32 <HdkR> Yea, fairly similar

00:32 <HdkR> Just that it can never recover

00:33 <bnieuwen1uizen> device lost says nothing about device being available again :P

00:33 <HdkR> aye

00:33 <bnieuwen1uizen> hmm, now that I think about it, is the app required to relist the physical devices?

00:33 * bnieuwen1uizen checks

00:33 <HdkR> Tell that to nvidia-uvm and... NVRM? I think is the one complaining in that twitter post

00:34 <bnieuwen1uizen> wow, "In some cases, the physical device may also be lost, and attempting to create a new logical device will fail, returning VK_ERROR_DEVICE_LOST."

00:34 <bnieuwen1uizen> How I love the vulkan spec thinking about corner cases

00:34 <HdkR> Nice

00:45 <alyssa> Vulkan is too pure for this world

00:45 <bnieuwen1uizen> wha why?

00:45 <alyssa> bnieuwen1uizen: How I love the vulkan spec thinking about corner cases

00:45 <alyssa> Humans aren't ready for this

00:46 <bnieuwen1uizen> hmm, I should have phrased that as how I love it when the vulkan spec thinks of a corner case.

00:46 <bnieuwen1uizen> There are lots where it does not :P

00:46 <alyssa> :P

00:46 <alyssa> Can I just

00:46 <alyssa> OpenGL's blending is _madness_

00:46 <alyssa> just like

00:47 <HdkR> AMD + Khronos made a good spec, then Nvidia comes along and sticks a bunch of GL garbage in it. Nobody wants subroutines :P

00:47 <bnieuwen1uizen> well, vulkan has no subroutines?

00:47 <alyssa> Isn't "opaque" and "transparent" enough for all sane applications and then just offer a programmable fallback or something? Meh

00:47 <HdkR> bnieuwen1uizen: NVX_raytracing adds them

00:47 <HdkR> :D

00:47 <bnieuwen1uizen> ... why?

00:47 <bnieuwen1uizen> also it is NV only and experimental only?

00:47 <HdkR> Good question

00:48 <bnieuwen1uizen> then again, Khronos does not want KHX anymore because it is confusing, time for NV to let go of NVX?

00:48 <HdkR> The NVX extension will quickly die and it'll convert to NV_

00:50 <alyssa> seriously uh

00:50 <HdkR> If it ever wants to be core Vulkan then something will need to change with that callable bit :)

00:50 <alyssa> what's the use case of literally anything but opaque and alpha blending?

00:50 <bnieuwen1uizen> alyssa: what other kinds of blending are you talking about?

00:52 <bnieuwen1uizen> also if you want lots of weirdness where you see no point of an app ever using it, try logical operations

00:52 <HdkR> Oh hey, Dolphin uses those

00:52 <HdkR> ;)

00:53 <HdkR> and games mix logic ops with blending, so wtf

00:53 <bnieuwen1uizen> HdkR: how did I know radv did not implement them? :P

00:53 <HdkR> :D

00:53 <alyssa> bnieuwen1uizen: Literally any other argument to glBlend*

00:53 <bnieuwen1uizen> but logical ops can do stuff like a andnot or a nand on the color output
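
The and-not and nand operations mentioned here act bitwise on the integer color written to the framebuffer. A minimal sketch of three of them on one 8-bit channel follows, assuming the usual glLogicOp convention where s is the incoming fragment value and d is the value already in the framebuffer. The enum and function names are made up for illustration and are not taken from any driver.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names only; they mirror GL_AND_REVERSE, GL_AND_INVERTED
 * and GL_NAND but are not the GL enums themselves. */
enum logic_op { OP_AND_REVERSE, OP_AND_INVERTED, OP_NAND };

/* s = incoming fragment bits, d = bits already in the framebuffer. */
static uint8_t
apply_logic_op(enum logic_op op, uint8_t s, uint8_t d)
{
    switch (op) {
    case OP_AND_REVERSE:  return (uint8_t)(s & ~d);  /* s and-not d */
    case OP_AND_INVERTED: return (uint8_t)(~s & d);  /* d and-not s */
    case OP_NAND:         return (uint8_t)~(s & d);
    }
    assert(0 && "unknown logic op");
    return d;
}
```

Because the result is computed on raw bits rather than on normalized floats, it does not combine with the arithmetic blend equation in any obvious way.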

00:53 <alyssa> CONSTANT_COLOR/ALPHA, weird tricks with e.g. source factor = destination color, etc

00:54 <HdkR> Like GL_ONE, GL_ONE, etc?

00:54 <HdkR> ..

00:54 <alyssa> mm

00:54 <HdkR> GL_ZERO, GL_ONE, CONSTANT, yea

00:54 <bnieuwen1uizen> probably for strange stuff like approximations of order independent blending?

00:54 <bnieuwen1uizen> subtract might be useful for fog
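
For reference, the factors being discussed all plug into the same additive blend equation, out = src * sfactor + dst * dfactor. The sketch below is a software model of that equation for a handful of factors, assuming float RGBA in [0, 1] and a clamped result as for a normalized render target. The enum and function names are hypothetical stand-ins for the GL ones (GL_ONE, GL_SRC_ALPHA, GL_CONSTANT_COLOR and so on), not real API.

```c
#include <assert.h>

/* Illustrative subset of blend factors; names are not the GL enums. */
enum factor {
    F_ZERO, F_ONE, F_SRC_ALPHA, F_ONE_MINUS_SRC_ALPHA,
    F_DST_COLOR, F_CONSTANT_COLOR
};

/* Weight a factor contributes to channel c (0..3 = RGBA). */
static float
factor_value(enum factor f, const float src[4], const float dst[4],
             const float constant[4], int c)
{
    switch (f) {
    case F_ZERO:                return 0.0f;
    case F_ONE:                 return 1.0f;
    case F_SRC_ALPHA:           return src[3];
    case F_ONE_MINUS_SRC_ALPHA: return 1.0f - src[3];
    case F_DST_COLOR:           return dst[c];
    case F_CONSTANT_COLOR:      return constant[c];
    }
    assert(0 && "unknown blend factor");
    return 0.0f;
}

/* out = src * sfactor + dst * dfactor, clamped to [0, 1]. */
static void
blend_add(enum factor sf, enum factor df, const float src[4],
          const float dst[4], const float constant[4], float out[4])
{
    for (int c = 0; c < 4; c++) {
        float v = src[c] * factor_value(sf, src, dst, constant, c)
                + dst[c] * factor_value(df, src, dst, constant, c);
        out[c] = v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
    }
}
```

With (F_SRC_ALPHA, F_ONE_MINUS_SRC_ALPHA) this is ordinary alpha blending, (F_ONE, F_ONE) is additive, and a source factor of F_DST_COLOR with a destination factor of F_ZERO multiplies the two colors, which is the "source factor = destination color" trick.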

00:55 <HdkR> You can also do wacky things like generating a mask

00:55 <HdkR> I've seen this happen for sprite generation with a mask

00:55 <HdkR> Did some magic with intersection testing with the mask they generated I think

00:55 <bnieuwen1uizen> funny thing: dota2 uses a stencil mask to only render parts that are not obscured by the UI

00:56 <bnieuwen1uizen> (not blending but still)

00:56 <HdkR> Neat

00:57 * bnieuwen1uizen is still annoyed that nobody uses VK_EXT_discard_rectangles for this (or the GL equivalent)

00:57 <HdkR> Got to get as much performance as possible so dota can run on the slowest of hardware

00:57 <bnieuwen1uizen> HdkR: or for VR

00:57 <bnieuwen1uizen> though getting 90 fps on a threadripper is pretty hard

00:57 <HdkR> oof

00:58 <bnieuwen1uizen> seriously, be prepared to get like -20% perf due to too many threads

00:59 <HdkR> anything that communicates cross-CCX is going to hurt

00:59 <alyssa> Wee

00:59 <alyssa> https://en.wikipedia.org/wiki/Blend_modes is enlightening

00:59 <alyssa> Never understood overlay till now so that's cool

01:00 <HdkR> Don't you love having photoshop blend modes implemented in hardware? :D

01:00 <alyssa> I mean

01:00 <alyssa> I'd rather just have blend shaders but

01:00 <HdkR> I think it is funnier to have them implemented in hardware

01:00 <alyssa> Then again Midgard does just use blend shaders

01:00 <alyssa> but pretends it's hardware

01:01 <HdkR> When you don't have any form of programmable blending :)

01:01 <alyssa> and then when your performance nosedives we just shrug

01:01 <HdkR> hehe

01:01 <bnieuwen1uizen> alyssa: your blend shader time will come probably, with GL_KHR_blend_equation_advanced

01:01 <alyssa> Seriously would it be so terrible to at least document which ones are accelerated and which ones are sw? :P

01:02 <alyssa> bnieuwen1uizen: Yeah that's all shaders

01:02 <alyssa> but also like

01:02 <bnieuwen1uizen> (needed for the GLES AEP)

01:02 <alyssa> some pure ES 2.0 blend modes are shader

01:03 <alyssa> ....ES 3.2 spec is 600 pages

01:03 <alyssa> Ugh

01:03 <HdkR> It's big

01:04 <bnieuwen1uizen> well, unless you are ROP bound, I'd expect a blend shader to be not too terrible?

01:04 <HdkR> It's like GL 4.x with a bunch of dumb removed

01:04 <bnieuwen1uizen> (for comparison, vulkan is like 1767 already)

01:05 <bnieuwen1uizen> you can ignore all the extensions though

01:05 <alyssa> Oh dear

01:06 <alyssa> 3.0 is only 350 pages. That seems a lot more manageable

01:06 <alyssa> (2.0 is 200 pages and we essentially have the big stuff there down)

01:07 <bnieuwen1uizen> the real question is why care about the GL spec if you're doing gallium ;)

01:07 <alyssa> bnieuwen1uizen: I mean

01:07 <alyssa> all of the tests I'm using are GL

01:07 <alyssa> all the apps I care about are GL

01:07 <alyssa> and the hardware is rated for a GL version level, not a Gallium one

01:12 <HdkR> Guess there is something to be said for knowing the enemy from the spec to know why you're implementing something in gallium

01:12 <HdkR> Hard to know what an SSBO is if you've never once read the GL spec :)

01:13 <HdkR> Or done nothing with GL

01:14 <bnieuwen1uizen> right, but a quick introduction is very different from parsing standardese

01:17 <alyssa> bnieuwen1uizen: also, it's invaluable for RE

01:17 <alyssa> at least for my workflow

01:18 <alyssa> (Something I imagine you don't have to deal with for your hw?)

01:20 <bnieuwen1uizen> less, but you'd be surprised

01:20 <alyssa> rip

01:21 <bnieuwen1uizen> sometimes you have interesting register fields with very cryptic names: CB_HW_CONTROL_3__DISABLE_ROP3_FIXES_OF_BUG_511967

01:21 <alyssa> ah

01:21 <bnieuwen1uizen> and you have large swathes of registers/commands for which the existence is not documented, and for most we only know the name but not what they do

01:22 <bnieuwen1uizen> most that we don't already use*

01:22 <alyssa> Nice.

01:22 <alyssa> surely somebody has the Verilog (or whatever)? :P

01:22 <HdkR> That's when you have to shoot off an email and hope someone answers your question about the register :D

01:23 <bnieuwen1uizen> well, we have a partial leaked register documentation for an older generation chip that the linux on PS4 developers found

01:23 <HdkR> Which is funny

01:23 <bnieuwen1uizen> and the rest is just RE, get a hint from the name and see how it works

01:24 <bnieuwen1uizen> the funniest changes are IMO when Mareko comes with some magic number changes for some situation, claiming it is faster, and routinely I can't find/create a bench that shows it is faster

01:24 <alyssa> bnieuwen1uizen: (don't you work for the same company that makes the chip?)

01:24 <bnieuwen1uizen> alyssa: what gave you that impression?

01:25 <alyssa> Not sure. Too many people to keep track of

01:25 <alyssa> er wait

01:25 * bnieuwen1uizen sometimes doubts AMD has competent documentation themselves

01:25 <alyssa> You're Valve working on AMD?

01:25 <bnieuwen1uizen> Half hobby, half 20% project at Google

01:25 <alyssa> ...huh, ok

01:25 <alyssa> I can barely keep track of the ARM GPU space as it is

01:25 <bnieuwen1uizen> and yes, AMD HW

01:26 <HdkR> hehe

01:26 <bnieuwen1uizen> so, what if somebody runs AMD GPUs on an ARM host? Isn't it included in your GPU space already? ;)

01:27 <alyssa> By ARM GPUs, I meant GPUs produced by ARM ;)

01:27 <bnieuwen1uizen> ah

01:27 <alyssa> (Actually I included Adreno, VideoCore, and Vivante but yeah)

01:27 <bnieuwen1uizen> well, that is a pretty wide group :P

01:28 <alyssa> ...and I can barely keep track :)

01:28 <bnieuwen1uizen> then again, even most people who do keep track tend to have the wrong impression about my affiliation :P

01:29 <bnieuwen1uizen> always interesting at XDC telling half the people you don't work for AMD

01:29 <HdkR> haha

01:29 <HdkR> I just like poking you about random AMD hardware quirks :)

01:30 <bnieuwen1uizen> hey I do work on AMD HW, that is totally valid B)

01:30 <alyssa> bnieuwen1uizen: I mean, it's confusing since AMD does support foss drivers

01:30 <alyssa> With me there's no ambiguity since nobody is funding free Mali :p

01:31 <bnieuwen1uizen> alyssa: have you heard about this situation with two AMD open-source Vulkan drivers?

01:31 * bnieuwen1uizen is working on the one that is not supported by AMD

01:31 <HdkR> <3 that situation

01:32 <bnieuwen1uizen> (which shares the most code with the GL driver, that is supported by AMD, to make things more confusing)

01:33 <bnieuwen1uizen> HdkR: the big question is: how are we ever to get out of this situation in a reasonable way? :P

01:33 <alyssa> bnieuwen1uizen: I have and don't understand it

01:34 <HdkR> I think the reasonable way is that the other one dies

01:34 <alyssa> how can there be competition if they're uh

01:34 <alyssa> both foss

01:34 <bnieuwen1uizen> alyssa: competition between which development team gets funding?

01:35 <bnieuwen1uizen> I mean at the end of the day it is a question of whether we can do two drivers mediocre in their own way or one great driver

01:35 <alyssa> Code sharing tho?

01:35 <alyssa> or is it too different

01:35 <bnieuwen1uizen> with the same developer resources

01:35 <bnieuwen1uizen> some of it, but not all of it right now. It is complicated

01:36 <bnieuwen1uizen> HdkR: the AMD driver dying is a long way off, like I'd expect this to sudder for 5 years or so unless the radv side gives up ...

01:37 * bnieuwen1uizen gets the feeling he is ranting too much about AMD in a Mali channel

01:37 <HdkR> haha

01:38 <alyssa> levenstein distance of 2 is ARM, so

01:38 <alyssa> *levenshtein

01:57 <HdkR> `class SleepyLatentWorker(SleepyBaseWorker):`

01:57 <HdkR> Oh such a sleepy baby

02:00 <alyssa> Nini

02:09 <alyssa> The good news is that the cmdstream side of blend shaders is reasonable

02:09 <alyssa> And there's not a lot of ABI stuff to worry about, I don't think

02:10 <alyssa> i.e. I can compile a blend shader with the blob and use that for starting out, independent of being able to generate them from the compiler

02:10 <alyssa> ("Alyssa, have you lost your mind?" "I think I left it in Galicia")

02:17 <anarsoul> alyssa: btw, I just tested latest lima and weston works here :P

02:17 <alyssa> anarsoul: hooray! :D

02:17 <alyssa> (what's the :P for?)

02:18 <anarsoul> for nothing

02:18 <anarsoul> ignore it

02:18 <anarsoul> :)

02:19 <urjaman> i've noticed i have a similar issue lol

02:19 <alyssa> :P

02:20 <alyssa> anarsoul: for my vain interest

02:20 <alyssa> could you test which scenes in glmark do/don't work?

02:20 <urjaman> it was extremely hard not to end that message with ":P"

02:20 <alyssa> (https://rosenzweig.io/glmark.txt for reference on Panfrost progress)

02:24 bnieuwenhuizen has joined #panfrost

02:27 * alyssa dumps a blend shader

02:28 <urjaman> ...

02:29 <alyssa> urjaman: what

02:29 <alyssa> I'll add 6 to whatever the number is to compensate for the kernelspace :p

02:31 <urjaman> nvm

02:31 * alyssa is confused

02:32 * urjaman didnt parse "dump" correctly

02:33 <alyssa> geez urja, I'm not _dating_ the shader!

02:33 * alyssa is exclusive with Kevin

02:36 <alyssa> Hm, so injecting a blend shader I'm getting an OPER_FAULT

02:36 <alyssa> There's probably a work_register_count field somewhere here

02:42 <alyssa> (The good news is that it's definitely executing the shader, or at least trying)

02:48 <alyssa> Who needs fragment shaders when you have blend shaders?!

02:48 <alyssa> :P

02:53 <urjaman> i dont know much but i guess both would be optimal :P

03:02 <alyssa> okay what

03:03 <alyssa> AH!

03:03 <alyssa> It reuses work_count what lol

03:03 <alyssa> so er

03:03 <alyssa> uh-huh

03:04 <alyssa> Well then

03:04 <alyssa> First blend shader injected successfully!

03:05 <alyssa> (Admittedly it's an uphill battle since This Was The Easy Part..)

03:10 <alyssa> Let's clean up that code and write some docs

03:15 paulk-leonov has quit [Ping timeout: 272 seconds]

03:16 paulk-leonov has joined #panfrost

04:41 embed-3d has joined #panfrost

05:17 jernej has joined #panfrost

05:18 <alyssa> You know, let's demoify this

05:19 <alyssa> Context: with some hacks added to the assembler for uint8/fp16/etc stuff, I can now write a blend shader

05:19 <alyssa> (manually)

05:20 <alyssa> I'm not ready for the compiler to start outputting this stuff, but I could bundle just this one shader and hot patch into the constants, to finish up the demo I was trying to do :P

05:24 <alyssa> Cool!

05:25 <alyssa> Just pushed a set of changes to be able to assemble the shader

05:25 <alyssa> (It's simpler than the shader the blob emits, unclear if I'm missing functionality or what. Shrug)

05:26 <alyssa> 12 files changed, 519 insertions(+), 54 deletions(-)

05:26 <alyssa> been busy today

05:41 chewitt has quit [Quit: Zzz..]

05:45 afaerber has quit [Quit: Leaving]

05:47 jernej has quit [Ping timeout: 240 seconds]

05:53 _whitelogger has joined #panfrost

06:06 jernej has quit [Ping timeout: 244 seconds]

06:17 _whitelogger has joined #panfrost

07:17 indy has quit [Read error: Connection reset by peer]

07:21 indy has joined #panfrost

07:36 jernej has joined #panfrost

08:25 paulk-leonov has quit [Ping timeout: 246 seconds]

08:26 paulk-leonov has joined #panfrost

09:18 pH5 has joined #panfrost

11:09 afaerber has joined #panfrost

11:13 TheCycoONE has quit [Read error: Connection reset by peer]

11:13 TheCycoONE has joined #panfrost

15:41 afaerber has quit [Quit: Leaving]

16:07 TheCycoONE has quit [Quit: ZNC 1.7.1 - https://znc.in]

16:10 TheCycoONE has joined #panfrost

16:54 cwabbott_ has joined #panfrost

16:57 cwabbott has quit [Ping timeout: 250 seconds]

16:57 cwabbott_ is now known as cwabbott

17:08 pH5 has quit [Quit: bye]

17:45 anarsoul|2 has joined #panfrost

17:51 pH5 has joined #panfrost

19:03 jernej has left #panfrost ["Konversation terminated!"]

19:04 jernej has joined #panfrost

19:07 jernej has quit [Quit: ZNC 1.6.5-elitebnc:7 - http://elitebnc.org]

19:18 jernej has joined #panfrost

20:11 tlwoerner has joined #panfrost

22:11 mearon has joined #panfrost

22:13 <mearon> Hey all. I own a Chromebook Plus. And today I discovered panfrost (I honestly thought this was never going to happen). This really made my day!

22:13 * mearon will have sweet dreams tonight

22:14 <mearon> A big Thank You to all the devs :3

22:37 <alyssa> mearon: <2

22:37 <alyssa> erm

22:37 <alyssa> <3

22:38 <HdkR> <4

22:38 <HdkR> 4>?

22:39 <HdkR> 4:) The 4 looks like a little hat =o

22:39 <alyssa> HdkR: Hey question

22:39 <alyssa> What are your thoughts about patching compiled binaries in real-time?

22:40 <anarsoul|2> alyssa: don't do it

22:40 <alyssa> anarsoul|2: why not tho

22:40 <HdkR> It's a common practice though

22:40 <HdkR> Either that or maintaining a set of known good blobs with patch points that you append to the end

22:41 <alyssa> HdkR: No, not blobs

22:41 <alyssa> Like, compile the shader once and save the patch point, and then patch it for later runs

22:42 <alyssa> (The glBlendColor is an immediate hardcoded into the blend shader. I don't want to reinvoke the compiler just because you changed the color. That's dumb :P)

22:43 <alyssa> Just writing to a standard vec4 in the binary, alignment is sane, etc

22:43 <alyssa> So it's as easy as

22:43 <HdkR> When you're patching are you thinking of saving a copy of the shader to not have to repatch when it goes back to the previous blend mode?

22:43 <alyssa> blend color doesn't work like that :p

22:43 <alyssa> If the _mode_ changes, we're forced to recompile of course

22:44 <alyssa> memcpy((uint8_t *) shader_binary + color_offset, color, sizeof(float) * 4);

22:44 <HdkR> I mean when the application changes the blend with another API call, another draw call, but uses the same program

22:44 <alyssa> I'm confused what the question is.

22:44 <anarsoul|2> alyssa: ah, so you're talking about shader binaries...

22:45 <HdkR> I'm confused what the use case is

22:45 <alyssa> Impedance mismatch between Gallium and Mali, I guess

22:46 <alyssa> In Gallium, the blend mode is part of a constant-state object (which is cached nicely and whatever), so I can elegantly express "compile on CSO create, attachment is free, so we compile only once and then the application can flip-flop how much it wants"

22:46 <alyssa> The wacky exception is the blend color, which is _not_ part of the CSO since it's too variable I guess

22:47 <alyssa> So I shouldn't try to back it by CSO either -- I should just make updating the blend color fast

22:47 <alyssa> And patching the shader binary directly seems like a really good solution technically, even if it sounds ugly

22:47 <HdkR> Just make blend color be a uniform?

22:48 <alyssa> Blend shaders don't have uniforms :p

22:48 <alyssa> No choice but to hardcode it

22:48 <HdkR> Does it have multiple input values that can be passed from the fragment side?

22:48 <HdkR> Data other than colour

22:48 <HdkR> er, output colour*

22:49 <alyssa> Don't think so

22:50 <HdkR> So number of input colours has to match number of blending output colours in the blend stage?

22:50 <alyssa> Erm

22:50 <HdkR> (Also Constant colour isn't used extensively so it may be worth eating the recompile and deal with the issue later)

22:51 <alyssa> Look uh

22:51 <alyssa> What is your objection to it? I thought it was a really cute solution why are we arguing :P

22:52 <HdkR> I'm attempting to explore all choices before suggesting patch points because yes they area usually pretty ugly

22:52 <alyssa> Hmph

22:52 <HdkR> s/area/are

22:52 <alyssa> If there's ever a clean patch point thing, it's this

22:52 <alyssa> Since:

22:52 <alyssa> - We're patching data, not code

22:52 <alyssa> - The patch points are computed dynamically as a compiler output, not hardcoded offsets into blobs

22:53 <alyssa> - It avoids passing the constant into the compiler context directly, which goes against any reasonable model :p

22:53 <alyssa> - From the compiler side, it's really easy to implement

22:53 <alyssa> - From the cmdstream side, it's even easier to implement

22:54 <alyssa> - Avoids ridiculous CPU overhead
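
The patch-point scheme in the list above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the struct, its field names and patch_blend_color are hypothetical and not Panfrost code, the compiler is assumed to report a byte offset for the embedded constant, and the constant is assumed to be a vec4 of 32-bit floats as the one-line memcpy suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record a compiler could hand back with a blend shader.
 * None of these names come from Panfrost. */
struct blend_shader {
    uint8_t *binary;      /* compiled shader bytes */
    size_t size;          /* length of binary in bytes */
    size_t color_offset;  /* byte offset of the embedded vec4 constant,
                           * computed by the compiler, not hardcoded */
};

/* Overwrite the embedded blend color in place. Only data is touched,
 * so the instruction stream is unchanged and nothing is recompiled. */
static void
patch_blend_color(struct blend_shader *shader, const float color[4])
{
    assert(shader->color_offset + 4 * sizeof(float) <= shader->size);
    memcpy(shader->binary + shader->color_offset, color, 4 * sizeof(float));
}
```

A blend color update would then call patch_blend_color on the already compiled shader, while a change of blend mode would still go through the compiler and produce a new binary with a new offset.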

22:54 <HdkR> Yea, there are lots of upsides to it

22:54 <alyssa> Look I just think blend shaders are adorable and I've always wanted to say "Oh, yeah, I patch compiled binaries in real-time" *waggles eyebrows*

22:55 <HdkR> So does the PS3 though, it isn't completely archaic :P

22:55 <HdkR> Although that has to patch shader programs to emulate uniforms

22:55 * HdkR shivers

22:57 <urjaman> lgtm :P (actually, to me a memcpy like that would seem a bit over the top, but the compiler just saving a pointer to a struct with the appropriate color data (or data to make said ptr) and later using it, fine...)

22:57 <alyssa> urjaman: I wouldn't actually do a memcpy; that was just to fit it on one-line for irc

22:58 <HdkR> alyssa: So another question. What if you don't do a full recompile in case blending mode changes and instead just generate blend shaders that are reused to match a program's blending. Then your constant one you just patch + give to the duplicated program for that state?

22:59 <alyssa> I'm confused

22:59 <urjaman> yeah me too

22:59 <alyssa> Are you suggesting ubershaders

22:59 <alyssa> Because if so, go away Dolphin shill :v

22:59 <alyssa> :p

23:00 <HdkR> Think you have three draw calls with different blend state. One with "regular" blend modes, one with GL_CONSTANT with constant A, then GL_CONSTANT with constant B. The application spams between the three but they all use the same original shaders

23:01 <HdkR> Do you patch the constant versions of the programs between every draw call or retain a unique copy for each?

23:02 <alyssa> Patch between them but again, patching is free

23:02 <alyssa> (Well, not free, but.... let's say it costs 10 cycles per patch :P)

23:04 <HdkR> Does this mean you have to flush the previous draw call before modifying the blend stage to make sure you don't get corrupted output for modifying the constant colour live while rasterization is still happening?

23:06 <alyssa> I mean you can duplicate the shader in memory

23:06 <HdkR> That's what I'm getting to

23:06 <alyssa> this is bikeshedding

23:06 <alyssa> :P

23:06 <HdkR> Or trying to and constantly failing by sticking foot in mouth

23:07 <HdkR> That's the main thing I was wondering about though D:

23:07 <HdkR> Duplicate program then modify constant color, or modify program and not duplicate

23:07 <alyssa> idk that's a minor detail rn :p

23:07 <HdkR> Because if you're duplicating then I don't see an issue with it

23:08 <HdkR> Eats some additional memory but programs are typically small
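
The duplicate-then-modify option can be sketched the same way. The helper below is hypothetical and uses plain malloc for brevity; a real driver would presumably allocate GPU-visible memory instead, and the vec4-of-floats layout is again an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of a shader binary with the vec4 at
 * color_offset replaced, or NULL if allocation fails. The original is
 * left untouched, so a draw still in flight keeps its old constant. */
static uint8_t *
clone_with_blend_color(const uint8_t *binary, size_t size,
                       size_t color_offset, const float color[4])
{
    assert(color_offset + 4 * sizeof(float) <= size);

    uint8_t *copy = malloc(size);
    if (copy == NULL)
        return NULL;

    memcpy(copy, binary, size);
    memcpy(copy + color_offset, color, 4 * sizeof(float));
    return copy;
}
```

The cost is one extra copy of the binary per distinct color in flight, in exchange for not having to flush earlier draws before patching.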

23:09 <HdkR> The workaround for that being if you can pass a uniform in to the fragment stage in the case of constant color and passing that uniform to the blend stage then the overhead is just filling a uniform

23:10 <alyssa> Blah

23:10 <HdkR> :)

23:11 <HdkR> Just wait until you have a shader that live modifies itself

23:12 <alyssa> that's not legal on our hardware :p

23:13 <HdkR> hehe

23:13 <HdkR> Determine how big the icache is, how much it can prefetch. Modify the blend shader patchpoint itself

23:14 HdkR was kicked from #panfrost by alyssa [Heresy]

23:14 <alyssa> :P

23:14 HdkR has joined #panfrost

23:14 <alyssa> <3

23:14 <HdkR> lol

23:14 tgall_foo has quit [Read error: Connection reset by peer]

23:15 <HdkR> My cute idea is only viable if you 1) Can read/write the blend shader memory space, 2) Can't pass things that aren't output colours to it

23:17 <HdkR> Something that would clear up a lot of this for me. How does the fragment stage and blend stage pass data between them?

23:17 <HdkR> fixed size sram or something?

23:20 <alyssa> I'm not sure yet

23:20 <HdkR> Does it appear as a specialized write in the fragment shader?

23:20 <alyssa> It's the same write used for the fixed function blending pass

23:21 <HdkR> Some sort of indexed store?

23:21 <alyssa> Not indexed, just a store

23:21 <alyssa> well

23:21 <alyssa> a branch but I digress

23:21 <HdkR> some sort of auto post-decrement for choosing which channel it ends up in?

23:22 <alyssa> ryan i have no idea how they wired it up, I'm not psychic :p

23:22 <HdkR> Effectively disallowing writing results out of order?

23:23 <HdkR> So if you only write alpha it still has to first pass in rgb? :P

23:26 <HdkR> (Sometimes I'm exhausting)

23:41 <alyssa> ...hm