#panfrost on 2019-02-04 — irc logs at freenode.irclog.whitequark.org

2018-12-27 00:26 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - Transientification is terminating. Memory reductions in progress.

00:40 <alyssa> Wonder if I should push my monochromatic thing to planet.fd.o

00:40 <alyssa> It's not Panfrost but it is graphics so

01:40 stikonas has quit [Remote host closed the connection]

01:47 * alyssa looks into sign()

01:53 <alyssa> Some wackiness

01:53 <HdkR> Nice function for getting a multiplier depending on sign-ness

01:54 <alyssa> HdkR: I meant wackiness in midgard :p

01:54 <HdkR> ah

01:55 <HdkR> a couple conditional selections I presume?

01:55 <alyssa> Not really

01:55 <HdkR> minmax?

01:55 <alyssa> A pair of unknown ops

01:55 <alyssa> sign(x) = -alu_op_2E(-alu_op_18(x, inf), -0.0f)

01:55 <alyssa> If you care to explain how a negative zero snuck in there, I'm listening ;P

01:56 <HdkR> Could use the sign of the constant to choose which direction it wants to go

01:58 <HdkR> two instructions to emulate it isn't bad though

01:58 <HdkR> Nvidia does 3 or 10 depending :P

01:59 <alyssa> I mean

01:59 <alyssa> The obvious explanation for - zero is that we're indexing the sign bit

01:59 <HdkR> yep

01:59 <alyssa> But I don't know that I buy it, since this works on fp32s and the constants are encoded as fp16

01:59 <HdkR> Could just be a derp

01:59 <alyssa> And these are floating point ops, not integer ops

02:00 <HdkR> Same ops if it is a float sign versus integer sign?

02:00 * alyssa checks

02:02 <alyssa> No matching overload for function 'sign'

02:02 <HdkR> #version 300 es?

02:02 <alyssa> Yupyup

02:03 <HdkR> ...That's a bug in their shader compiler then

02:03 <HdkR> Just checked spec and it exists

02:03 <alyssa> HdkR: https://www.khronos.org/registry/OpenGL-Refpages/es3.0/html/sign.xhtml

02:03 <HdkR> `genIType sign (genIType x)`

02:04 <alyssa> Second form is only there for 300

02:04 <HdkR> Yea, that's why I asked for `#version 300 es` :P

02:05 <HdkR> were you assigning to a float still? Since it returns integer now

02:05 <alyssa> For the integer sign function, it's simple min/max

02:06 <alyssa> Note: we know how fmin/fmax are, and it's not that

02:06 <HdkR> Curious

02:07 <alyssa> Though the ops do look superficially similar

02:07 <alyssa> fmin = 0x28

02:07 <alyssa> fmax = 0x2C

02:07 <alyssa> unknown 1 = 0x18

02:07 <alyssa> unknown2 = 0x2E

02:07 <alyssa> The "8"s are probably just coincidence, the latter pair could be saying something tho

02:07 <HdkR> Maybe an op that explicitly ignores NaNs?

02:07 <HdkR> :shruggie:

02:08 <alyssa> hm

02:08 <alyssa> Kind of wishing I had compute shaders ;P

02:08 <HdkR> Output them locally and see what they return? :D

02:08 <alyssa> Kind of a complex process but aight, here goes

02:09 <HdkR> Woo RE

02:09 <alyssa> No I just

02:09 <alyssa> Have a convoluted dev environment

02:10 <alyssa> Rebuilding mesa intensifies

02:11 marcodiego has quit [Ping timeout: 250 seconds]

02:11 <HdkR> Intense!

02:21 <alyssa> Well, this is... something

02:21 <alyssa> Our two unknown ops work exactly like fmin/fmax

02:21 <alyssa> So we have

02:21 <alyssa> sign(x) = -max(-min(x, inf), -0.0f)

02:22 <alyssa> But why on earth would you do min(x, inf)

02:22 <alyssa> But hypothetically, uh

02:22 <alyssa> sign(1) = -max(-min(1, inf), -0.0f) = -max(-1, -0) = -(-0) = 0

02:23 <alyssa> sign(0) = -max(-min(0, inf), -0.0) = -max(-0, -0) = -(-0) = 0

02:24 <alyssa> sign(-1) = -max(-min(-1, inf), -0.0) = -max(-(-1), -0.0) = -max(1, -0.0) = -1

02:24 <alyssa> :blink:

02:24 <HdkR> weird, sign(1) should return 1
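
The hand evaluation above can be replayed mechanically. This is a minimal sketch that assumes the two unknown ops are ordinary min/max, as first measured; `sign_plain` is a name invented here for illustration, and Python floats stand in for the hardware's fp32 values.

```python
import math

def sign_plain(x: float) -> float:
    """Evaluate -max(-min(x, inf), -0.0) with ordinary min/max semantics."""
    inner = min(x, math.inf)   # for any finite x this is just x
    return -max(-inner, -0.0)

# Replay the three inputs worked through in the log.
for x in (1.0, 0.0, -1.0):
    print(x, "->", sign_plain(x))
```

Under this reading a positive input collapses to zero, which is the mismatch HdkR points out: a real sign() has to return 1 there.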

02:25 <alyssa> Maybe it has to do with handling of negatives

02:25 <alyssa> Like, our new min chooses based on smaller absolute value, rather than smaller value

02:25 <alyssa> I.e.

02:25 <HdkR> hm, min/max ignoring sign

02:26 <alyssa> sign(1) = -max(-min(1, inf), -0.0) = -max(-1, -0.0) = -(-1) = 1

02:26 <alyssa> sign(0) = -max(-min(0, inf), -0.0) = -max(-0, -0) = -(-0) = 0

02:26 <alyssa> sign(-1) = -max(-min(-1, inf), -0.0) = -max(-(-1), -0.0) = -max(1, -0.0) = -1

02:26 <HdkR> Would make sense

02:27 <alyssa> It's self-consistent, at least

02:27 <alyssa> And explains why there are multiple ops

02:27 <alyssa> What I don't quite get is what the min(x, inf) is for

02:28 <alyssa> Is it enough to do, uh

02:28 <alyssa> -max(-x, -0)

02:28 <alyssa> sign(1) = -max(-1, -0) = -(-1) = 1

02:28 <alyssa> sign(0) = -max(-0, -0) = -(-0) = 0

02:28 <alyssa> sign(-1) = -max(-(-1), -0) = -max(1, -0) = -1

02:29 <alyssa> What are they possibly trying to do with min(x, inf)

02:29 <alyssa> You can't have greater than infinity :P

02:32 <alyssa> Unless it's like

02:33 <alyssa> Erm wait

02:33 <alyssa> sign(2) = -max(-min(2, inf), -0.0) = -max(-2, -0) = -(-2) = 2

02:33 <alyssa> Not helpful :p

02:34 <alyssa> In the above formulation of min/max, sign(x) is just the identity function, so something more is at play
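
A quick way to check the identity observation is to model the hypothesis directly. The sketch below assumes both ops compare by absolute value; `min_abs`, `max_abs` and `sign_abs` are hypothetical helper names, not anything taken from the hardware or from Mesa.

```python
import math

def min_abs(a: float, b: float) -> float:
    # Hypothesis: pick the operand with the smaller absolute value.
    return a if abs(a) < abs(b) else b

def max_abs(a: float, b: float) -> float:
    # Hypothesis: pick the operand with the larger absolute value.
    return a if abs(a) > abs(b) else b

def sign_abs(x: float) -> float:
    return -max_abs(-min_abs(x, math.inf), -0.0)

# Every finite input comes back unchanged, so this cannot be sign().
for x in (2.0, 1.0, 0.5, 0.0, -1.0, -3.0):
    print(x, "->", sign_abs(x))
```

With these definitions the whole expression reduces to x for finite inputs, which is why the abs-based reading alone cannot explain the instruction pair.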

02:34 * alyssa laments lack of compute shaders, grah

02:35 <HdkR> SSBOs would work as well

02:36 <alyssa> Bah

02:37 <alyssa> But yeah, with fancy max,

02:37 <alyssa> max(in, 0.2)

02:38 <alyssa> Er

02:38 <alyssa> max(in, -0.2)

02:38 <alyssa> = in when abs(in) > 0.2, -0.2 otherwise

02:39 <alyssa> Similarly

02:39 <alyssa> max(-in, 0.5)

02:39 <alyssa> = -in when abs(in)> 0.5, 0.5 otherwise

02:50 <alyssa> Of course, recreating the pair of magic instructions in GLSL does do a fsign, but..

02:50 <alyssa> But why? :P

02:50 <HdkR> :D

02:51 <alyssa> Alright, with 0 < in < 1

02:51 <alyssa> min(in, 1.0/0.0) is infinite

02:52 <alyssa> Wat

02:53 <alyssa> Whaa?

02:53 <alyssa> Is min doing a... multiply?

02:53 <alyssa> It is definitely doing a multiply

02:54 <alyssa> --------And suddenly the mysteries start making sense

02:54 <alyssa> Okay, so,

02:54 <alyssa> min(a, b)

02:54 <alyssa> where both a,b positive = a*b

02:55 <alyssa> min(-a, b) = -min(a, b) = -a*b

02:55 <alyssa> Okay what

02:55 <alyssa> how is this not just a multiply now

02:57 <alyssa> K, let's suppose for a sec this is just a multiply

02:58 <alyssa> max appears to at least be an actual max

02:58 <alyssa> Yeah, max appears to be the max_abs thing

02:59 <alyssa> i.e.: max_abs(x, y) = x if abs(x)>abs(y), y otherwise

03:00 <alyssa> So, wait,

03:00 <alyssa> sign(x) = -max(-mul(x, inf), -0.0f) ??

03:01 <alyssa> sign(1) = -max(-mul(1, inf), -0.0) = -max(-inf, -0.0) = inf

03:01 <alyssa> sign(0) = -max(-mul(0, inf), -0.0) = -max(-0, -0) = 0

03:01 <alyssa> sign(-1) = -max(-mul(-1, inf), -0.0) = -max(inf, -0.0) = -inf

03:01 <alyssa> Which is fine, save for that, you know, factor of infinity :V

03:01 <alyssa> But it does mean, uh

03:02 <alyssa> sign(2) = -max(-mul(2, inf), -0.0) = -max(-inf, -0.0) = +inf

03:02 <alyssa> So one of the problems is resolved

03:03 <alyssa> I rather hypothesise that max starts also acting like a multiply in some way but not sure how

03:08 <alyssa> SDfsdlkjfds

03:20 <alyssa> This isn't even self-consistent

03:26 <alyssa> I'm emitting the same set of opcodes

03:26 <alyssa> Why is it behaving different

03:29 <alyssa> HdkR: Any brilliant ideas? :P

03:29 <HdkR> hm?

03:30 <alyssa> HdkR: When I do the above ops on here (composed the way they have it -- verified via disasm), it ends up emitting infinity, not 1

03:30 <alyssa> Wonder if there's a disasm bug

03:32 <HdkR> Could be. If it is the exact same instructions then why would it be different? :P

03:50 _whitelogger has joined #panfrost

04:09 <alyssa> HdkR: So I feel pretty good that the inner op could be "multiply, with 0*inf = 0"

04:10 <alyssa> (Whereas the main multiply op would NaN, I think)

04:11 <HdkR> I could see that

04:11 <alyssa> HdkR: What's sign(-0) anyway

04:11 <alyssa> 0 or -1?

04:12 <HdkR> 0 I think

04:12 <alyssa> K
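
The working hypothesis at this point can be written down as a small model: the inner op is a multiply where zero times infinity gives zero, and the outer op is the abs-based max. This is only a sketch of that guess; `mul_zero_wins`, `max_abs` and `sign_model` are names made up here, and how the real op signs its zero result is an assumption.

```python
import math

def mul_zero_wins(a: float, b: float) -> float:
    # Hypothesised inner op: a multiply where 0 * inf yields 0 instead of NaN.
    if a == 0.0 or b == 0.0:
        sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return math.copysign(0.0, sign)   # assumption: zero keeps the product's sign
    return a * b

def max_abs(x: float, y: float) -> float:
    # Hypothesised outer op: x if abs(x) > abs(y), y otherwise.
    return x if abs(x) > abs(y) else y

def sign_model(x: float) -> float:
    return -max_abs(-mul_zero_wins(x, math.inf), -0.0)

for x in (2.0, 1.0, 0.0, -0.0, -1.0):
    print(x, "->", sign_model(x))

# An ordinary IEEE multiply would produce NaN for the zero case instead.
print(math.isnan(0.0 * math.inf))
```

The model maps both zeros to 0 and gets the direction right for everything else, but it yields plus or minus infinity rather than plus or minus one. That matches the infinity observed when re-emitting the ops by hand, so something in the real sequence is still unaccounted for.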

04:16 <alyssa> HdkR: ...Of course, these are not things I can test without COMPUTE SHADERS

04:16 <alyssa> :V

04:17 <alyssa> I mean or floating-point render targets but

04:19 <HdkR> Yea, once you get compute then testing of results is a lot easier since you can stuff everything in to SSBOs

05:51 hanetzer has quit [Ping timeout: 244 seconds]

07:31 yann has quit [Ping timeout: 245 seconds]

07:47 <davidlt> I got more pictures of Pinebook Pro, incl. close-ups of PCB with WiFi chip

07:48 <davidlt> alyssa, and others, Pine guys said that they could give you Pinebook Pros if you want it for development. You just need to write them.

07:48 <davidlt> I think, I have the contacts somewhere if you want.

07:56 ph5 has quit [Quit: bye]

08:50 indy has quit [Quit: ZNC - http://znc.sourceforge.net]

08:53 indy has joined #panfrost

09:11 ph5 has joined #panfrost

09:54 BenG83 has joined #panfrost

10:18 raster has joined #panfrost

10:25 <raster> nyan

10:35 raster has quit [Remote host closed the connection]

11:11 Elpaulo has quit [Quit: Elpaulo]

11:12 Elpaulo has joined #panfrost

13:03 <HdkR> https://paste.fedoraproject.org/paste/v8iC05wGOvdF8aS9yug I might be having fun

13:07 afaerber has joined #panfrost

13:26 raster has joined #panfrost

13:59 <mmind00> HdkR: but it looks like you don't want to share your fun: "Paste not found" :-P

14:11 <HdkR> https://paste.fedoraproject.org/paste/v~K-8iC05wGOvdF8aS9yug Oops. I didn't realize the link had a tilde in it and my GNU screen setup eats them.

14:20 tgall_foo has quit [Ping timeout: 246 seconds]

14:42 yann has joined #panfrost

15:42 <alyssa> davidlt: Alright, thank you :)

15:58 <alyssa> HdkR: Oh dear, what've you done

16:05 jernej has joined #panfrost

17:06 yann has quit [Ping timeout: 246 seconds]

17:38 ph5 has quit [Quit: bye]

17:42 BenG83 has quit [Quit: Leaving]

17:55 stikonas has joined #panfrost

18:06 ph5 has joined #panfrost

18:06 raster has quit [Remote host closed the connection]

18:53 afaerber has quit [Quit: Leaving]

19:30 <HdkR> alyssa: Having fun of course

19:31 yann has joined #panfrost

19:39 <HdkR> Was actually curious about what all was required to kick off the beginnings of a vulkan driver in mesa. Definitely some duplication of scripts and things from anv

19:39 <HdkR> Which is what radv did as well

20:06 <HdkR> Also compute is like a core feature of Vulkan. Might be nice for testing things

20:06 <HdkR> :P

20:06 <HdkR> Zink is an interesting prospect as well

20:08 belgin has joined #panfrost

20:19 <robclark> alyssa, btw, did you look at intel genxml? iirc it was generating bitpacked structs, so might be a better fit for you? If not, that kinda sucks, I might still be tempted to invent something to autogen encoding vs decoding from to avoid keeping both in sync..

20:29 belgin has quit [Quit: leaving]

20:36 <Lyude> robclark: what's this about, if you don't mind me asking?

20:36 * robclark just mixing mediums and replying to email on irc :-P

20:37 <robclark> re: somehow generating cmdstream encode and decode from single hw db

20:37 <robclark> (ie. xml or whatever you care to invent.. although genxml and rnndb both use xml and that more or less seems to work ok)
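
The idea of driving both directions from one description can be sketched in a few lines. The field table below is entirely hypothetical (it is not a Panfrost, genxml or rnndb layout); the point is only that `encode` and `decode` read the same table, so they cannot drift apart.

```python
# Hypothetical single source of truth: field name -> (first bit, last bit), inclusive.
FIELDS = {
    "opcode": (0, 7),
    "count": (8, 19),
    "flush": (20, 20),
}

def encode(values: dict) -> int:
    """Pack named field values into one command word."""
    word = 0
    for name, (start, end) in FIELDS.items():
        width = end - start + 1
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << start
    return word

def decode(word: int) -> dict:
    """Unpack a command word using the same table that encode() uses."""
    return {
        name: (word >> start) & ((1 << (end - start + 1)) - 1)
        for name, (start, end) in FIELDS.items()
    }

cmd = {"opcode": 0x12, "count": 300, "flush": 1}
print(hex(encode(cmd)), decode(encode(cmd)))
```

In a real setup the table would come from the XML (or whatever format gets invented) and the two functions would be generated rather than interpreted at run time.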

20:37 tgall_foo has joined #panfrost

20:40 <cwabbott> robclark: I don't know too much about the intel thing, but the tricky thing about it is that there are lots of variable-length structs

20:41 <cwabbott> and of course, everything is done with structs pointing to other structs instead of a single cmdstream

20:42 <robclark> I guess the question is whether something can't be represented as structs.. packed structs, vs what rnndb does, get rid of the restriction that things are packed in multiples of 32b, which seems useful for you.. but not sure if that is all you need

20:42 <cwabbott> they're usually aligned to at least 16 bytes iirc

20:43 <cwabbott> there's the framebuffer struct, where a bit set somewhere means that there's a whole section between the main struct and the array of render targets

20:43 <robclark> I meant whether some field can span dwords, mostly.. which is awkward w/ envytools/rnndb.

20:43 <cwabbott> oh, I've never seen that yet

20:44 <cwabbott> I suspect they use actual C structs in their driver

20:44 <cwabbott> everything is always aligned

20:44 <cwabbott> gotta have dat cmdstream building efficiency!

20:45 <robclark> anyways, when I come across a new gen, I tend to go thru many iterations of updating xml and re-running decoder + regen headers when I'm debugging things.. so keeping the two sides in sync *somehow* seems like a useful thing..

20:46 <robclark> maybe there is something for dealing w/ network protocols that would be a better fit, idk..

20:46 <krh> if you think they're using C structs, you should take a look at genxml

20:46 <krh> it started as "let's use structs and bitfields for the intel command stream"

20:47 <krh> and then it turned out that compilers still generate terrible code for bitfields

20:47 <krh> what genxml gives you is autogenerated "template structs" that you fill out, then pass to an autogenerated pack function that then shifts and masks the values into place

20:48 <krh> this sounds slow, but it generates about as good code as manual shifting and or'ing the values together, since the compiler propagates the values from the template struct

20:50 <krh> as a bonus you can compile it in debug mode, which gives you a place to hook in range checks (where bitfields silently truncate), valgrind checks or even automatic conversion from float to, say, fixed point S8.2 or whatever for linewidth

20:51 <krh> it doesn't handle the "if bit is set, add another optional struct" case, but it's easy enough to handle that by hand

20:53 <krh> ah, I actually wrote a little essay about it: https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/intel/genxml/README
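
The pattern krh describes (fill out a plain template struct, then hand it to a pack function that shifts and masks) can be illustrated roughly as follows. This is not genxml output: `LineState`, its bit positions and `pack_line_state` are invented for the example, and the line width uses an unsigned fixed-point format with two fractional bits purely as an assumption.

```python
from dataclasses import dataclass

@dataclass
class LineState:                 # hypothetical "template struct"
    opcode: int = 0              # bits 0..7
    line_width: float = 0.0      # bits 8..17, unsigned fixed point, 2 fractional bits
    antialias: bool = False      # bit 18

def _bits(value: int, start: int, end: int) -> int:
    width = end - start + 1
    # A C bitfield would silently truncate; the pack step is a place to check instead.
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in bits {start}..{end}")
    return value << start

def pack_line_state(state: LineState) -> int:
    """Shift and mask the template's fields into a single dword."""
    width_fixed = round(state.line_width * 4)   # float -> fixed point conversion
    return (_bits(state.opcode, 0, 7)
            | _bits(width_fixed, 8, 17)
            | _bits(int(state.antialias), 18, 18))

dword = pack_line_state(LineState(opcode=0x7B, line_width=1.5, antialias=True))
print(hex(dword))
```

The caller only ever touches named fields; range checks and unit conversions live in one place, which is the debug-mode hook mentioned above.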

22:08 <HdkR> Alright. pushing this vulkan code to a local branch and ignoring. It'll turn in to a full time job if I attempt that

22:10 <HdkR> Only 20k-40k lines of code seemingly from anv and radv :P

23:01 <HdkR> alyssa: Congrats on commit access

23:03 <alyssa> HdkR: Fun indeed. :P

23:03 <alyssa> krh: Wouldn't it be a better idea just to, you know, optimize gcc or whatever? :P

23:04 <alyssa> Oh wait, this is gcc we're talking about. Never mind, understood ;)

23:14 <Lyude> wait, commit access

23:14 <Lyude> you aren't talking about mesa commit access are you? :)

23:17 <HdkR> Mesa commit access woo

23:17 <Lyude> oh hell yeah! so that means panfrost is upstream now too doesn't it?

23:17 <HdkR> I presume once it is actually pushed yes :P

23:19 <alyssa> Yeah, need to do the actual push. Guess that'll happen today :)

23:58 <HdkR> alyssa: We can start creating a Vulkan driver so we can use Zink right? :)

23:58 <Lyude> what is zink?

23:59 <HdkR> OpenGL emulation over Vulkan

23:59 <HdkR> It's kusma's side project

23:59 <alyssa> HdkR: I mean

23:59 <alyssa> I'm sticking with Gallium :P

23:59 <alyssa> Until Zink and Vulkan-on-Mesa mature

23:59 <HdkR> hehe