#panfrost on 2020-06-30 — irc logs at freenode.irclog.whitequark.org

2019-09-06 11:20 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:15 dimenus has joined #panfrost

00:34 dimenus has quit [Ping timeout: 240 seconds]

01:03 stikonas has quit [Remote host closed the connection]

01:14 dimenus has joined #panfrost

02:04 vstehle has quit [Ping timeout: 256 seconds]

02:11 dimenus has quit [Quit: WeeChat 2.8]

02:18 icecream95 has joined #panfrost

03:17 <icecream95> alyssa: https://gitlab.freedesktop.org/icecream95/mesa/-/commits/vulkan-1

03:22 <chrisf> icecream95: do you *really* want all the gallium machinery?

03:34 davidlt has joined #panfrost

04:06 buzzmarshall has quit [Remote host closed the connection]

04:07 <icecream95> chrisf: Thanks for volunteering - we'll be expecting an initial driver from you next week.

04:12 icecrea105 has joined #panfrost

04:15 icecream95 has quit [Ping timeout: 260 seconds]

04:19 <HdkR> That's a nice timeline for Vulkan driver bringup

04:21 kaspter has quit [Quit: kaspter]

04:27 <chrisf> icecrea105, heh

04:48 Lyude has quit [Ping timeout: 256 seconds]

04:50 icecrea105 has quit [Quit: leaving]

04:51 icecream95 has joined #panfrost

05:00 vstehle has joined #panfrost

05:41 Lyude has joined #panfrost

06:08 guillaume_g has joined #panfrost

06:44 indy has joined #panfrost

07:00 <tomeu> chrisf: wait, you ran make install from the kernel source dir?

07:10 <bbrezillon> icecream95, chrisf: not sure if that helps, but I had started a vulkan driver for panfrost (well, actually it was more a skeleton for a driver than an actual driver) => https://gitlab.freedesktop.org/bbrezillon/mesa/-/tree/panfrost-vk-experiments

07:11 <bbrezillon> that's not to say you should start from there, but feel free to pick you think might help you

07:12 <bbrezillon> *pick anything you think might help

07:15 <tomeu> icecream95: do you already have some kind of plan for how development is going to continue?

07:21 raster has joined #panfrost

07:31 shadeslayer has quit [Quit: The Lounge - https://thelounge.chat]

07:31 tomeu has quit [Quit: The Lounge - https://thelounge.chat]

07:31 ndufresne has quit [Quit: The Lounge - https://thelounge.chat]

07:32 shadeslayer has joined #panfrost

07:32 ndufresne has joined #panfrost

07:32 tomeu has joined #panfrost

07:34 yann has joined #panfrost

07:48 nlhowell has joined #panfrost

08:38 l-as has quit [Quit: killed]

08:38 Ke has quit [Quit: killed]

08:50 l-as has joined #panfrost

09:11 Ke has joined #panfrost

09:17 nlhowell has quit [Ping timeout: 256 seconds]

09:20 stikonas has joined #panfrost

09:24 <icecream95> tomeu: The first step would be reading (and ignoring) the comment "NOT FOR HARDWARE DRIVERS NEVER WILL BE"

09:27 nlhowell has joined #panfrost

09:29 paulk-leonov has quit [Ping timeout: 272 seconds]

09:29 guillaume_g has quit [Quit: Konversation terminated!]

09:30 guillaume_g has joined #panfrost

09:30 paulk-leonov has joined #panfrost

09:36 nlhowell has quit [Ping timeout: 246 seconds]

09:40 guillaume_g has quit [Quit: Konversation terminated!]

09:41 guillaume_g has joined #panfrost

10:10 icecream95 has quit [Ping timeout: 264 seconds]

10:40 tomboy65 has quit [Ping timeout: 240 seconds]

10:43 tomboy65 has joined #panfrost

10:49 tomboy65 has quit [Ping timeout: 240 seconds]

10:51 tomboy65 has joined #panfrost

11:25 kaspter has joined #panfrost

11:32 kaspter has quit [Quit: kaspter]

11:32 kaspter has joined #panfrost

11:54 nlhowell has joined #panfrost

12:01 guillaume_g has quit [Quit: Konversation terminated!]

12:14 guillaume_g has joined #panfrost

12:22 cwabbott_ has joined #panfrost

12:24 cwabbott has quit [Ping timeout: 272 seconds]

12:24 cwabbott_ is now known as cwabbott

12:41 <chrisf> tomeu, it's possible that was a mistake -- but it does populate /boot with the bits that look necessary.

12:43 * chrisf is far more comfortable with shader compiler guts than fiddling with getting the board to work

12:55 <tomeu> yeah, I also hate that

12:55 <tomeu> not sure your u-boot knows it has to mount the disk and look in /boot

12:56 <chrisf> the existing u-boot (hardkernel's one) was loading an uncompressed image, dtb, and uinitrd from this disk

12:56 <tomeu> mmind00: is there an easy way for chrisf to update the kernel in his go advance if he doesn't have a serial console?

12:58 <mmind00> tomeu: not sure ... like on my Go, I cleared the vendor uboot from the spi, put one on the sd-card and just am using extlinux to load a kernel also from there

12:58 <mmind00> tomeu: so I don't really know how the procedure _with_ the vendor uboot is

12:59 <tomeu> chrisf: and what happens when the board boots?

12:59 <chrisf> tomeu, in the failure case?

13:00 <tomeu> chrisf: yep

13:00 <chrisf> the uboot splash stays up, and nothing appears to happen

13:01 <chrisf> i imagine with the console i'd see it upset about something

13:02 <tomeu> hmm, I think I saw some complaint in the splash when it wasn't able to find the kernel

13:02 <chrisf> i *have* confirmed that it's using the uboot on the sd-card. in another experiment i rebuilt that and i could see my one was running.

13:02 <chrisf> there's a generic "system failure" complaint in the splash which tells you nothing useful

13:02 <tomeu> guess you have the display driver built-in in the kernel?

13:03 <tomeu> ah, guess a custom u-boot could make it easier to figure out

13:03 <chrisf> i was trying to get the uboot netconsole working so i could see what it was doing

13:11 * alyssa popcorns

13:11 <alyssa> daniels: when is rk3399 expected back up?

13:11 <daniels> alyssa: I was told 'a few hours'

13:12 <daniels> alyssa: so probably somewhere between 2-4h from now?

13:13 <daniels> (Vivek is replacing switches, recabling, and reconfiguring the network, so it should be a lot more stable and hopefully also faster - at some stage it's also going to get nicely sharded so we don't lose an entire class of test devices from network/switch/power/USB/rack/... outages)

13:13 <alyssa> gotcha!

13:23 <tomeu> chrisf: no easy way to get a serial cable?

13:26 <chrisf> tomeu, last part i need *should* arrive today

13:26 <tomeu> nice :)

13:27 <chrisf> assuming the uart connected to header along the top actually works :)

13:29 <chrisf> on a completely different tack, for vulkan -- it seems there's a few ugly things about the hw that complicate a very cheap mapping

13:30 <alyssa> chrisf: oh?

13:30 <alyssa> blending for one :p

13:30 <chrisf> i think that's actually not the end of the world -- we get to build a monolithic pipeline object which can contain the blend shader if we need one

13:31 <alyssa> so what is the end of the world?

13:31 <alyssa> min/max index?

13:31 <chrisf> ideally we could go all the way to the hardware descriptor structures during command buffer recording

13:33 <chrisf> min/max index is kinda gross, yes

13:34 <chrisf> but the job descriptor headers are mutated by the hw for status etc -- how to deal with a command buffer being submitted multiple times?

13:35 <alyssa> oh, oof

13:35 <alyssa> in GL we re-emit the headers (and payloads)

13:36 <alyssa> in practice it's not the *worst* thing since all the interesting bits are in other descriptors pointed around

13:36 <alyssa> so the actual main job descriptor header+payload isn't as large as it would be in a more conventional architecture

13:36 <alyssa> but yeah it's ugly

13:37 nlhowell has quit [Ping timeout: 260 seconds]

13:39 <chrisf> also what to do with secondary command buffers, where we record a bunch of draws to later include in a primary, but dont necessarily know what render targets will be in use

13:40 <chrisf> afaik the blob doesnt even try there -- they defer everything, and inline secondaries into the stream of stuff they produce at the last moment at submission time

13:44 nlhowell has joined #panfrost

13:51 <chrisf> the job resubmission thing is why vulkan has explicit one-time submission and simultaneous use flags on its command buffers

13:51 <chrisf> so if the app says it doesnt need the fully general thing you can do something less weird
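
The resubmission problem discussed above can be sketched roughly as follows. This is a toy model, not Panfrost code: the `CommandBuffer` class, `headers_for_submit`, and `fake_gpu_run` are hypothetical names made up for illustration. Only the two flag values are real; they match `VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT` and `VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT`. The idea is that recorded job headers are treated as templates and copied on each submit unless the app promised a single submission.

```python
# Hypothetical sketch: every name here is invented for illustration.
# Only the two flag values come from VkCommandBufferUsageFlagBits.
ONE_TIME_SUBMIT_BIT = 0x1
SIMULTANEOUS_USE_BIT = 0x4


class CommandBuffer:
    def __init__(self, usage_flags):
        self.usage_flags = usage_flags
        self.job_headers = []  # recorded headers, kept as pristine templates
        self.submit_count = 0

    def record_job(self, payload):
        # "status" stands in for the fields the hardware writes back.
        self.job_headers.append({"payload": payload, "status": 0})

    def headers_for_submit(self):
        """Return the headers the GPU is allowed to scribble on."""
        self.submit_count += 1
        if self.usage_flags & ONE_TIME_SUBMIT_BIT:
            # The app promised one submission: hand over the originals.
            return self.job_headers
        # General case: re-emit a fresh copy so the templates stay clean.
        return [dict(header) for header in self.job_headers]


def fake_gpu_run(headers):
    # Stand-in for the hardware mutating each header it executes.
    for header in headers:
        header["status"] = 1
```

With this shape, the one-time-submit path avoids the copy entirely, which is the "something less weird" case; everything else pays for a re-emit per submission.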

13:55 kaspter has quit [Remote host closed the connection]

13:55 kaspter has joined #panfrost

13:56 <chrisf> icecream95: had you given this stuff any thought yet?

13:57 <chrisf> i suppose if you do a gallium state tracker then you dont have to deal with a lot of it, but you burn a bunch of cpu vs a direct mapping

14:00 <bbrezillon> chrisf: I had, and IIRC, the conclusion was that we need to have templates for some of those descs and re-emit them (some descs can be kept around, like textures)

14:02 <chrisf> bbrezillon: for resubmission, for secondaries, or both?

14:04 <bbrezillon> unfortunately I don't remember

14:05 <bbrezillon> I guess it was both

14:05 <chrisf> bbrezillon: my real mission on this thing is to beat the pants off the blob on cpu overhead.. but that's going to be a long road :)

14:11 <bbrezillon> chrisf: I guess we can start with a sub-optimal solution involving a lot of CPU -> GPU copies, and see how we can improve that afterwards

14:12 <alyssa> but yes, definitely go for a 'real' vk driver

14:12 <bbrezillon> there's a lot of plumbing to do before we can even run a simple VK prog

14:12 <alyssa> (i.e. without depending on Gallium)

14:13 nlhowell has quit [Quit: WeeChat 2.8]

14:14 nlhowell has joined #panfrost

14:14 <chrisf> bbrezillon: oh yes :)

14:15 <chrisf> bbrezillon: i just got done implementing it in software, well aware of the amount of plumbing :)

14:15 <bbrezillon> chrisf: if you don't want to start from scratch, you can check my branch

14:15 <bbrezillon> but it's far from functional

14:32 <tomeu> chrisf: oh, ANGLE?

14:32 Ntemis has joined #panfrost

14:33 <chrisf> tomeu: swiftshader

14:33 <tomeu> ah, mixed the two

14:33 <tomeu> guess then that panfrost will be a walk in the park for you :p

14:34 <chrisf> tomeu: on the vulkan side, sure. mali is still plenty weird though ;)

14:34 <tomeu> we have alyssa for that :)

14:40 raster has quit [Remote host closed the connection]

14:40 raster has joined #panfrost

14:43 <chrisf> unrelated -- do we have any idea how the idvs jobs work?

14:43 <chrisf> this was supposed to be a big deal on bifrost

14:43 raster has quit [Client Quit]

14:45 raster has joined #panfrost

14:54 indy has quit [Ping timeout: 265 seconds]

15:00 buzzmarshall has joined #panfrost

15:02 indy has joined #panfrost

15:13 <cwabbott> chrisf: sounds like a fun project... and yeah, needing the index bounds is a big ouch there... best case you have to patch on command submission (because you don't know if it's been touched by the CPU before then) and worst case you have to emit a compute shader on-the-fly to calculate it or something

15:13 <chrisf> cwabbott: the blob emits a compute job to do it

15:14 <cwabbott> makes sense I guess

15:14 <cwabbott> the user could be evil and record two command buffers, one that writes to a buffer and one that uses the buffer as an index buffer, and submit them at the same time

15:15 <chrisf> cwabbott: super evil case: you have two draws in a single renderpass, the first has side effects to mangle the index buffer for the second.

15:15 <cwabbott> so you can't really know whether it'll get overwritten by the GPU

15:16 <cwabbott> chrisf: at least you can detect that case using dependencies

15:16 <cwabbott> and pipeline barriers, if it's between passes

15:16 <chrisf> cwabbott: yes, you'll see a pipeline barrier for it

15:16 <cwabbott> but if it's in a separate cmd buffer then you won't see the pipeline barrier, is what I'm trying to say

15:17 <chrisf> the app *still* has to provide the pipeline barrier even if the dependency is across a command buffer boundary

15:17 <cwabbott> it could be in the earlier command buffer though

15:18 <chrisf> ah, i see what you're saying

15:18 <chrisf> yeah, that's ok.

15:18 <cwabbott> so I think that completely defeats your ability to calculate the bounds on the CPU

15:19 <chrisf> i think you just dont bother and do it on the GPU

15:19 <cwabbott> yeah
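
For reference, the CPU-side scan being ruled out above amounts to something like the sketch below. The function name `index_bounds` and its parameters are hypothetical, and it assumes a little-endian index buffer that the CPU can read and that nothing on the GPU will rewrite before the draw runs, which is exactly the assumption the discussion shows Vulkan cannot guarantee.

```python
import struct


def index_bounds(index_bytes, first_index, index_count, index_size=2):
    """Scan part of a little-endian index buffer and return (min, max).

    Hypothetical helper: only valid if the buffer contents seen here are
    the contents the GPU will see at draw time.
    """
    if index_count <= 0:
        raise ValueError("index_count must be positive")
    fmt = {1: "B", 2: "H", 4: "I"}[index_size]
    start = first_index * index_size
    indices = struct.unpack_from("<%d%s" % (index_count, fmt), index_bytes, start)
    return min(indices), max(indices)
```

A GPU-side version would do the same reduction in a compute job that runs before the draw, so it sees whatever earlier GPU work wrote to the buffer.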

15:40 raster has quit [Ping timeout: 240 seconds]

15:41 raster has joined #panfrost

16:02 tomeu has quit [Quit: Ping timeout (120 seconds)]

16:02 shadeslayer has quit [Quit: Ping timeout (120 seconds)]

16:02 ndufresne has quit [Quit: Ping timeout (120 seconds)]

16:05 ndufresne has joined #panfrost

16:05 tomeu has joined #panfrost

16:05 shadeslayer has joined #panfrost

16:09 <chrisf> tomeu: anything else worth poking at before i get the serial console working?

16:15 <alyssa> cwabbott: I had not considered that case ... that is horrifying

16:15 yann has quit [Ping timeout: 240 seconds]

16:16 <alyssa> Does GL also have that issue, I guess?

16:17 <chrisf> alyssa: this is why GL grew things like DrawRangeElements

16:17 <alyssa> chrisf: I specifically meant "bind an index buffer as an SSBO and mangle it" etc

16:17 <chrisf> oh, absolutely. GL buffers are buffers for any purpose

16:18 <alyssa> I guess with our gallium stack, that would trigger a flush so it'd be fine

16:18 <chrisf> yeah, im pretty sure gallium takes care of it

16:22 <alyssa> 2020-06-30T14:43:06 dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.array_in_struct.mat4_mat2_fragment,Fail

16:22 <alyssa> on T720 with my scheduler/RA changes, but fine on t860. wat

17:02 robmur01_ is now known as robmur01

17:17 <alyssa> Argh, okay, register spilling is broken on t720 at least under some circumstances

17:24 <alyssa> Unrelatedly, wondering if we should really just flip fp16 on

17:24 <alyssa> We still take a hit on cycle count but I suspect the win in register pressure makes up for it

17:46 <alyssa> register spilling fixed on t720 :)

18:17 <chrisf> well, i have a working serial console now... but no joy

18:25 guillaume_g has quit [Quit: Konversation terminated!]

18:32 <alyssa> :v

18:43 <HdkR> :>

18:43 <robmur01> :^

18:44 <chrisf> ive already run the gamut of unimpressed facial expressions ;)

18:48 buzzmarshall has quit [Ping timeout: 260 seconds]

18:50 tomboy65 has quit [Ping timeout: 240 seconds]

18:52 tomboy65 has joined #panfrost

19:12 karolherbst has quit [Quit: duh 🐧]

19:14 karolherbst has joined #panfrost

19:27 <alyssa> so txf with msaa has the sample # in coord.z

19:31 jolan has quit [Quit: leaving]

19:32 jolan has joined #panfrost

19:32 <HdkR> alyssa: As in number of samples or sample index?

19:33 <alyssa> index

19:33 <alyssa> Okay, so I have txf_ms handled now

19:34 <HdkR> With 3D MSAA textures does it extend to w then?

19:34 <alyssa> GLES doesn't do 3D MSAA textures

19:34 <alyssa> anyway .z makes sense because Mali is treating MSAA textures as 3D textures

19:35 <alyssa> where depth = sample count

19:37 <HdkR> Ah, it needs GL_OES_texture_storage_multisample_2d_array for MSAA 2D array

19:37 <cwabbott> alyssa: yeah, the blessing/curse with vulkan is that the driver never gets the full picture... you can record various commands in parallel, and then only at the very end, after all the descriptors etc. have been generated, does the driver find out what order they're going to be executed

19:38 <alyssa> cwabbott: joy

19:47 <alyssa> Oh, even better, they literally have 4 separate layers for each sampling... ugh

19:47 <alyssa> this can't be good for perf

19:48 <alyssa> So all the 'magic' is in rendering ... fun
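
A rough way to picture "MSAA texture as a 3D texture with depth = sample count" is the address calculation below. It is only a sketch: `msaa_texel_offset` is a hypothetical helper, and it assumes a linear, tightly packed layout with one full-size layer per sample. Real Mali surfaces may be tiled or padded, so the actual offsets will differ.

```python
def msaa_texel_offset(x, y, sample, width, height, bytes_per_pixel=4):
    """Byte offset of one sample, assuming linear, tightly packed layers.

    The sample index plays the role of the z coordinate: each sample
    lives in its own full-size layer, as in a 3D texture.
    """
    row_stride = width * bytes_per_pixel
    layer_stride = row_stride * height  # one whole layer per sample
    return sample * layer_stride + y * row_stride + x * bytes_per_pixel
```

Under this model a txf with the sample index in coord.z is just an ordinary 3D texel fetch.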

19:49 nlhowell has quit [Ping timeout: 260 seconds]

19:51 <cwabbott> chrisf: one thing you should think about early is how to handle renderpasses

19:52 <cwabbott> in turnip we're rather lazy at the moment, since we can always fall back to immediate-rendering mode if something goes wrong

19:53 <Lyude> HdkR: you started any work on vulkan w/ bifrost yet btw?

19:53 <HdkR> wha? Nah. that was just a weekend thing to see what the initial infrastructure work takes

19:53 <Lyude> ah

19:54 <cwabbott> so we statically divvy up the tile buffer between all the attachments, and if it runs out of space then whoops, let's just use sysmem (immediate-mode rendering) for this renderpass instead

19:54 <cwabbott> but you don't really have that luxury with mali

19:54 <HdkR> Dang Adreno, being able to cheat

19:55 <HdkR> Probably isn't even completely terrible with the platforms that have 68GB/s of memory bandwidth :P

19:55 <cwabbott> my understanding is that you're supposed to think of each subpass like an instruction and do "register allocation" on the tile buffer, including spilling

19:56 buzzmarshall has joined #panfrost

19:57 <cwabbott> and in the downstream kernel there's even some special JIT memory allocation path to handle the "oh shit, we need to allocate a giant framebuffer during command submission" case if you spill

19:57 nlhowell has joined #panfrost

19:59 <cwabbott> also there are some, err, fun rules around vertex/fragment atomics which mean that as soon as you enable that feature, you may have to split the renderpass into smaller parts

20:00 <cwabbott> (that's yet another thing we can get around in turnip by forcing sysmem)

20:01 <cwabbott> so getting the "render pass allocator" right is going to be one of the toughest parts of a mali vulkan driver, I think

20:02 Ntemis has quit [Read error: Connection reset by peer]

20:05 <chrisf> cwabbott: yeah, there's definitely wrinkles there.

20:05 <chrisf> cwabbott: adreno *increasingly* gives up and uses direct mode

20:07 <chrisf> cwabbott: you're saying there should be explicit management of the tile buffer for mali?

20:07 <cwabbott> chrisf: yes

20:08 <cwabbott> there needs to be a "compiler" for render passes more-or-less

20:10 <cwabbott> that allocates tilebuffer space for each subpass, and figures out when to load/store tiles that have a non-UNDEFINED load/store op
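
As a very rough illustration of the "register allocation on the tile buffer" idea, the sketch below does a greedy per-subpass fit and reports which attachments would have to spill to memory. It is far simpler than a real allocator (no liveness across subpasses, no load/store op handling), and `allocate_tile_buffer`, the attachment names, and the sizes are all made up for the example.

```python
def allocate_tile_buffer(subpasses, sizes, capacity):
    """Decide, per subpass, which attachments stay in the tile buffer.

    subpasses: list of lists of attachment names used by each subpass.
    sizes:     dict mapping attachment name to the tile-buffer space it needs.
    capacity:  total tile-buffer space available.
    Returns one (resident, spilled) pair of lists per subpass.
    """
    plan = []
    for used in subpasses:
        resident, spilled, free = [], [], capacity
        # Greedy fit: try the largest attachments first.
        for name in sorted(used, key=lambda n: sizes[n], reverse=True):
            if sizes[name] <= free:
                resident.append(name)
                free -= sizes[name]
            else:
                spilled.append(name)  # would need a load/store via memory
        plan.append((resident, spilled))
    return plan
```

A real pass "compiler" would also track which attachments are live across subpasses, so that something resident in one subpass is not needlessly stored and reloaded for the next.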

20:13 <chrisf> ok, something im not clear on is how the load/store between the tile memory and system memory is actually done

20:13 <alyssa> chrisf: tile->system is automatic by the hardware

20:13 <alyssa> system->tile on Midgard requires literally texturing

20:14 <alyssa> a fullscreen quad

20:14 <alyssa> on Bifrost you can set some magic fields in the FRAGMENT job for that but internally I think it's the same, we don't do this quite yet

20:16 <alyssa> cwabbott: that's horrifying

20:17 <chrisf> likely not *too* bad, and this is the tricky case. vast majority of renderpasses are single-subpass and fit

20:22 <chrisf> cwabbott: im going to start by helping out with the GL driver to get my feet wet, vulkan can come later

20:23 <alyssa> Looks like there's no support for MSAA 2x, it rounds up to 4x

20:48 <alyssa> Just passed my first MSAA test :D

20:51 <Lyude> alyssa: nice! now you can finally get your MSAA license

20:52 <alyssa> Lyude: I feel like I'm missing a joke

20:52 <Lyude> like a driver's license? :P

20:52 <HdkR> MSAA? Now everyone is going to ask for alpha to coverage support

20:53 <alyssa> HdkR: That's next.

20:53 <HdkR> Big responsibility :P

20:53 <alyssa> 🤔

20:54 <HdkR> Soon AAA games will be running on Panfrost

20:55 <alyssa> HdkR: and how do you propose to run AAA games -- built for x86_64 by and large -- on an arm64 chip to test with Panfrost?

20:55 <alyssa> ;)

20:56 <chrisf> we'll get it on android one day

20:57 <chrisf> where there are a few things you could call AAA

20:58 <HdkR> alyssa: You get panfrost up to speed, I'll get FEX up to speed, we'll meet in the middle :P

20:58 <alyssa> :D

20:58 <alyssa> I saw armv8.0 is allowed now :)

20:59 <HdkR> Kinda sorta. I'm fighting with TSO memory semantics and it is a bit disgusting

20:59 <HdkR> Basically just there for testing

21:00 <HdkR> Even ARMv8.3 can't handle all the cases of that

21:01 <HdkR> (Alignment problems)

21:02 <HdkR> x86-64 is atomic for any data stored to a cacheline, aligned or not

21:08 <HdkR> (And I think cross-cacheline just tears rather than faulting)

21:14 robclark has quit [Read error: Connection reset by peer]

21:15 robclark has joined #panfrost

21:15 krh has quit [Read error: Connection reset by peer]

21:15 krh has joined #panfrost

21:53 robmur01 has quit [Ping timeout: 246 seconds]

21:55 raster has quit [Quit: Gettin' stinky!]

22:24 <chrisf> wow, ive been out of the game so long that you've changed build systems

23:18 Ocawesome101 has joined #panfrost

23:26 <Ocawesome101> hello

23:26 <Ocawesome101> so

23:26 <Ocawesome101> i have a pinebook pro, with a Mali T860, running Panfrost

23:26 <Ocawesome101> and

23:26 <Ocawesome101> Red Eclipse (Cube 2) doesn't like running

23:26 <Ocawesome101> i'll have an apitrace shortly

23:28 <Ocawesome101> i assume it's a Panfrost issue, since the game launches but has rendering issues

23:30 <Ocawesome101> ...it'll be a while, 135m and my internet isn't fast