alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
dimenus has joined #panfrost
dimenus has quit [Ping timeout: 240 seconds]
stikonas has quit [Remote host closed the connection]
dimenus has joined #panfrost
vstehle has quit [Ping timeout: 256 seconds]
dimenus has quit [Quit: WeeChat 2.8]
icecream95 has joined #panfrost
<chrisf> icecream95: do you *really* want all the gallium machinery?
davidlt has joined #panfrost
buzzmarshall has quit [Remote host closed the connection]
<icecream95> chrisf: Thanks for volunteering - we'll be expecting an initial driver from you next week.
icecrea105 has joined #panfrost
icecream95 has quit [Ping timeout: 260 seconds]
<HdkR> That's a nice timeline for Vulkan driver bringup
kaspter has quit [Quit: kaspter]
<chrisf> icecrea105, heh
Lyude has quit [Ping timeout: 256 seconds]
icecrea105 has quit [Quit: leaving]
icecream95 has joined #panfrost
vstehle has joined #panfrost
Lyude has joined #panfrost
guillaume_g has joined #panfrost
indy has joined #panfrost
<tomeu> chrisf: wait, you ran make install from the kernel source dir?
<bbrezillon> icecream95, chrisf: not sure if that helps, but I had started a vulkan driver for panfrost (well, actually it was more a skeleton for a driver than an actual driver) => https://gitlab.freedesktop.org/bbrezillon/mesa/-/tree/panfrost-vk-experiments
<bbrezillon> that's not to say you should start from there, but feel free to pick anything you think might help
<tomeu> icecream95: do you have already some kind of plan about how development is going to continue?
raster has joined #panfrost
shadeslayer has quit [Quit: The Lounge - https://thelounge.chat]
tomeu has quit [Quit: The Lounge - https://thelounge.chat]
ndufresne has quit [Quit: The Lounge - https://thelounge.chat]
shadeslayer has joined #panfrost
ndufresne has joined #panfrost
tomeu has joined #panfrost
yann has joined #panfrost
nlhowell has joined #panfrost
l-as has quit [Quit: killed]
Ke has quit [Quit: killed]
l-as has joined #panfrost
Ke has joined #panfrost
nlhowell has quit [Ping timeout: 256 seconds]
stikonas has joined #panfrost
<icecream95> tomeu: The first step would be reading (and ignoring) the comment "NOT FOR HARDWARE DRIVERS NEVER WILL BE"
nlhowell has joined #panfrost
paulk-leonov has quit [Ping timeout: 272 seconds]
guillaume_g has quit [Quit: Konversation terminated!]
guillaume_g has joined #panfrost
paulk-leonov has joined #panfrost
nlhowell has quit [Ping timeout: 246 seconds]
guillaume_g has quit [Quit: Konversation terminated!]
guillaume_g has joined #panfrost
icecream95 has quit [Ping timeout: 264 seconds]
tomboy65 has quit [Ping timeout: 240 seconds]
tomboy65 has joined #panfrost
tomboy65 has quit [Ping timeout: 240 seconds]
tomboy65 has joined #panfrost
kaspter has joined #panfrost
kaspter has quit [Quit: kaspter]
kaspter has joined #panfrost
nlhowell has joined #panfrost
guillaume_g has quit [Quit: Konversation terminated!]
guillaume_g has joined #panfrost
cwabbott_ has joined #panfrost
cwabbott has quit [Ping timeout: 272 seconds]
cwabbott_ is now known as cwabbott
<chrisf> tomeu, it's possible that was a mistake -- but it does populate /boot with the bits that look necessary.
* chrisf is far more comfortable with shader compiler guts than fiddling with getting the board to work
<tomeu> yeah, I also hate that
<tomeu> not sure your u-boot knows it has to mount the disk and look in /boot
<chrisf> the existing u-boot (hardkernel's one) was loading an uncompressed image, dtb, and uinitrd from this disk
<tomeu> mmind00: is there an easy way for chrisf to update the kernel in his go advance if he doesn't have a serial console?
<mmind00> tomeu: not sure ... like on my Go, I cleared the vendor uboot from the spi, put one on the sd-card and just am using extlinux to load a kernel also from there
<mmind00> tomeu: so I don't really know how the procedure _with_ the vendor uboot is
<tomeu> chrisf: and what happens when the board boots?
<chrisf> tomeu, in the failure case?
<tomeu> chrisf: yep
<chrisf> the uboot splash stays up, and nothing appears to happen
<chrisf> i imagine with the console i'd see it upset about something
<tomeu> hmm, I think I saw some complaint in the splash when it wasn't able to find the kernel
<chrisf> i *have* confirmed that it's using the uboot on the sd-card. in another experiment i rebuilt that and i could see my one was running.
<chrisf> there's a generic "system failure" complaint in the splash which tells you nothing useful
<tomeu> guess you have the display driver built-in in the kernel?
<tomeu> ah, guess a custom u-boot could make it easier to figure out
<chrisf> i was trying to get the uboot netconsole working so i could see what it was doing
* alyssa popcorns
<alyssa> daniels: when is rk3399 expected back up?
<daniels> alyssa: I was told 'a few hours'
<daniels> alyssa: so probably somewhere between 2-4h from now?
<daniels> (Vivek is replacing switches, recabling, and reconfiguring the network, so it should be a lot more stable and hopefully also faster - at some stage it's also going to get nicely sharded so we don't lose an entire class of test devices from network/switch/power/USB/rack/... outages)
<alyssa> gotcha!
<tomeu> chrisf: no easy way to get a serial cable?
<chrisf> tomeu, last part i need *should* arrive today
<tomeu> nice :)
<chrisf> assuming the uart connected to header along the top actually works :)
<chrisf> on a completely different tack, for vulkan -- it seems there's a few ugly things about the hw that complicate a very cheap mapping
<alyssa> chrisf: oh?
<alyssa> blending for one :p
<chrisf> i think that's actually not the end of the world -- we get to build a monolithic pipeline object which can contain the blend shader if we need one
<alyssa> so what is the end of the world?
<alyssa> min/max index?
<chrisf> ideally we could go all the way to the hardware descriptor structures during command buffer recording
<chrisf> min/max index is kinda gross, yes
<chrisf> but the job descriptor headers are mutated by the hw for status etc -- how to deal with a command buffer being submitted multiple times?
<alyssa> oh, oof
<alyssa> in GL we re-emit the headers (and payloads)
<alyssa> in practice it's not the *worst* thing since all the interesting bits are in other descriptors pointed around
<alyssa> so the actual main job descriptor header+payload isn't as large as it would be in a more conventional architecture
<alyssa> but yeah it's ugly
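The re-emit approach alyssa describes can be sketched roughly as follows. This is only an illustration, not Panfrost code: `RecordedJob`, `emit_for_submission`, `fake_hardware_run`, the 8-byte header, and the status offset and value are all hypothetical. The point is that the recorded header stays pristine and each submission gets its own writable copy for the hardware to mutate.

```python
from dataclasses import dataclass

STATUS_OFFSET = 0   # hypothetical: byte the hardware overwrites with job status
STATUS_DONE = 0x01  # hypothetical status value


@dataclass(frozen=True)
class RecordedJob:
    """A job as recorded into a command buffer (never handed to hardware)."""
    header_template: bytes  # pristine header, immutable
    payload_address: int    # descriptors the hardware only reads


def emit_for_submission(jobs):
    """Copy each pristine header into fresh, writable memory for one submission."""
    return [bytearray(job.header_template) for job in jobs]


def fake_hardware_run(headers):
    """Stand-in for the GPU: scribble a status value into every header."""
    for header in headers:
        header[STATUS_OFFSET] = STATUS_DONE


recorded = [RecordedJob(bytes(8), 0x1000), RecordedJob(bytes(8), 0x2000)]

first = emit_for_submission(recorded)
fake_hardware_run(first)
second = emit_for_submission(recorded)  # resubmission starts from clean headers
```

Only the small header is copied per submission; the larger descriptors it points to are shared, which matches the observation that the interesting state lives elsewhere.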
nlhowell has quit [Ping timeout: 260 seconds]
<chrisf> also what to do with secondary command buffers, where we record a bunch of draws to later include in a primary, but dont necessarily know what render targets will be in use
<chrisf> afaik the blob doesnt even try there -- they defer everything, and inline secondaries into the stream of stuff they produce at the last moment at submission time
nlhowell has joined #panfrost
<chrisf> the job resubmission thing is why vulkan has explicit one-time submission and simultaneous use flags on its command buffers
<chrisf> so if the app says it doesnt need the fully general thing you can do something less weird
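As a rough sketch of how a driver might act on those flags: the two bit values below are the ones defined by the Vulkan specification, but `pick_emit_strategy` and the three strategy names are hypothetical and only show one possible mapping.

```python
# Values from the Vulkan specification (VkCommandBufferUsageFlagBits).
VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT = 0x1
VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT = 0x4


def pick_emit_strategy(usage_flags: int) -> str:
    """Choose how job headers are produced (strategy names are hypothetical)."""
    if usage_flags & VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT:
        # Submitted at most once: hardware may mutate the recorded headers.
        return "emit_in_place"
    if usage_flags & VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT:
        # Several submissions may be pending at once: each needs its own copy.
        return "copy_per_submit"
    # Resubmittable, but never pending twice: restore mutated fields in place.
    return "reset_before_resubmit"
```

The one-time case is the cheap path chrisf is after; the other two pay for generality only when the app asks for it.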
kaspter has quit [Remote host closed the connection]
kaspter has joined #panfrost
<chrisf> icecream95: had you given this stuff any thought yet?
<chrisf> i suppose if you do a gallium state tracker then you dont have to deal with a lot of it, but you burn a bunch of cpu vs a direct mapping
<bbrezillon> chrisf: I had, and IIRC, the conclusion was that we need to have templates for some of those descs and re-emit them (some descs can be kept around, like textures)
<chrisf> bbrezillon: for resubmission, for secondaries, or both?
<bbrezillon> unfortunately I don't remember
<bbrezillon> I guess it was both
<chrisf> bbrezillon: my real mission on this thing is to beat the pants off the blob on cpu overhead.. but that's going to be a long road :)
<bbrezillon> chrisf: I guess we can start with a sub-optimal solution involving a lot of CPU -> GPU copies, and see how we can improve that afterwards
<alyssa> but yes, definitely go for a 'real' vk driver
<bbrezillon> there's a lot of plumbing to do before we can even run a simple VK prog
<alyssa> (i.e. without depending on Gallium)
nlhowell has quit [Quit: WeeChat 2.8]
nlhowell has joined #panfrost
<chrisf> bbrezillon: oh yes :)
<chrisf> bbrezillon: i just got done implementing it in software, well aware of the amount of plumbing :)
<bbrezillon> chrisf: if you don't want to start from scratch, you can check my branch
<bbrezillon> but it's far from functional
<tomeu> chrisf: oh, ANGLE?
Ntemis has joined #panfrost
<chrisf> tomeu: swiftshader
<tomeu> ah, mixed the two
<tomeu> guess then that panfrost will be a walk in the park for you :p
<chrisf> tomeu: on the vulkan side, sure. mali is still plenty weird though ;)
<tomeu> we have alyssa for that :)
raster has quit [Remote host closed the connection]
raster has joined #panfrost
<chrisf> unrelated -- do we have any idea how the idvs jobs work?
<chrisf> this was supposed to be a big deal on bifrost
raster has quit [Client Quit]
raster has joined #panfrost
indy has quit [Ping timeout: 265 seconds]
buzzmarshall has joined #panfrost
indy has joined #panfrost
<cwabbott> chrisf: sounds like a fun project... and yeah, needing the index bounds is a big ouch there... best case you have to patch on command submission (because you don't know if it's been touched by the CPU before then) and worst case you have to emit a compute shader on-the-fly to calculate it or something
<chrisf> cwabbott: the blob emits a compute job to do it
<cwabbott> makes sense I guess
<cwabbott> the user could be evil and record two command buffers, one that writes to a buffer and one that uses the buffer as an index buffer, and submit them at the same time
<chrisf> cwabbott: super evil case: you have two draws in a single renderpass, the first has side effects to mangle the index buffer for the second.
<cwabbott> so you can't really know whether it'll get overwritten by the GPU
<cwabbott> chrisf: at least you can detect that case using dependencies
<cwabbott> and pipeline barriers, if it's between passes
<chrisf> cwabbott: yes, you'll see a pipeline barrier for it
<cwabbott> but if it's in a separate cmd buffer then you won't see the pipeline barrier, is what I'm trying to say
<chrisf> the app *still* has to provide the pipeline barrier even if the dependency is across a command buffer boundary
<cwabbott> it could be in the earlier command buffer though
<chrisf> ah, i see what you're saying
<chrisf> yeah, that's ok.
<cwabbott> so I think that completely defeats your ability to calculate the bounds on the CPU
<chrisf> i think you just dont bother and do it on the GPU
<cwabbott> yeah
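To make the problem concrete, here is a minimal CPU-side sketch of the min/max index scan and of how its result goes stale when the buffer is rewritten after recording. `index_bounds` is a hypothetical helper, and little-endian unsigned indices are an assumption.

```python
import struct

_FORMATS = {1: "B", 2: "H", 4: "I"}  # index size in bytes -> struct code


def index_bounds(index_buffer, index_size, first, count):
    """Return (min, max) over `count` indices starting at index `first`."""
    if count <= 0:
        raise ValueError("draw has no indices")
    fmt = f"<{count}{_FORMATS[index_size]}"
    values = struct.unpack_from(fmt, index_buffer, first * index_size)
    return min(values), max(values)


indices = bytearray(struct.pack("<4H", 3, 7, 1, 9))
bounds_at_record_time = index_bounds(indices, 2, 0, 4)

# Something (another command buffer, an earlier draw) rewrites the buffer later.
struct.pack_into("<H", indices, 2, 40)
bounds_at_execute_time = index_bounds(indices, 2, 0, 4)
```

The two results differ, which is why bounds computed on the CPU at record time cannot be trusted and the scan ends up as a GPU job instead.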
raster has quit [Ping timeout: 240 seconds]
raster has joined #panfrost
tomeu has quit [Quit: Ping timeout (120 seconds)]
shadeslayer has quit [Quit: Ping timeout (120 seconds)]
ndufresne has quit [Quit: Ping timeout (120 seconds)]
ndufresne has joined #panfrost
tomeu has joined #panfrost
shadeslayer has joined #panfrost
<chrisf> tomeu: anything else worth poking at before i get the serial console working?
<alyssa> cwabbott: I had not considered that case ... that is horrifying
yann has quit [Ping timeout: 240 seconds]
<alyssa> Does GL also have that issue, I guess?
<chrisf> alyssa: this is why GL grew things like DrawRangeElements
<alyssa> chrisf: I specifically meant "bind an index buffer as an SSBO and mangle it" etc
<chrisf> oh, absolutely. GL buffers are buffers for any purpose
<alyssa> I guess with our gallium stack, that would trigger a flush so it'd be fine
<chrisf> yeah, im pretty sure gallium takes care of it
<alyssa> 2020-06-30T14:43:06 dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.array_in_struct.mat4_mat2_fragment,Fail
<alyssa> on T720 with my scheduler/RA changes, but fine on t860. wat
robmur01_ is now known as robmur01
<alyssa> Argh, okay, register spilling is broken on t720 at least under some circumstances
<alyssa> Unrelatedly, wondering if we should really just flip fp16 on
<alyssa> We still take a hit on cycle count but I suspect the win in register pressure makes up for it
<alyssa> register spilling fixed on t720 :)
<chrisf> well, i have a working serial console now... but no joy
guillaume_g has quit [Quit: Konversation terminated!]
<alyssa> :v
<HdkR> :>
<robmur01> :^
<chrisf> ive already run the gamut of unimpressed facial expressions ;)
buzzmarshall has quit [Ping timeout: 260 seconds]
tomboy65 has quit [Ping timeout: 240 seconds]
tomboy65 has joined #panfrost
karolherbst has quit [Quit: duh 🐧]
karolherbst has joined #panfrost
<alyssa> so txf with msaa has the sample # in coord.z
jolan has quit [Quit: leaving]
jolan has joined #panfrost
<HdkR> alyssa: As in number of samples or sample index?
<alyssa> index
<alyssa> Okay, so I have txf_ms handled now
<HdkR> With 3D MSAA textures does it extend to w then?
<alyssa> GLES doesn't do 3D MSAA textures
<alyssa> anyway .z makes sense because Mali is treating MSAA textures as 3D textures
<alyssa> where depth = sample count
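A small sketch of that mapping: the sample index goes into the third coordinate, and the texture is addressed as if it were 3D with one layer per sample. Both function names are hypothetical, and the linear layout in `linear_texel_index` is an assumption for illustration only, not the real Mali memory layout.

```python
def txf_ms_coord(x, y, sample_index, sample_count):
    """Build the (x, y, z) fetch coordinate, with the sample index in z."""
    if not 0 <= sample_index < sample_count:
        raise ValueError("sample index out of range")
    return (x, y, sample_index)


def linear_texel_index(coord, width, height):
    """Texel index under a hypothetical linear layout, one layer per sample."""
    x, y, z = coord
    return (z * height + y) * width + x


coord = txf_ms_coord(5, 2, sample_index=3, sample_count=4)
index = linear_texel_index(coord, width=8, height=4)
```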
<HdkR> Ah, it needs GL_OES_texture_storage_multisample_2d_array for MSAA 2D array
<cwabbott> alyssa: yeah, the blessing/curse with vulkan is that the driver never gets the full picture... you can record various commands in parallel, and then only at the very end, after all the descriptors etc. have been generated, does the driver find out what order they're going to be executed
<alyssa> cwabbott: joy
<alyssa> Oh, even better, they literally have 4 separate layers for each sampling... ugh
<alyssa> this can't be good for perf
<alyssa> So all the 'magic' is in rendering ... fun
nlhowell has quit [Ping timeout: 260 seconds]
<cwabbott> chrisf: one thing you should think about early is how to handle renderpasses
<cwabbott> in turnip we're rather lazy at the moment, since we can always fall back to immediate-rendering mode if something goes wrong
<Lyude> HdkR: you started any work on vulkan w/ bifrost yet btw?
<HdkR> wha? Nah. that was just a weekend thing to see what the initial infrastructure work takes
<Lyude> ah
<cwabbott> so we statically divvy up the tile buffer between all the attachments, and if it runs out of space then whoops, let's just use sysmem (immediate-mode rendering) for this renderpass instead
<cwabbott> but you don't really have that luxury with mali
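The turnip-style policy cwabbott describes could be sketched like this. It is a simplified illustration, not turnip's actual code: `plan_renderpass`, the tile size, and the tile buffer budget are all made up.

```python
def plan_renderpass(bytes_per_pixel, tile_width, tile_height, tile_buffer_bytes):
    """Statically place every attachment in the tile buffer, or give up.

    Returns ("tiled", offsets) when everything fits, else ("sysmem", []).
    """
    tile_pixels = tile_width * tile_height
    offsets = []
    cursor = 0
    for bpp in bytes_per_pixel:
        offsets.append(cursor)
        cursor += bpp * tile_pixels
    if cursor > tile_buffer_bytes:
        return "sysmem", []  # fall back to immediate-mode rendering
    return "tiled", offsets


small = plan_renderpass([4, 4], 16, 16, tile_buffer_bytes=4096)
large = plan_renderpass([4, 4, 16, 16], 16, 16, tile_buffer_bytes=4096)
```

Without a sysmem fallback, the second case is exactly where a Mali driver would have to split the pass or spill instead.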
<HdkR> Dang Adreno, being able to cheat
<HdkR> Probably isn't even completely terrible with the platforms that have 68GB/s of memory bandwidth :P
<cwabbott> my understanding is that you're supposed to think of each subpass like an instruction and do "register allocation" on the tile buffer, including spilling
buzzmarshall has joined #panfrost
<cwabbott> and in the downstream kernel there's even some special JIT memory allocation path to handle the "oh shit, we need to allocate a giant framebuffer during command submission" case if you spill
nlhowell has joined #panfrost
<cwabbott> also there are some, err, fun rules around vertex/fragment atomics which mean that as soon as you enable that feature, you may have to split the renderpass into smaller parts
<cwabbott> (that's yet another thing we can get around in turnip by forcing sysmem)
<cwabbott> so getting the "render pass allocator" right is going to be one of the toughest parts of a mali vulkan driver, I think
Ntemis has quit [Read error: Connection reset by peer]
<chrisf> cwabbott: yeah, there's definitely wrinkles there.
<chrisf> cwabbott: adreno *increasingly* gives up and uses direct mode
<chrisf> cwabbott: you're saying there should be explicit management of the tile buffer for mali?
<cwabbott> chrisf: yes
<cwabbott> there needs to be a "compiler" for render passes more-or-less
<cwabbott> that allocates tilebuffer space for each subpass, and figures out when to load/store tiles that have a non-UNDEFINED load/store op
<chrisf> ok, something im not clear on is how the load/store between the tile memory and system memory is actually done
<alyssa> chrisf: tile->system is automatic by the hardware
<alyssa> system->tile on Midgard requires literally texturing
<alyssa> a fullscreen quad
<alyssa> on Bifrost you can set some magic fields in the FRAGMENT job for that but internally I think it's the same, we don't do this quite yet
<alyssa> cwabbott: that's horrifying
<chrisf> likely not *too* bad, and this is the tricky case. vast majority of renderpasses are single-subpass and fit
<chrisf> cwabbott: im going to start by helping out with the GL driver to get my feet wet, vulkan can come later
<alyssa> Looks like there's no support for MSAA 2x, it rounds up to 4x
<alyssa> Just passed my first MSAA test :D
<Lyude> alyssa: nice! now you can finally get your MSAA license
<alyssa> Lyude: I feel like I'm missing a joke
<Lyude> like a driver's license? :P
<HdkR> MSAA? Now everyone is going to ask for alpha to coverage support
<alyssa> HdkR: That's next.
<HdkR> Big responsibility :P
<alyssa> 🤔
<HdkR> Soon AAA games will be running on Panfrost
<alyssa> HdkR: and how do you propose to run AAA games -- built for x86_64 by and large -- on an arm64 chip to test with Panfrost?
<alyssa> ;)
<chrisf> we'll get it on android one day
<chrisf> where there are a few things you could call AAA
<HdkR> alyssa: You get panfrost up to speed, I'll get FEX up to speed, we'll meet in the middle :P
<alyssa> :D
<alyssa> I saw armv8.0 is allowed now :)
<HdkR> Kinda sorta. I'm fighting with TSO memory semantics and it is a bit disgusting
<HdkR> Basically just there for testing
<HdkR> Even ARMv8.3 can't handle all the cases of that
<HdkR> (Alignment problems)
<HdkR> x86-64 is atomic for any data stored to a cacheline, aligned or not
<HdkR> (And I think cross-cacheline just tears rather than faulting)
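Going by HdkR's description, the case an emulator has to care about is whether an access stays within one cache line. A minimal check for that is sketched below; the 64-byte line size is an assumption (typical for x86-64) and `crosses_cache_line` is a hypothetical helper, not FEX code.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86-64 cache line size


def crosses_cache_line(address, size, line_bytes=CACHE_LINE_BYTES):
    """True if the access [address, address + size) spans two cache lines."""
    if size <= 0:
        raise ValueError("access size must be positive")
    return address // line_bytes != (address + size - 1) // line_bytes
```

For example, a 4-byte store at address 60 stays inside one line, while the same store at address 62 straddles the boundary at 64.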
robclark has quit [Read error: Connection reset by peer]
robclark has joined #panfrost
krh has quit [Read error: Connection reset by peer]
krh has joined #panfrost
robmur01 has quit [Ping timeout: 246 seconds]
raster has quit [Quit: Gettin' stinky!]
<chrisf> wow, ive been out of the game so long that you've changed build systems
Ocawesome101 has joined #panfrost
<Ocawesome101> hello
<Ocawesome101> so
<Ocawesome101> i have a pinebook pro, with a Mali T860, running Panfrost
<Ocawesome101> and
<Ocawesome101> Red Eclipse (Cube 2) doesn't like running
<Ocawesome101> i'll have an apitrace shortly
<Ocawesome101> i assume it's a Panfrost issue, since the game launches but has rendering issues
<Ocawesome101> ...it'll be a while, 135m and my internet isn't fast