alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - - Logs - <daniels> avoiding X is a huge feature
<alyssa_> Okay woah woah woah wait, the blob is *not* using AFBC for the depth FBO. What?!
<HdkR> Not entirely unheard of but ow
<HdkR> I think most everyone does some form of Z compression these days
<alyssa_> HdkR: No, no, it's right
<alyssa_> For this sample, performance is *improved* by a linear z buffer (rather than AFBC)
<alyssa_> but... why?
<HdkR> Maybe the AFBC hardware compressor has a throughput limit that doesn't work well with depth
<alyssa_> HdkR: I've defn seen AFBC in other cases tho
* alyssa_ grumbles
<alyssa_> I guess I'll disable depth AFBC for now but
<alyssa_> I'm confused
<alyssa_> Regardless, glmark refract at 2400x1600 is up to 18fps. I'll take that win any day :)
<alyssa_> At 800x600 in wayland, it's hitting 57fps. Coolio :P
<alyssa_> Oh, shoot, the FBOs were only partially rendering... fixing that and perf goes back a bit, lame :p
stikonas has quit [Remote host closed the connection]
<HdkR> hah. Oops
<alyssa_> HdkR: *tries to get traces off odroid*
<HdkR> Hm?
<alyssa_> HdkR: Your odroid. I forget how to get any panwraps off it
<alyssa_> blob won't even run
<HdkR> Probably need to export DISPLAY?
<alyssa_> I tried
<alyssa_> Error: couldn't open display :0
<HdkR> Ah hm
<HdkR> Lemme check
<HdkR> Looks like X died. Probably need to run an instance or restart it
<alyssa_> HdkR: I forgot how to get root :p
<HdkR> Oh. You'd have to login under your user account and sudo from there I guess since the password of the odroid account is different
<alyssa_> ...No clue what my user account pw was either :p
<HdkR> lol. You can close w/e and ask me to restart it I guess :P
<alyssa_> ?
<HdkR> No idea if you're doing anything on it. Can just let me know what point it is safe to restart
<alyssa_> HdkR: It's safe
<HdkR> Restarting
<HdkR> Restarted and running
<alyssa_> es2gears_x11: ../src/gallium/drivers/panfrost/pan_resource.c:56: panfrost_resource_from_handle: Assertion `whandle->type == WINSYS_HANDLE_TYPE_FD' failed.
<alyssa_> Oh bloody
<HdkR> Oops. Maybe I installed to /usr :P
<HdkR> need me to reinstall the mali-x11 package? :D
<alyssa_> HdkR: Maybe
<alyssa_> Yeah, that'll be faster than me trying to clean up from userspace :p
<HdkR> Need to restart again
<alyssa_> It's safe
<HdkR> Hm, it's being derpy
<alyssa_> It -is- a legacy driver
* alyssa_ will probably have manually tried every bit before HdkR finishes
<HdkR> Oh. It's running wrong kernel as well. derp
<alyssa_> Nice.
* alyssa_ continues bruteforcing just as a race against HdkR
<HdkR> There we go. Back up and running
<alyssa_> :D
<HdkR> Forgot that I broke userspace AND kernelspace on that device
<alyssa_> Cute
<alyssa_> Do I have a not-ancient panwrap? No, of course not
<HdkR> :D
<alyssa_> Wonder if the other panwrap will work
<HdkR> The panwrap that was built on it should theoretically work :D
<alyssa_> HdkR: I refuse to use pre-2019 panwrap
<HdkR> :D
<alyssa_> Bugz, bugz, everywhere!
<alyssa_> Back in business!
<alyssa_> (Well, loosely speaking... it's not a business... and I was never really gone... :P)
<alyssa_> ...Does T600 support AFBC?
<HdkR> Might not
<alyssa_> Slightly disappointing
<HdkR> Can't remember the timeline :P
* alyssa_ debates implementing .pos
<HdkR> .pos?
<alyssa_> HdkR: max(x, 0) for free with any instruction
<HdkR> Neat
<HdkR> Figured SAT would be more common than max though
pH5 has quit [Quit: bye]
cwabbott has quit [Remote host closed the connection]
pH5 has joined #panfrost
BenG83 has joined #panfrost
<bbrezillon> hm, I'm not sure I understand the concept of cores and core groups
<bbrezillon> does anyone know what's encoded in shader_present?
<bbrezillon> is it the bitmask of available shader cores per core group, or is it a bitmask encoding all available shader cores, no matter the coherency group they belong to?
<bbrezillon> looking at how kbase_gpuprops_construct_coherent_groups() manipulates the ->shader_present field I'd say it's the latter, but I'm not entirely sure
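To make the question concrete: if shader_present really is one flat bitmask with a bit per shader core (the reading bbrezillon leans towards, not confirmed in this log), then counting cores is a popcount over the mask. A minimal sketch under that assumption; both helper names are hypothetical and not part of kbase or panfrost.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of shader cores, ASSUMING shader_present
 * is a flat bitmask with one bit per core across all core groups. */
static unsigned count_shader_cores(uint64_t shader_present)
{
    unsigned n = 0;

    while (shader_present) {
        shader_present &= shader_present - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: index of the highest present core, or -1 if the
 * mask is empty. */
static int highest_shader_core(uint64_t shader_present)
{
    int idx = -1;

    while (shader_present) {
        shader_present >>= 1;
        idx++;
    }
    return idx;
}
```

Under that reading, a mask of 0x7 would mean three cores, numbered 0 to 2.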
jernej has quit [Remote host closed the connection]
cwabbott has joined #panfrost
<tomeu> alyssa_, robher: so, though the kernel and hw have remained functional after any fault I have seen before, messing with mali_rt_format as alyssa said breaks things quite badly
<tomeu> so bad that jobs only succeed again after a reprobe
<tomeu> I have tried with soft and hard resets, but I still get the same behavior:
<tomeu> all jobs timeout and the GPU status is POWER_FAULT
<tomeu> (0x41)
<tomeu> any ideas of what else needs to be done so we can recover from that fault?
<robher> tomeu: I assume there's probably a bunch of things we have to re-init after a reset. Such as enabling power...
<tomeu> didn't see kbase doing that, though
<tomeu> so I was thinking that a reset wouldn't need power up stuff again
<robher> humm...
<robher> tomeu: did you see anholt's comment on implicit fences?
<robher> tomeu: "More importantly, I think you also have my bug of not doing implicit synchronization on buffers, which will break X11 rendering sometimes."
<robher> s/fence/sync/
<tomeu> yeah, I haven't addressed those yet
<bbrezillon> robher: I see you've removed panfrost_regs.h, can we add it back and place the GPU reg defs there so that I can put the perfmon code in panfrost_perfmon.c?
<bbrezillon> unless you prefer to have the perfmon code directly in panfrost_gpu.c
* robher grumbles
<robher> bbrezillon: I guess that depends on how long the code is. But no one else seems to like the split so I guess panfrost_regs.h will be resurrected.
<tomeu> robher: I like the idea behind the decision, but I'm slightly bothered by having a long list of defines before the actual code
<tomeu> alyssa_: do you have any ideas for coming up with a minimal job that could serve as some kind of "ping" to make sure that the GPU isn't hung?
<tomeu> that would be handy to have in igt
<bbrezillon> robher: should be between 500-1000 LoC, so nothing really big. I'm fine putting the code directly in panfrost_gpu.c if you prefer this option
<tomeu> that sounds pretty big to me :)
<robher> bbrezillon: it's done.
<bbrezillon> robher: thx
<bbrezillon> robher: next question :)
<bbrezillon> I need to allocate a bunch of memory that will be passed to the GPU to gather perfcnt data
<bbrezillon> I thought about using a BO
<bbrezillon> this way I can reuse the panfrost_mmu_map() code
<robher> bbrezillon: Is there a question in there? That sounds fine to me.
<bbrezillon> is that the correct way to do it (note that I'll be accessing the buffer in kernel space, so I'll have to call drm_gem_vmap() after the GPU is done transferring the data to the memory)
<bbrezillon> ?
<robher> Yes, that should work.
<robher> However, we'll need to make create_bo available outside the ioctl.
<bbrezillon> yep
<bbrezillon> actually, I'll be using drm_gem_shmem_create() and not drm_gem_shmem_create_with_handle(), so I'm not sure there's a lot to share here
<robher> bbrezillon: it's the pinning of pages and dma mapping part that you need.
<bbrezillon> robher: yep, but AFAICT that part is not done in the create_bo ioctl
<robher> bbrezillon: ah yes, I did move that into panfrost_mmu_map.
shenghaoyang has joined #panfrost
<tomeu> kbase's pm code is a damned maze
<tomeu> I'm just going to assume we need to power up the cores again
<tomeu> as there's a comment that suggests it
<alyssa_> HdkR: hw can do both sat and pos, NIR can only do sat
<HdkR> alyssa_: Makes sense
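For reference, the two output modifiers can be written as plain C. This is only a sketch, assuming .pos clamps negative results to zero (max(x, 0)) and .sat clamps into [0, 1]; the function names are hypothetical and NaN handling is ignored.

```c
#include <assert.h>

/* ASSUMED semantics of .pos: negative results become zero, i.e. max(x, 0). */
static float outmod_pos(float x)
{
    return x > 0.0f ? x : 0.0f;
}

/* ASSUMED semantics of .sat: clamp the result into [0, 1]. */
static float outmod_sat(float x)
{
    if (x < 0.0f)
        return 0.0f;
    if (x > 1.0f)
        return 1.0f;
    return x;
}
```

The difference only shows for results above 1: .pos passes them through, .sat clamps them.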
<alyssa_> tomeu: Mm, yeah, it's trivial to do a clear
<alyssa_> Since that's _just_ a FRAGMENT job and a fragment MFBD
<alyssa_> bbrezillon: Oo, patches :p
<alyssa_> tomeu: To be fair, so is mesa.. :P
<tomeu> alyssa_: do you have anything that could guide me in crafting such a simple clear job?
<tomeu> could be very useful in igt
<alyssa_> tomeu: Yup!
<alyssa_> Run any app really with PANTRACE_BASE=[a folder]
<alyssa_> And then pandecode [that folder]
<alyssa_> SET_VALUE/VERTEX/TILER jobs are used for draws
<alyssa_> FRAGMENT jobs are used for, well, "run the fragment shaders and writeback the framebuffer"
<alyssa_> If you run a FRAGMENT job by itself (no shaders, no nothing), it'll just clear the framebuffer to the provided colour
<tomeu> awesome
<alyssa_> From that trace, you'll need to allocate a bunch of BOs to fill out the addresses
<alyssa_> (scratchpad, tiler_scratch_*, tiler_heap_*, framebuffer)
<alyssa_> See pan_context for how that works
<tomeu> ok, I think I could get something quickly based on that info
<alyssa_> bifrost_framebuffer and bifrost_render_target need to be sequential in memory (see the UPLOAD routine in pan_mfbd which I emailed last night)
<tomeu> i915 seems to expect a hang when submitting a zeroed buffer, hopefully it will also work here
<alyssa_> And the "framebuffer_p" pointer is to the GPU address of the beginning of those two
<alyssa_> tomeu: Pretty sure that'll DATA_INVALID_FAULT but no hang
<alyssa_> tomeu: I've only really seen a hang messing with that RT format
<alyssa_> Which incidentally you can do from the FRAGMENT-only snippet :)
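The "sequential in memory" requirement alyssa_ describes for bifrost_framebuffer and bifrost_render_target can be sketched as follows. The struct contents and sizes are placeholders, not the real descriptor layouts, and upload_fb_and_rt is a hypothetical name rather than the UPLOAD routine from pan_mfbd; the sketch also ignores any flag bits the real pointer may carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder descriptors: the real field layouts and sizes are NOT
 * reproduced here, these only stand in for the two structures. */
struct fake_framebuffer   { uint32_t words[32]; };
struct fake_render_target { uint32_t words[16]; };

/* Copy fb, then rt directly after it, into the CPU mapping of a buffer
 * and return the GPU address of the start of the pair, which is what
 * "framebuffer_p" is described as pointing to. */
static uint64_t upload_fb_and_rt(uint8_t *cpu, uint64_t gpu_base, size_t offset,
                                 const struct fake_framebuffer *fb,
                                 const struct fake_render_target *rt)
{
    memcpy(cpu + offset, fb, sizeof(*fb));
    memcpy(cpu + offset + sizeof(*fb), rt, sizeof(*rt));
    return gpu_base + offset;
}
```

The caller is expected to make sure the buffer has room for both structures at the chosen offset.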
<bbrezillon> robher: all BOs are Non-cached, both from the CPU and GPU perspective, right?
<alyssa_> bbrezillon: GPU caches a lot internally; we have no involvement in that
<bbrezillon> don't we have a way to mark a region non-cacheable in the MMU table?
<alyssa_> bbrezillon: I think so but you're not supposed to use it afaict :p
<bbrezillon> (I mean MMU on the GPU side, so IOMMU in that case)
<bbrezillon> ok, then I need a way to flush the GPU cache
<alyssa_> I recall UNCACHED_GPU has a big red flag "Only works on aarch64, won't work for a lot of buffers depending on what the GPU does with it, read the architecture documentation [:V]" soo
<alyssa_> bbrezillon: GPU cache gets flushed automagically on job submit / end IIRC
<narmstrong> first run of drm driver on S912 (T820) : [ 2.521453] panfrost d00c0000.gpu: clock rate = 666666666
<narmstrong> [ 2.530629] panfrost d00c0000.gpu: features: 00000000,101e76ff, issues: 00000000,24040400
<narmstrong> [ 2.523028] panfrost d00c0000.gpu: mali-t820 id 0x820 major 0x1 minor 0x0 status 0x0
<narmstrong> [ 2.550462] panfrost d00c0000.gpu: shader_present=0x7
<narmstrong> [ 2.538736] panfrost d00c0000.gpu: Features: L2:0x07110206 Shader:0x00000000 Tiler:0x00000809 Mem:0x1 MMU:0x00002821 AS:0xff JS:0x7
<narmstrong> [ 2.555537] panfrost d00c0000.gpu: gpu error irq state=601 status=21
<narmstrong> [ 2.556487] [drm] Initialized panfrost 1.0.0 20180908 for d00c0000.gpu on minor 0
<alyssa_> So when the job finishes (rendering done, etc), caches are flushed out from GPU->CPU
<bbrezillon> unfortunately counters dump takes a different path => GPU_COMMAND_PRFCNT_SAMPLE
* alyssa_ blinks
<bbrezillon> so I'm not sure I can rely on the JS auto-flush
<alyssa_> bbrezillon: There *is* a sync ioctl on kbase
<alyssa_> But we don't use it for anything since the autoflush has been fine so far
<bbrezillon> I'll have a look
* alyssa_ poof
<robher> bbrezillon: currently everything is non-cached (write-combine in Linux terms). The kbase driver is inner WT by default.
<bbrezillon> I remember seeing a comment stating that in the DRM driver, couldn't find it again
<bbrezillon> alyssa_: looks like they try to map the counter dump buf GPU-uncached when possible
<robher> bbrezillon: though I'm not sure what is 'default' and what isn't as several caching modes are supported. We only support what the ARM SMMU supports.
<bbrezillon> robher: and how is GPU-side cache maintenance supposed to be done in Linux?
<bbrezillon> is it the responsibility of the iommu?
<robher> bbrezillon: every job submit does a cache flush.
<robher> which may be redundant ATM.
<bbrezillon> so, that's the auto-flush alyssa_ was mentioning, right?
<bbrezillon> okay, I see it in panfrost_job.c, but I fear I need something else for the PRFCNT_SAMPLE operation
<bbrezillon> alyssa_: looks like they do an explicit clean+invalidate when GPU-uncached mode is not supported
<bbrezillon> so that's probably something I'll have to take care of
<hanetzer> blep
<hanetzer> sup kernel dude :)
<urjaman> i just made a small arch rootfs on a usb stick, then realized i wanted it on a different usb stick, and then ended up trying to clone the destination disk over the source
<hanetzer> oof
<urjaman> ah well, nothing important there ... let's pacstrap another rootfs *sigh*
<urjaman> i'm happy that the sd* namespace is different from the mmcblk :P
<hanetzer> yeah
<hanetzer> that is rather helpful in avoiding screwups :P
<Lyude> narmstrong: nice!
jernej has joined #panfrost
pH5 has quit [Quit: bye]
pH5 has joined #panfrost
<narmstrong> Lyude: but some work needed !
<narmstrong> [ 131.452419] panfrost d00c0000.gpu: js fault, js=1, status=JOB_CONFIG_FAULT, head=0x2e00, tail=0x2e00
<narmstrong> [ 131.966860] panfrost d00c0000.gpu: gpu sched timeout, js=1, status=0x40, head=0x2e00, tail=0x2e00
<narmstrong> [ 131.970407] panfrost d00c0000.gpu: js fault, js=0, status=JOB_CONFIG_FAULT, head=0x2f80, tail=0x2f80
<narmstrong> [ 132.478827] panfrost d00c0000.gpu: gpu sched timeout, js=0, status=0x40, head=0x2f80, tail=0x2f80
<narmstrong> [ 131.262878] panfrost d00c0000.gpu: gpu sched timeout, js=0, status=0x40, head=0x400b00, tail=0x400b00
<narmstrong> [ 132.665499] panfrost d00c0000.gpu: AS_ACTIVE bit stuck
<narmstrong> seems the DRM driver will also need the PWR_KEY and PWR_OVERRIDE hacks...
<narmstrong> tomeu: robher: how do you manage the panfrost drm dev? on the fd-o gitlab or should we start on the dri-devel ML?
<narmstrong> got rid of AS_ACTIVE bit stuck, but still have the sched timeout and js fault...
<robher> A MR for panfrost/linux or the ML is fine. Once upstream it will be ML.
<robher> narmstrong: ^^^
<narmstrong> robher: oki, will push a MR once I figure out why I have all these faults ;-)
<robher> narmstrong: so far, it's just been the 2 of us...
TheCycoONE has quit [Quit: ZNC 1.7.2 -]
TheCycoONE has joined #panfrost
shenghaoyang has quit [Read error: Connection reset by peer]
<narmstrong> well, no idea why I get these js faults...
shenghaoyang has joined #panfrost
stikonas has joined #panfrost
shenghaoyang has quit [Remote host closed the connection]
<robher> narmstrong: you are up to date with stuff I pushed in yesterday? There were some things hardcoded to t860 r2+ that I fixed to match the kbase driver (hopefully).
<robher> narmstrong: and does the kbase driver work?
<narmstrong> robher: i took the latest panfrost-5.0 branch on the panfrost Linux repo and the mainline-driver branch from tomeu's Mesa repo
<narmstrong> robher: mali kbase works with some tweaks, seems the automatic power management does not work and soft reset doesn't either, tried to add the same tweaks to the drm driver but seems something is still missing
<narmstrong> The mali_kbase of the panfrost gitlab has all the tweaks to work
jernej has quit [Remote host closed the connection]
gtucker has quit [Ping timeout: 259 seconds]
tomeu has quit [Ping timeout: 264 seconds]
<alyssa_> robher: Just read through your new kernel changes, looks great! :)
<robher> alyssa_: let me know if I missed any review comments.
pH5 has quit [Quit: bye]