#panfrost on 2019-03-13 — irc logs at freenode.irclog.whitequark.org

2019-02-15 17:52 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

01:11 <alyssa_> Okay woah woah woah wait, the blob is *not* using AFBC for the depth FBO. What?!

01:15 <HdkR> Not entirely unheard of but ow

01:16 <HdkR> I think most everyone does some form of Z compression these days

01:17 <alyssa_> HdkR: No, no, it's right

01:17 <alyssa_> For this sample, performance is *improved* by a linear z buffer (rather than AFBC)

01:17 <alyssa_> but... why?

01:22 <HdkR> Maybe the AFBC hardware compressor has a throughput limit that doesn't work well with depth

01:26 <alyssa_> HdkR: I've defn seen AFBC in other cases tho

01:29 * alyssa_ grumbles

01:29 <alyssa_> I guess I'll disable depth AFBC for now but

01:29 <alyssa_> I'm confused

01:32 <alyssa_> Regardless, glmark refract at 2400x1600 is up to 18fps. I'll take that win any day :)

01:34 <alyssa_> At 800x600 in wayland, it's hitting 57fps. Coolio :P

01:45 <alyssa_> Oh, shoot, the FBOs were only partially rendering... fixing that and perf goes back a bit, lame :p

01:53 stikonas has quit [Remote host closed the connection]

01:53 <HdkR> hah. Oops

02:27 <alyssa_> HdkR: *tries to get traces off odroid*

02:27 <HdkR> Hm?

02:27 <alyssa_> HdkR: Your odroid. I forget how to get any panwraps off it

02:28 <alyssa_> blob won't even run

02:28 <HdkR> Probably need to export DISPLAY?

02:28 <alyssa_> I tried

02:28 <alyssa_> Error: couldn't open display :0

02:29 <HdkR> Ah hm

02:29 <HdkR> Lemme check

02:30 <HdkR> Looks like X died. Probably need to run an instance or restart it

02:30 <alyssa_> HdkR: I forgot how to get root :p

02:32 <HdkR> Oh. You'd have to login under your user account and sudo from there I guess since the password of the odroid account is different

02:32 <alyssa_> ...No clue what my user account pw was either :p

02:33 <HdkR> lol. You can close w/e and ask me to restart it I guess :P

02:33 <alyssa_> ?

02:34 <HdkR> No idea if you're doing anything on it. Can just let me know what point it is safe to restart

02:34 <alyssa_> HdkR: It's safe

02:34 <HdkR> Restarting

02:35 <HdkR> Restarted and running

02:37 <alyssa_> es2gears_x11: ../src/gallium/drivers/panfrost/pan_resource.c:56: panfrost_resource_from_handle: Assertion `whandle->type == WINSYS_HANDLE_TYPE_FD' failed.

02:37 <alyssa_> Oh bloody

02:37 <HdkR> Oops. Maybe I installed to /usr :P

02:38 <HdkR> need me to reinstall the mali-x11 package? :D

02:38 <alyssa_> HdkR: Maybe

02:39 <alyssa_> Yeah, that'll be faster than me trying to clean up from userspace :p

02:40 <HdkR> Need to restart again

02:40 <alyssa_> It's safe

02:44 <HdkR> Hm, it's being derpy

02:44 <alyssa_> It -is- a legacy driver

02:45 * alyssa_ will probably have manually tried every bit before HdkR finishes

02:46 <HdkR> Oh. It's running wrong kernel as well. derp

02:46 <alyssa_> Nice.

02:47 * alyssa_ continues bruteforcing just as a race against HdkR

02:49 <HdkR> There we go. Back up and running

02:49 <alyssa_> :D

02:49 <HdkR> Forgot that I broke userspace AND kernelspace on that device

02:50 <alyssa_> Cute

02:50 <alyssa_> Do I have a not-ancient panwrap? No, of course not

02:50 <HdkR> :D

02:51 <alyssa_> Wonder if the other panwrap will work

02:51 <HdkR> The panwrap that was built on it should theoretically work :D

02:53 <alyssa_> HdkR: I refuse to use pre-2019 panwrap

02:53 <HdkR> :D

02:54 <alyssa_> Bugz, bugz, everywhere!

02:56 <alyssa_> Back in business!

02:56 <alyssa_> (Well, loosely speaking... it's not a business... and I was never really gone... :P)

03:00 <alyssa_> ...Does T600 support AFBC?

03:01 <HdkR> Might not

03:02 <alyssa_> Slightly disappointing

03:02 <HdkR> Can't remember the timeline :P

05:03 * alyssa_ debates implementing .pos

05:04 <HdkR> .pos?

05:11 <alyssa_> HdkR: min(x, 0) for free with any instruction

05:13 <HdkR> Neat

05:13 <HdkR> Figured SAT would be more common than min though

08:10 pH5 has quit [Quit: bye]

09:01 cwabbott has quit [Remote host closed the connection]

09:03 pH5 has joined #panfrost

09:58 BenG83 has joined #panfrost

10:20 <bbrezillon> hm, I'm not sure I understand the concept of cores and core groups

10:21 <bbrezillon> does anyone know what's encoded in shader_present?

10:22 <bbrezillon> is it the bitmask of available shader cores per core group, or is it a bitmask encoding all available shader cores, no matter the coherency group they belong to?

10:25 <bbrezillon> looking at how kbase_gpuprops_construct_coherent_groups() manipulates the ->shader_present field I'd say it's the latter, but I'm not entirely sure
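
A minimal sketch of the second reading above, assuming (as bbrezillon tentatively concludes, and nothing in this log confirms) that shader_present holds one bit per available shader core regardless of coherency group. The helper name `shader_cores` is hypothetical; 0x7 is the shader_present value narmstrong reports for the T820 later in this log.

```python
def shader_cores(shader_present: int) -> list:
    """Return the indices of the set bits in a shader_present-style bitmask.

    Assumption: each set bit marks one available shader core, independent
    of the core group it belongs to.
    """
    return [
        bit
        for bit in range(shader_present.bit_length())
        if (shader_present >> bit) & 1
    ]


mask = 0x7  # value printed as shader_present=0x7 later in the log
cores = shader_cores(mask)
print("core indices:", cores, "count:", len(cores))
```

Under that assumption, 0x7 decodes to three cores at indices 0, 1 and 2. If the field were per core group instead, the same mask would need the group layout to be interpreted.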

11:12 jernej has quit [Remote host closed the connection]

11:14 cwabbott has joined #panfrost

12:43 <tomeu> alyssa_, robher: so, though the kernel and hw have remained functional after any fault I have seen before, messing with mali_rt_format as alyssa said breaks things quite badly

12:43 <tomeu> so bad that jobs only succeed again after a reprobe

12:43 <tomeu> I have tried with soft and hard resets, but I still get the same behavior:

12:44 <tomeu> all jobs timeout and the GPU status is POWER_FAULT

12:44 <tomeu> (0x41)

12:44 <tomeu> any ideas of what else needs to be done so we can recover from that fault?

12:47 <robher> tomeu: I assume there's probably a bunch of things we have to re-init after a reset. Such as enabling power...

12:47 <tomeu> didn't see kbase doing that, though

12:47 <tomeu> so I was thinking that a reset wouldn't need power up stuff again

12:49 <robher> humm...

12:50 <robher> tomeu: did you see anholt's comment on implicit fences?

12:51 <robher> tomeu: "More importantly, I think you also have my bug of not doing implicit synchronization on buffers, which will break X11 rendering sometimes."

12:51 <robher> s/fence/sync/

12:52 <tomeu> yeah, I haven't addressed those yet

13:04 <bbrezillon> robher: I see you've removed panfrost_regs.h, can we add it back and place the GPU reg defs there so that I can put the perfmon code in panfrost_perfmon.c?

13:05 <bbrezillon> unless you prefer to have the perfmon code directly in panfrost_gpu.c

13:10 * robher grumbles

13:11 <robher> bbrezillon: I guess that depends on how long the code is. But no one else seems to like the split so I guess panfrost_regs.h will be resurrected.

13:21 <tomeu> robher: I like the idea behind the decision, but I'm slightly bothered by having a long list of defines before the actual code

13:22 <tomeu> alyssa_: do you have any ideas for coming up with a minimal job that could serve as some kind of "ping" to make sure that the GPU isn't hung?

13:22 <tomeu> that would be handy to have in igt

13:26 <bbrezillon> robher: should be between 500-1000 LoC, so nothing really big. I'm fine putting the code directly in panfrost_gpu.c if you prefer this option

13:27 <tomeu> that sounds pretty big to me :)

13:41 <robher> bbrezillon: it's done.

13:46 <bbrezillon> robher: thx

13:47 <bbrezillon> robher: next question :)

13:47 <bbrezillon> I need to allocate a bunch of memory that will be passed to the GPU to gather perfcnt data

13:47 <bbrezillon> I thought about using a BO

13:48 <bbrezillon> this way I can reuse the panfrost_mmu_map() code

13:50 <robher> bbrezillon: Is there a question in there? That sounds fine to me.

13:50 <bbrezillon> is that the correct way to do it (note that I'll be accessing the buffer in kernel space, so I'll have to call drm_gem_vmap() after the GPU is done transferring the data to the memory)

13:50 <bbrezillon> ?

13:51 <robher> Yes, that should work.

13:52 <robher> However, we'll need to make create_bo available outside the ioctl.

13:52 <bbrezillon> yep

13:53 <bbrezillon> actually, I'll be using drm_gem_shmem_create() and not drm_gem_shmem_create_with_handle(), so I'm not sure there's a lot to share here

13:57 <robher> bbrezillon: it's the pinning of pages and dma mapping part that you need.

14:01 <bbrezillon> robher: yep, but AFAICT that part is not done in the create_bo ioctl

14:06 <robher> bbrezillon: ah yes, I did move that into panfrost_mmu_map.

14:08 shenghaoyang has joined #panfrost

14:13 <tomeu> kbase's pm code is a damned maze

14:13 <tomeu> I'm just going to assume we need to power up the cores again

14:13 <tomeu> as there's a comment that suggests it

14:18 <alyssa_> HdkR: hw can do both sat and pos, NIR can only do sat

14:19 <HdkR> alyssa_: Makes sense

14:20 <alyssa_> tomeu: Mm, yeah, it's trivial to do a clear

14:20 <alyssa_> Since that's _just_ a FRAGMENT job and a fragment MFBD

14:21 <alyssa_> bbrezillon: Oo, patches :p

14:21 <alyssa_> tomeu: To be fair, so is mesa.. :P

14:22 <tomeu> alyssa_: do you have anything that could guide me in crafting such a simple clear job?

14:23 <tomeu> could be very useful in igt

14:25 <alyssa_> tomeu: Yup!

14:25 <alyssa_> Run any app really with PANTRACE_BASE=[a folder]

14:25 <alyssa_> And then pandecode [that folder]

14:26 <alyssa_> SET_VALUE/VERTEX/TILER jobs are used for draws

14:26 <alyssa_> FRAGMENT jobs are used for, well, "run the fragment shaders and writeback the framebuffer"

14:26 <alyssa_> If you run a FRAGMENT job by itself (no shaders, no nothing), it'll just clear the framebuffer to the provided colour

14:30 <tomeu> awesome

14:30 <alyssa_> From that trace, you'll need to allocate a bunch of BOs to fill out the addresses

14:30 <alyssa_> (scratchpad, tiler_scratch_*, tiler_heap_*, framebuffer)

14:30 <alyssa_> See pan_context for how that works

14:31 <tomeu> ok, I think I could get something quickly based on that info

14:31 <alyssa_> bifrost_framebuffer and bifrost_render_target need to be sequential in memory (see the UPLOAD routine in pan_mfbd which I emailed last night)

14:31 <tomeu> i915 seems to expect a hang when submitting a zeroed buffer, hopefully it will also work here

14:31 <alyssa_> And the "framebuffer_p" pointer is to the GPU address of the beginning of those two

14:32 <alyssa_> tomeu: Pretty sure that'll DATA_INVALID_FAULT but no hang

14:32 <alyssa_> tomeu: I've only really seen a hang messing with that RT format

14:32 <alyssa_> Which incidentally you can do from the FRAGMENT-only snippet :)

14:32 <bbrezillon> robher: all BOs are Non-cached, both from the CPU and GPU perspective, right?

14:33 <alyssa_> bbrezillon: GPU caches a lot internally; we have no involvement in that

14:34 <bbrezillon> don't we have a way to mark a region non-cacheable in the MMU table?

14:34 <alyssa_> bbrezillon: I think so but you're not supposed to use it afaict :p

14:35 <bbrezillon> (I mean MMU on the GPU side, so IOMMU in that case)

14:35 <bbrezillon> ok, then I need a way to flush the GPU cache

14:35 <alyssa_> I recall UNCACHED_GPU has a big red flag "Only works on aarch64, won't work for a lot of buffers depending on what the GPU does with it, read the architecture documentation [:V]" soo

14:35 <alyssa_> bbrezillon: GPU cache gets flushed automagically on job submit / end IIRC

14:36 <narmstrong> first run of drm driver on S912 (T820) : [ 2.521453] panfrost d00c0000.gpu: clock rate = 666666666

14:36 <narmstrong> [ 2.530629] panfrost d00c0000.gpu: features: 00000000,101e76ff, issues: 00000000,24040400

14:36 <narmstrong> [ 2.523028] panfrost d00c0000.gpu: mali-t820 id 0x820 major 0x1 minor 0x0 status 0x0

14:36 <narmstrong> [ 2.550462] panfrost d00c0000.gpu: shader_present=0x7

14:36 <narmstrong> [ 2.538736] panfrost d00c0000.gpu: Features: L2:0x07110206 Shader:0x00000000 Tiler:0x00000809 Mem:0x1 MMU:0x00002821 AS:0xff JS:0x7

14:36 <narmstrong> [ 2.555537] panfrost d00c0000.gpu: gpu error irq state=601 status=21

14:36 <narmstrong> [ 2.556487] [drm] Initialized panfrost 1.0.0 20180908 for d00c0000.gpu on minor 0

14:37 <alyssa_> bbrezillon: JS_CONFIG_START_FLUSH_CLEAN_INVALIDATE and JS_CONFIG_END_FLUSH_CLEAN_INVALIDATE

14:37 <alyssa_> So when the job finishes (rendering done, etc), caches are flushed out from GPU->CPU

14:38 <bbrezillon> unfortunately counters dump takes a different path => GPU_COMMAND_PRFCNT_SAMPLE

14:38 * alyssa_ blinks

14:38 <bbrezillon> so I'm not sure I can rely on the JS auto-flush

14:38 <alyssa_> bbrezillon: There *is* a sync ioctl on kbase

14:39 <alyssa_> But we don't use it for anything since the autoflush has been fine so far

14:39 <bbrezillon> I'll have a look

14:40 * alyssa_ poof

14:42 <robher> bbrezillon: currently everything is non-cached (write-combine in Linux terms). The kbase driver is inner WT by default.

14:43 <bbrezillon> I remember seeing a comment stating that in the DRM driver, couldn't find it again

14:43 <bbrezillon> alyssa_: looks like they try to map the counter dump buf GPU-uncached when possible https://gitlab.freedesktop.org/panfrost/mali_kbase/blob/master/driver/product/kernel/drivers/gpu/arm/midgard/mali_kbase_vinstr.c#L377

14:44 <robher> bbrezillon: though I'm not sure what is 'default' and what isn't as several caching modes are supported. We only support what the ARM SMMU supports.

14:45 <bbrezillon> robher: and how is GPU-side cache maintenance supposed to be done in Linux?

14:46 <bbrezillon> is it the responsibility of the iommu?

14:46 <robher> bbrezillon: every job submit does a cache flush.

14:47 <robher> which may be redundant ATM.

14:48 <bbrezillon> so, that's the auto-flush alyssa_ was mentioning, right?

14:50 <bbrezillon> okay, I see it in panfrost_job.c, but I fear I need something else for the PRFCNT_SAMPLE operation

15:02 <bbrezillon> alyssa_: looks like they do an explicit clean+invalidate when GPU-uncached mode is not supported https://gitlab.freedesktop.org/panfrost/mali_kbase/blob/master/driver/product/kernel/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_instr_backend.c#L380

15:02 <bbrezillon> so that's probably something I'll have to take care of

15:06 <hanetzer> blep

15:06 <hanetzer> sup kernel dude :)

15:22 <urjaman> i just made a small arch rootfs on a usb stick, then realized i wanted it on a different usb stick, and then ended up trying to clone the destination disk over the source

15:23 <hanetzer> oof

15:23 <urjaman> ah well, nothing important there ... let's pacstrap another rootfs *sigh*

15:24 <urjaman> i'm happy that the sd* namespace is different from the mmcblk :P

15:24 <hanetzer> yeah

15:24 <hanetzer> that is rather helpful in avoiding screwups :P

15:32 <Lyude> narmstrong: nice!

15:33 jernej has joined #panfrost

16:00 pH5 has quit [Quit: bye]

16:20 pH5 has joined #panfrost

16:40 <narmstrong> Lyude: but some work needed !

16:40 <narmstrong> [ 131.452419] panfrost d00c0000.gpu: js fault, js=1, status=JOB_CONFIG_FAULT, head=0x2e00, tail=0x2e00

16:40 <narmstrong> [ 131.966860] panfrost d00c0000.gpu: gpu sched timeout, js=1, status=0x40, head=0x2e00, tail=0x2e00

16:40 <narmstrong> [ 131.970407] panfrost d00c0000.gpu: js fault, js=0, status=JOB_CONFIG_FAULT, head=0x2f80, tail=0x2f80

16:40 <narmstrong> [ 132.478827] panfrost d00c0000.gpu: gpu sched timeout, js=0, status=0x40, head=0x2f80, tail=0x2f80

16:40 <narmstrong> [ 131.262878] panfrost d00c0000.gpu: gpu sched timeout, js=0, status=0x40, head=0x400b00, tail=0x400b00

16:40 <narmstrong> [ 132.665499] panfrost d00c0000.gpu: AS_ACTIVE bit stuck
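
A small sketch for tallying fault lines like the ones pasted above per job slot. The regular expression is written against the line shapes seen in this paste only; it is an assumption about the format, not something taken from the panfrost driver, and `parse_fault` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the "js fault" and "gpu sched timeout" lines as they appear in
# this log. Assumption: other kernel versions may print different text.
LINE_RE = re.compile(
    r"panfrost \S+: (?P<kind>js fault|gpu sched timeout), "
    r"js=(?P<js>\d+), status=(?P<status>\w+), "
    r"head=(?P<head>0x[0-9a-f]+), tail=(?P<tail>0x[0-9a-f]+)"
)


def parse_fault(line):
    """Return a dict for a fault/timeout line, or None for any other line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "kind": m["kind"],
        "js": int(m["js"]),
        "status": m["status"],
        "head": int(m["head"], 16),
        "tail": int(m["tail"], 16),
    }


# Three lines copied from the paste above.
SAMPLE = [
    "[ 131.452419] panfrost d00c0000.gpu: js fault, js=1, "
    "status=JOB_CONFIG_FAULT, head=0x2e00, tail=0x2e00",
    "[ 131.966860] panfrost d00c0000.gpu: gpu sched timeout, js=1, "
    "status=0x40, head=0x2e00, tail=0x2e00",
    "[ 132.665499] panfrost d00c0000.gpu: AS_ACTIVE bit stuck",
]

events = [e for e in map(parse_fault, SAMPLE) if e is not None]
per_slot = Counter((e["js"], e["kind"]) for e in events)
for (slot, kind), count in sorted(per_slot.items()):
    print(f"js={slot} {kind}: {count}")
```

Grouping by slot makes it easier to see that each fault in the paste is followed by a scheduler timeout on the same slot with the same head and tail.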

16:41 <narmstrong> seems the DRM will also need the PWR_KEY and PWR_OVERRIDE hacks...

17:12 <narmstrong> tomeu: robher: how do you manage the panfrost drm dev ? on the fd-o gitlab or should we start on the dri-devel ML ?

17:14 <narmstrong> got rid of AS_ACTIVE bit stuck, but still have the sched timeout and js fault...

17:15 <robher> A MR for panfrost/linux or the ML is fine. Once upstream it will be ML.

17:15 <robher> narmstrong: ^^^

17:15 <narmstrong> robher: oki, will push a MR once I figure out why I have all these faults ;-)

17:16 <robher> narmstrong: so far, it's just been the 2 of us...

17:30 TheCycoONE has quit [Quit: ZNC 1.7.2 - https://znc.in]

17:33 TheCycoONE has joined #panfrost

17:47 shenghaoyang has quit [Read error: Connection reset by peer]

17:55 <narmstrong> well, no idea why I get these js faults...

18:02 shenghaoyang has joined #panfrost

18:10 stikonas has joined #panfrost

18:39 shenghaoyang has quit [Remote host closed the connection]

18:48 <robher> narmstrong: you are up to date with stuff I pushed in yesterday? There were some things hardcoded to t860 r2+ that I fixed to match the kbase driver (hopefully).

18:49 <robher> narmstrong: and does the kbase driver work?

18:49 <narmstrong> robher: i took the last panfrost-5.0 branch on the panfrost Linux repo and the mainline-driver from tomeu's Mesa repo

18:50 <narmstrong> robher: mali kbase works with some tweaks, seems the automatic power management does not work and soft reset neither, tried to add the same tweak to the drm driver but seems something is still missing

18:51 <narmstrong> The mali_kbase of the panfrost gitlab has all the tweaks to work

19:22 jernej has quit [Remote host closed the connection]

20:54 gtucker has quit [Ping timeout: 259 seconds]

20:55 tomeu has quit [Ping timeout: 264 seconds]

22:04 <alyssa_> robher: Just read through your new kernel changes, looks great! :)

22:07 <robher> alyssa_: let me know if I missed any review comments.

23:16 pH5 has quit [Quit: bye]