#panfrost on 2018-12-12 — irc logs at freenode.irclog.whitequark.org

2018-10-28 05:31 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - Discord Discard

02:23 chewitt has joined #panfrost

02:49 rhyskidd has quit [Quit: rhyskidd]

02:49 rhyskidd has joined #panfrost

03:55 <hanetzer> FINAL TOMORROW!

05:46 <tomeu> daniels: I put delays before and after submitting job batches and saw no differences, but I guess I should try as you say because it would be telling and quick

05:46 <tomeu> alyssa: those are good suggestions, I have to look at panwrap at some point but was trying to postpone the time needed to set up the blob for as long as possible

06:14 <tomeu> alyssa: btw, I'm getting error detected from slot 0, job status 0x00000058 (DATA_INVALID_FAULT) when running kmscube, but not with weston, which shows similar artifacts

06:14 <tomeu> or I think, will check next time I hack on panfrost

06:37 BenG83 has quit [Ping timeout: 240 seconds]

07:02 <cyrozap> I've been reading through the channel history since I haven't really been paying attention the last month or so, and I just stumbled across this gem: https://freenode.irclog.whitequark.org/panfrost/2018-11-09#23455749;

07:03 <cyrozap> That gave me a good chuckle :P

07:37 anarsoul|2 has quit [Ping timeout: 272 seconds]

08:02 davidlt has joined #panfrost

08:52 pH5 has quit [Quit: bye]

09:14 TheKit has quit [Remote host closed the connection]

09:14 rhyskidd has quit [Quit: rhyskidd]

09:15 rhyskidd has joined #panfrost

09:35 rhyskidd has quit [Quit: rhyskidd]

09:35 rhyskidd has joined #panfrost

09:39 TheKit has joined #panfrost

10:10 rhyskidd has quit [Quit: rhyskidd]

10:11 rhyskidd has joined #panfrost

10:16 rhyskidd has quit [Quit: rhyskidd]

10:16 rhyskidd has joined #panfrost

10:26 rhyskidd has quit [Quit: rhyskidd]

10:27 rhyskidd has joined #panfrost

10:36 <narmstrong> alyssa: I also have error detected from slot 0, job status 0x00000058 (DATA_INVALID_FAULT), but output is correct

10:44 BenG83 has joined #panfrost

10:46 <narmstrong> tomeu: I have near-zero-artifact weston now, but it blocks after a few seconds

10:46 <tomeu> narmstrong: sounds great, what did you have to do?

10:46 <narmstrong> same for kmscube, it stops after a few seconds

10:47 <tomeu> will give another look at event handling when I get back to panfrost

10:47 <davidlt> tomeu, you mean rendering or working weston?

10:48 <narmstrong> tomeu: (It may be a shame) I backported all the panfrost patches on top of the lima-18.3 branch and took your "Don't emit framebuffer at context creation time", "Use struct base_jd_event_v2 when handling events" and "increment correctly kbase_ioctl_job_submit.addr array pointer" fixes

10:48 <narmstrong> davidlt: both

10:48 <davidlt> nice, that's nice progress even for 2 seconds :)

10:49 <narmstrong> tomeu: I created a monster, but I will be able to run Kodi with it (Lima renders perfectly with kodi)

10:49 <tomeu> I'm a bit confused, what has lima to do with this?

10:49 <narmstrong> davidlt: sure ! It's a great move, considering we don't have a binary libMali for the t820

10:50 <narmstrong> tomeu: nothing, but it doesn't use mesa master branch, but the 18.3 release, it may be related

10:52 rhyskidd has quit [Quit: rhyskidd]

10:52 rhyskidd has joined #panfrost

10:53 <narmstrong> davidlt: but I can't run es2gears_wayland or glmark2-es2-wayland, I get an error initializing EGL

10:53 <tomeu> ok, then I guess I won't be able to directly help with it

10:53 <tomeu> narmstrong: where is the process blocked?

10:54 <narmstrong> tomeu: no idea, I need to figure that out, when blocked weston can't be killed and stays defunct, but it's the same behaviour as with kmscube, I see a bunch of `TODO panfrost_flush_resource` then nothing

10:55 <tomeu> ok, if it's in force_flush_fragment, then we probably need to expand it to handle more events, like I did with BASE_JD_EVENT_JOB_INVALID

11:18 <narmstrong> ok, it's the `error detected from slot 0, job status 0x00000058 (DATA_INVALID_FAULT)` that causes the blocking

11:18 <narmstrong> And I can't even stop GDB in this case
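
[Editor's note: the kernel message quoted above pairs the raw job status 0x00000058 with the name DATA_INVALID_FAULT. A minimal sketch of that kind of lookup follows, in case it helps when grepping logs. Only the 0x58 entry comes from this log; the other three entries are recalled from the mali_kbase exception-name table and should be checked against your own kernel tree. The function names `job_status_name` and `job_status_matches` are hypothetical.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Map a raw job status code to a readable name (illustrative subset). */
const char *job_status_name(uint32_t status)
{
    switch (status) {
    case 0x58: return "DATA_INVALID_FAULT"; /* pairing shown in the log above */
    case 0x59: return "TILE_RANGE_FAULT";   /* assumption: recalled from mali_kbase */
    case 0x5A: return "ADDR_RANGE_FAULT";   /* assumption: recalled from mali_kbase */
    case 0x60: return "OUT_OF_MEMORY";      /* assumption: recalled from mali_kbase */
    default:   return "UNKNOWN";
    }
}

/* Convenience check, e.g. to bail out of a wait loop on one specific fault. */
bool job_status_matches(uint32_t status, const char *name)
{
    assert(name != NULL);
    return strcmp(job_status_name(status), name) == 0;
}
```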

11:21 <tomeu> if you attach to the blocked process with gdb -p <pid>, you cannot get a backtrace?

11:27 rhyskidd has quit [Quit: rhyskidd]

11:28 rhyskidd has joined #panfrost

11:33 rhyskidd has quit [Quit: rhyskidd]

11:34 rhyskidd has joined #panfrost

11:39 <chewitt> tomeu: the backport to 18.3 is largely my request .. we are proving/testing lima on other hardware from the same OS build-system so the lima/panfrost patch-sets need to coexist

11:40 <chewitt> using 'stable' releases as the base is also lots easier when working with large patch-sets

11:59 rhyskidd has quit [Quit: rhyskidd]

11:59 rhyskidd has joined #panfrost

12:19 raster has joined #panfrost

12:31 <narmstrong> tomeu: no it’s stuck in the kernel, it’s a wait_event_killable... I’m trying to switch to an interruptible wait so I can backtrace

12:32 <narmstrong> But the gpu fault is the main issue ! And I have really no idea how to solve it

12:33 <narmstrong> I’m not sure Alyssa worked a lot on this kbase driver version, so maybe it was fixed in the older version

12:36 <narmstrong> I got it ! https://www.irccloud.com/pastebin/r9pOFzFp/

12:39 rhyskidd has quit [Quit: rhyskidd]

12:39 rhyskidd has joined #panfrost

12:41 raster has quit [Ping timeout: 268 seconds]

12:42 raster has joined #panfrost

13:19 rhyskidd has quit [Quit: rhyskidd]

13:20 rhyskidd has joined #panfrost

13:25 rhyskidd has quit [Quit: rhyskidd]

13:25 rhyskidd has joined #panfrost

13:31 BenG83 has quit [Quit: Leaving]

13:32 BenG83 has joined #panfrost

13:35 rhyskidd has quit [Quit: rhyskidd]

13:35 rhyskidd has joined #panfrost

13:55 rhyskidd has quit [Quit: rhyskidd]

13:56 rhyskidd has joined #panfrost

14:08 <narmstrong> tomeu: mmind00: does the rockchip upstream driver handle implicit fencing in the drm driver ?

14:09 <tomeu> I was assuming this is being taken care of in drm core helpers, but also wanted to check

14:09 <tomeu> I'm not sure that could explain the artifacts, as randomly inserting sleeps didn't seem to make a difference

14:10 <tomeu> I was planning to look at all the ioctls and soft jobs to see if there's anything that we should be obviously doing in flush or so

14:10 <tomeu> but have a bunch of stuff to solve before I can go back to panfrost

14:33 <narmstrong> tomeu: mmind00: drm_gem_fb_prepare_fb doesn't seem to be called in the planes

14:33 <narmstrong> tomeu: try this https://www.irccloud.com/pastebin/LZAyxwvZ/

14:35 <narmstrong> Lyude: alyssa: no idea how to debug the GPU DATA_INVALID_FAULT...

14:46 <mmind00> narmstrong: you mean this one https://patchwork.kernel.org/patch/10706103/

14:46 <mmind00> narmstrong: looks like I still need to pester someone for a Reviewed-by, before I can apply that

14:48 <mmind00> narmstrong: and yep, at least on the lima side this made output a bit nicer

14:48 <narmstrong> mmind00: oh I missed it

14:48 <narmstrong> I can r-b it

14:48 <narmstrong> mmind00: can you apply it ?

14:49 <narmstrong> now you can :shipit:

15:06 rhyskidd has quit [Quit: rhyskidd]

15:06 rhyskidd has joined #panfrost

15:31 rhyskidd has quit [Quit: rhyskidd]

15:32 rhyskidd has joined #panfrost

15:39 <Lyude> narmstrong: not 100% sure myself either, I'm thinking a place to start might be getting some replays and start trying to disable various steps to see which one causes the error to go away

15:39 <Lyude> have you used panwrap before?

15:42 <narmstrong> Lyude: not at all

15:43 <Lyude> alyssa: btw, do you know if it's possible that different Mali GPUs in the t8xx generation might not all have the same fragment shader core and thus, some might only support SFBDs?

15:43 <Lyude> I've been meaning to check the Mali GPU props to make sure we aren't hitting some silly issue like that

15:43 <Lyude> narmstrong: so currently the mesa fork we have builds a libpanwrap.so

15:44 <Lyude> If you LD_PRELOAD it and run something with panfrost, it will generate a "replay" which essentially consists of an autogenerated C program based off the API interactions between kernel and panfrost

15:45 <Lyude> If you compile and run that program, it's supposed to be identical to the actual demo you ran, but since it's all C code you can modify the replay, recompile and mess around with it without needing to change panfrost

15:46 <narmstrong> Wow super cool !

15:47 <Lyude> Mhm, it's a neat idea Alyssa got from Lima, I think they used to have something similar

16:05 <Lyude> narmstrong: something I forgot to mention: I haven't tested replays with the kapi update

16:05 <Lyude> So some stuff in that might need fixing

16:06 <Lyude> It generates code, but I haven't tried compiling any of it

16:08 <alyssa> tomeu: panwrap is not just for the blob, it's also for tracing our own driver :P

16:09 <Lyude> Until we have our own kmod of course

16:10 <alyssa> Lyude: "btw, do you know if it's possible that different Mali GPUs in the t8xx generation might not all have the same fragment shader core and thus, some might only support SFBDs" This is highly unlikely

16:10 <alyssa> Since SFBD/MFBD support corresponds to support for MRT, a feature mandated in ES 3.2

16:11 <Lyude> alyssa: eventually we should probably split it back out again so we can use it the way freedreno uses it, in addition to just having replays: e.g. allow it to translate whatever kapi calls we make to our (eventual) kernel driver into mali_kbase calls, to make new hw bringup easier

16:11 <Lyude> alyssa: ahhh

16:11 <Lyude> Good to know :)

16:11 <alyssa> AFAIK, all t8xx is labeled as 3.2 compliant

16:12 <Lyude> (also that panwrap split won't happen for a pretty long while, but figured I'd just put that out there)

16:15 <Lyude> alyssa: does that make sense btw? I know you wanted to also figure out how to get the disasm integrated into mesa

16:15 <alyssa> Lyude: Ehh

16:15 <alyssa> The reason I had it split out is so we're sharing headers

16:15 <alyssa> And every build of mesa also builds panwrap

16:15 <Lyude> alyssa: mhm, we could just install the headers alongside mesa

16:16 <alyssa> Which means I can guarantee panwrap is still working at every given commit and not out of sync (which is what was happening like crazy)

16:16 <Lyude> btw

16:16 <Lyude> This wouldn't happen until we have a fully functional kernel driver

16:16 <alyssa> Gotcha

16:16 <alyssa> Lyude: Also, PM?

16:32 rhyskidd has quit [Quit: rhyskidd]

16:32 rhyskidd has joined #panfrost

16:33 <narmstrong> Lyude: I’ll have a run, thanks !

17:02 rhyskidd has quit [Quit: rhyskidd]

17:03 rhyskidd has joined #panfrost

17:49 anarsoul|2 has joined #panfrost

17:53 rhyskidd has quit [Quit: rhyskidd]

17:54 rhyskidd has joined #panfrost

18:46 pH5 has joined #panfrost

19:30 cwabbott_ has joined #panfrost

19:34 cwabbott has quit [Ping timeout: 250 seconds]

19:34 cwabbott_ is now known as cwabbott

19:59 robert_ancell has joined #panfrost

20:01 <narmstrong> Lyude: currently trying, but dummy question, where does it "generate" the code ?

20:02 <Lyude> narmstrong: stdout

20:02 <narmstrong> hmm, it outputs only shaders

20:02 <Lyude> narmstrong: mind showing me?

20:03 <narmstrong> http://termbin.com/ft3f

20:04 <narmstrong> (oops, I left a debug print)

20:05 <Lyude> yeah it's definitely not loaded, and you made sure to LD_PRELOAD it?

20:05 <narmstrong> Yep

20:05 <Lyude> libpanwrap.so I mean

20:05 <narmstrong> yep

20:05 <narmstrong> `$ LD_PRELOAD=/usr/local/lib/aarch64-linux-gnu/libpanwrap.so ./kmscube`

20:06 <narmstrong> `LD_PRELOAD=/usr/local/lib/aarch64-linux-gnu/libpanwrap.so ldd` shows me libpanwrap in the list

20:06 <Lyude> might be that panwrap needs to be updated to detect the new ioctls

20:06 <narmstrong> let me retry

20:16 unoccupied has quit [Quit: WeeChat 2.2]

20:17 <Lyude> narmstrong: if what you're trying now doesn't work I bet we just need to update panwrap a bit

20:18 <narmstrong> Lyude: yeah maybe !

20:19 <Lyude> narmstrong: the way it starts tracing is basically by overriding the ioctl() functions through the LD_PRELOAD, then checking what the arguments/file names being passed through to it are
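
[Editor's note: the mechanism Lyude describes can be sketched roughly as below. This is not panwrap's actual code: the names `note_open`, `should_trace` and `traced_fd` are hypothetical, and the device path /dev/mali0 is taken from later in this log. In a real LD_PRELOAD library these helpers would be called from wrappers around open() and ioctl() that forward to the real functions; only the bookkeeping is shown here.]

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* fd of the GPU device node, or -1 while none has been opened (hypothetical). */
int traced_fd = -1;

/* Call from an open() wrapper, after the real open() returned fd. */
void note_open(const char *path, int fd)
{
    assert(path != NULL);
    if (fd >= 0 && strcmp(path, "/dev/mali0") == 0)
        traced_fd = fd;
}

/* Call from an ioctl() wrapper to decide whether this call gets logged. */
bool should_trace(int fd)
{
    return traced_fd >= 0 && fd == traced_fd;
}
```

[With this scheme, any ioctl issued on a different fd number is silently ignored, even if that fd refers to the same open device.]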

20:20 <Lyude> narmstrong: https://gitlab.freedesktop.org/lyudess/panfrost-mesa/blob/master/src/gallium/drivers/panfrost/panwrap/panwrap-syscall.c#L349

20:20 <Lyude> or rather, that's an old branch

20:21 <narmstrong> hmm yeah I have a different code

20:21 <Lyude> yeah, just go to that spot in your own branch and see what it's doing there

20:21 <narmstrong> yep, I'll have a look there, thx

20:22 anarsoul|2 has quit [Remote host closed the connection]

20:22 <Lyude> narmstrong: np, let me know if I can help anymore. probably won't be able to look at this until tonight/more likely the weekend

20:27 <narmstrong> fun stuff, mali_fd is not detected correctly

20:28 <Lyude> ahhh, lol

20:28 <narmstrong> maybe because I'm running aarch64 ?

20:28 <Lyude> that would explain it

20:28 <Lyude> narmstrong: no, most of that is tested only on aarch64

20:28 <Lyude> narmstrong: sure it has permissions to open?

20:28 <narmstrong> in fact, I get the open for /dev/mali0 but it never catches any ioctl for this fd

20:29 <Lyude> narmstrong: mind linking me to your branch again?

20:30 <narmstrong> https://gitlab.freedesktop.org/narmstrong/panfrost-mesa/tree/winsys-meson

20:30 anarsoul|2 has joined #panfrost

20:30 <narmstrong> I get ioctls for fds not caught by panwrap

20:31 <narmstrong> i get a mali0 open without any ioctls after, and I don't get drm either

20:31 <Lyude> dunno if I'd expect drm

20:31 <narmstrong> well, I see the drm ioctls !

20:32 <Lyude> narmstrong: oh, huh

20:32 <Lyude> (will look at your branch in just a moment)

20:34 <narmstrong> If I force the mali FD I get some code :-)

20:34 <Lyude> oh cool!

20:35 <narmstrong> Ah ah, someone does a fcntl(5, F_DUPFD_CLOEXEC, 3)

20:35 <Lyude> oh, lol

20:36 <Lyude> we will need to override fcntl then
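
[Editor's note: fcntl(fd, F_DUPFD_CLOEXEC, min) returns a new descriptor, numbered min or higher, that refers to the same open file. A tracer that only remembers the fd returned by open() will therefore miss ioctls issued on the duplicate, which matches what narmstrong saw. A minimal sketch of the extra bookkeeping an fcntl() wrapper would need is below. All names (`fd_trace`, `fd_is_traced`, `note_fcntl`, `MAX_TRACED`) are hypothetical and this is not the fix that went into panwrap.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

#define MAX_TRACED 8

/* Small fixed-size set of descriptors known to refer to the GPU device. */
int traced[MAX_TRACED];
int n_traced;

bool fd_is_traced(int fd)
{
    for (int i = 0; i < n_traced; i++)
        if (traced[i] == fd)
            return true;
    return false;
}

void fd_trace(int fd)
{
    assert(n_traced <= MAX_TRACED);
    if (fd >= 0 && !fd_is_traced(fd) && n_traced < MAX_TRACED)
        traced[n_traced++] = fd;
}

/* Call from an fcntl() wrapper, after the real fcntl() returned ret. */
void note_fcntl(int fd, int cmd, int ret)
{
    bool is_dup = (cmd == F_DUPFD || cmd == F_DUPFD_CLOEXEC);

    /* On success a dup command returns the new fd: trace it as well. */
    if (is_dup && ret >= 0 && fd_is_traced(fd))
        fd_trace(ret);
}
```

[A complete version would presumably also cover dup(), dup2() and close(), which can create or retire descriptors in the same way.]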

20:36 <Lyude> a warning btw: debugging panwrap can get weird sometimes

20:36 <narmstrong> yeah, LD_PRELOAD can get weird

20:39 <narmstrong> http://termbin.com/xrhr the panwrap up to when the ioctl locks

20:41 <narmstrong> uint32_t ubuf_0_165[] = {

20:41 <narmstrong> 0x0, 0x0, 0x0, 0x0,

20:41 <narmstrong> }; doesn't look very good

20:47 <Lyude> narmstrong: does the rest of panwrap seem to work?

20:47 <Lyude> erm, s/rest of//

20:51 <narmstrong> By compiling it ? Haven’t tried

20:51 <narmstrong> But looking at the output, the last one, which generates the gpu fault, looks bad

20:53 raster has quit [Quit: Gettin' stinky!]

20:56 <Lyude> ahh

20:57 <Lyude> narmstrong: btw: you can also view the actual AS of the job if you've got the mali_kbase debugging stuff enabled

20:58 <narmstrong> Lyude: I tried to enable the mali_kbase debug but failed miserably, can you point me to it ?

20:59 <Lyude> that's why I added ./config :), but now I need to refresh myself on what options enable this, sec

21:00 <narmstrong> Yep it’s very handy ! I enabled MALI_DEBUG but it did nothing...

21:02 <Lyude> narmstrong: feel free to change the platform name if you need https://paste.fedoraproject.org/paste/T3ALG84mZZJrf0xYqLEQUA

21:02 <Lyude> iirc that's what I had which enabled it

21:03 BenG83 has quit [Remote host closed the connection]

21:03 BenG83 has joined #panfrost

21:03 <narmstrong> Indeed I missed a few options

21:04 TheKit has quit [Ping timeout: 252 seconds]

21:07 TheKit has joined #panfrost

23:10 pH5 has quit [Quit: -_-]

23:17 <alyssa> narmstrong: Lyude: FWIW, "compiling replays and rerunning" doesn't work anymore and is no longer supported

23:17 <alyssa> But the output is still super for debugging

23:18 <Lyude> alyssa: i already mentioned it's probably broken

23:18 <alyssa> Not probably, definitely :p

23:18 <Lyude> didn't know it was unsupported tohugh

23:18 <Lyude> *though

23:19 <alyssa> After getting the first tri on the screen it mostly loses its usefulness and becomes a major maintenance burden imo

23:20 <alyssa> Reading the panwrap now

23:20 <alyssa> mali_attr is kind of suspect

23:22 <alyssa> The lack of a fragment shader core is also suspect but it's possible panwrap isn't printing them right anymore

23:24 <alyssa> Wondering if it's an ordering type/interjob dep/etc issue instead tho

23:24 <alyssa> Wait, that doesn't make a ton of sense, uh

23:43 <Lyude> alyssa: panwrap at least should have been updated alongside everything else with the uapi change, but it might be worth looking at the diffs

23:43 <alyssa> Mm