#panfrost on 2018-12-14 — irc logs at freenode.irclog.whitequark.org

2018-10-28 05:31 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - Discord Discard

00:04 <alyssa> anarsoul|2: Since rn when we send invalid cmdstreams we get an opaque DATA_INVALID_FAULT and trying to debug from there is a crapshoot

00:05 <Lyude> well keep in mind most of the people trying to debug this are still familiarizing themselves with the mesa portions of panfrost too :p

00:05 <Lyude> it's quite likely that's another reason it's taking a while

00:06 <alyssa> Hm?

00:06 <alyssa> I meant for myself too?

00:06 <anarsoul|2> alyssa: you probably need to dump cmdstream, not to validate it

00:06 <Lyude> alyssa: oh

00:06 <Lyude> I didn't know you were debugging this as well

00:06 <alyssa> Lyude: Trying to? :

00:06 <alyssa> :P

00:06 <alyssa> anarsoul|2: Dumps are massive and it's rarely obvious looking at them what's wrong

00:06 <anarsoul|2> hm

00:06 <Lyude> we could try to get the actual kernel replay in mali_kbase working

00:07 <Lyude> so then we could actually modify the command stream and play around with it, then isolate where the issue is

00:07 <alyssa> ...That's not what it's for

00:07 <Lyude> it isn't? o_O

00:07 <alyssa> It's a very specific errata workaround, not a debug feature :p

00:08 <Lyude> really? wtf

00:08 <Lyude> how exactly does that erratum work

00:10 <alyssa> Lyude: IIRC, hardware race condition in the tiler that causes geometry stuff to fail nondeterministically, so they try resubmitting the job a bunch of times until it goes through

00:11 <Lyude> LOL

00:11 <Lyude> i mean it's not terribly surprising

00:11 <Lyude> but hw bugs like that always make me laugh

00:11 <alyssa> yupyup

00:12 <Lyude> reminds me of the powergating issues with nvidia tesla (the gpu generation, not the compute workhorses)

00:12 <Lyude> where with power gating enabled some bits of firmware in vram would randomly get corrupted, so they solved it by teaching the nvidia driver to periodically reupload all of the firmware whenever power gating was on

00:13 <alyssa> Nice

00:21 <alyssa> Trying to replay an apitrace for mpv: "error: waffle_context_create failed"

00:21 <alyssa> Bwah?

00:21 <alyssa> (Tracing/replaying es2gears works)

00:22 <alyssa> Oh, nvm, it's only when I'm trying to replay with panfrost that it bugs, I can replay with softpipe

00:23 paulk-leonov has quit [Ping timeout: 268 seconds]

00:27 paulk-leonov has joined #panfrost

00:31 paulk-leonov has quit [Ping timeout: 252 seconds]

00:43 klaxa has quit [Ping timeout: 250 seconds]

00:52 paulk-leonov has joined #panfrost

00:56 paulk-leonov has quit [Ping timeout: 240 seconds]

01:06 paulk-leonov has joined #panfrost

01:29 NeuroScr has quit [Ping timeout: 240 seconds]

01:30 NeuroScr has joined #panfrost

02:18 _whitelogger has joined #panfrost

02:22 anarsoul|2 has quit [Ping timeout: 240 seconds]

02:23 urjaman has quit [Ping timeout: 250 seconds]

02:29 urjaman has joined #panfrost

02:58 TheCycoONE has joined #panfrost

03:07 TheCycoONE has quit [Ping timeout: 250 seconds]

03:13 TheCycoONE has joined #panfrost

03:50 <HdkR> Lyude: I had to purchase that Razer Blade Stealth 13.3" 4k with the MX150. So I'll just come to you with problems right? :)

04:06 <alyssa> So there's no hw protection against infinite loops in branching shaders

04:07 <alyssa> It doesn't seem to cause any problems (you can just quit the app and restart) but I was rather expecting a timeout or something :P

04:07 * alyssa remembers this for when she does raytracing on a Mali just for fun

04:07 <HdkR> alyssa: You would need to have a watchdog outside the GPU

04:08 <HdkR> Only D3D9 had HW protection against infinite loops

04:08 <alyssa> I s'pose

04:08 <alyssa> Well, I guess loops work now.

04:08 <alyssa> Just not the break part

04:09 <HdkR> er, maybe D3D9 didn't protect against it in all cases either... It only made recursive calls bottom out at 32 deep..

04:09 <alyssa> Hmmm

04:23 mifritscher has quit [Ping timeout: 252 seconds]

04:33 mifritscher has joined #panfrost

05:05 anarsoul|2 has joined #panfrost

05:19 <alyssa> Almost have loops going

06:18 <Lyude> HdkR: it won't work!

06:18 <Lyude> I told you :s

06:18 <HdkR> =O!

06:18 <HdkR> The MX150 or the laptop entirely? :P

06:19 <Lyude> The laptop will not be able to shut off the GPU and nouveau will hang on the GPU

06:20 <HdkR> neat

06:20 <HdkR> Am I not able to just have the GPU not turn on? :P

06:20 <Lyude> Very possibly not

06:21 <HdkR> Interesting. I'll have to mess with it in the days before the end of the return period

06:22 <Lyude> It's due to firmware issues Nvidia hasn't wanted to fix

06:22 <HdkR> hm

06:23 <HdkR> Firmware issues outside of the signed PMU blobs?

06:32 <tomeu> alyssa: thanks, that's useful

06:32 <tomeu> alyssa: any hints at how we'd implement that?

07:34 chewitt has joined #panfrost

08:09 chewitt has quit [Quit: Zzz..]

08:33 chewitt has joined #panfrost

08:43 ppchain has quit [Quit: No Ping reply in 180 seconds.]

08:53 <narmstrong> alyssa: here is the apitrace of kmscube (master branch of apitrace on github, master branch of kmscube) http://termbin.com/emu7

08:55 <narmstrong> alyssa: On the panwrap dump, on the last draw leading to the FAULT, it faults on the first JOB_TYPE_TILER, but I don't have a f***** clue what's wrong about this particular tiler job... and it fails systematically, maybe it needs the replay feature ?

09:02 chewitt has quit [Quit: Zzz..]

09:35 paulk-leonov has quit [Ping timeout: 252 seconds]

09:39 paulk-leonov has joined #panfrost

09:43 paulk-leonov has quit [Excess Flood]

09:52 <narmstrong> Hmm, it locks at exactly the 127th draw

09:53 <narmstrong> But if I relaunch kmscube it's ok

09:54 paulk-leonov has joined #panfrost

09:58 paulk-leonov has quit [Ping timeout: 252 seconds]

10:08 <narmstrong> and I changed the content of the draw (redrawing the same or skipping some) so it's the content of the draw, but the number

10:08 <narmstrong> *not the content

10:08 <tomeu> nice finding :)

10:09 <tomeu> could be related to the atom numbers?

10:09 <narmstrong> I should have played with that much earlier

10:09 <tomeu> ah, right

10:09 <tomeu> it resets at 256

10:09 <tomeu> and there's two atoms per draw job

10:09 <narmstrong> oh yes, maybe

10:09 <tomeu> don't know why I don't see this problem though

10:10 paulk-leonov has joined #panfrost

10:10 <tomeu> uint8_t atom_counter = 0;

10:10 <tomeu> see allocate_atom()
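
The wraparound tomeu describes can be sketched in a few lines of C. This is only an illustration: the `allocate_atom()` below is a hypothetical stand-in for the real function, and `draws_before_wrap()` is a helper made up for this example. It assumes the counter starts at 0 and that every draw allocates the same number of atoms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the allocate_atom() mentioned in the log;
 * the real function may do more. An 8-bit counter goes 255 -> 0. */
static uint8_t atom_counter = 0;

static uint8_t allocate_atom(void)
{
    /* Postfix increment returns the old value; the stored value wraps mod 256. */
    return atom_counter++;
}

/* Count how many draws complete before an atom number is reused,
 * assuming (illustratively) atoms_per_draw allocations per draw. */
static unsigned draws_before_wrap(unsigned atoms_per_draw)
{
    assert(atoms_per_draw > 0);

    unsigned draws = 0;
    uint8_t last = 0;

    atom_counter = 0;
    for (;;) {
        for (unsigned i = 0; i < atoms_per_draw; i++) {
            uint8_t atom = allocate_atom();
            if (atom < last)
                return draws; /* the counter wrapped during this draw */
            last = atom;
        }
        draws++;
    }
}
```

With two atoms per draw, `draws_before_wrap(2)` gives 128 under these assumptions, which is in the neighbourhood of the 127th draw reported above but does not by itself explain the hang.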

10:15 * narmstrong looking at this

10:16 paulk-leonov has quit [Ping timeout: 260 seconds]

10:18 paulk-leonov has joined #panfrost

10:21 paulk-leonov has quit [Excess Flood]

10:22 paulk-leonov has joined #panfrost

10:24 <narmstrong> tomeu: nah it wraps correctly, but I suspect we don't read()

10:25 <narmstrong> `last_fragment_flushed = true;` in force_flush_fragment()

10:27 paulk-leonov has quit [Max SendQ exceeded]

10:28 paulk-leonov has joined #panfrost

10:31 paulk-leonov has quit [Max SendQ exceeded]

10:33 paulk-leonov has joined #panfrost

10:34 <tomeu> we don't read()?

10:35 <narmstrong> nop, it's disabled

10:35 <narmstrong> \o/

10:35 <narmstrong> now it works

10:37 paulk-leonov has quit [Max SendQ exceeded]

10:38 paulk-leonov has joined #panfrost

10:38 * tomeu is curious to see the diff

10:41 paulk-leonov has quit [Max SendQ exceeded]

10:43 <narmstrong> http://termbin.com/izka

10:43 paulk-leonov has joined #panfrost

10:47 paulk-leonov has quit [Max SendQ exceeded]

10:49 paulk-leonov has joined #panfrost

10:50 <narmstrong> i feel stupid, you already fixed this

10:51 <narmstrong> but it still failed

10:53 paulk-leonov has quit [Max SendQ exceeded]

10:54 paulk-leonov has joined #panfrost

10:58 paulk-leonov has quit [Max SendQ exceeded]

11:00 paulk-leonov has joined #panfrost

11:03 paulk-leonov has quit [Max SendQ exceeded]

11:05 paulk-leonov has joined #panfrost

11:08 paulk-leonov has quit [Max SendQ exceeded]

11:09 paulk-leonov has joined #panfrost

11:13 paulk-leonov has quit [Max SendQ exceeded]

11:15 paulk-leonov has joined #panfrost

11:19 paulk-leonov has quit [Max SendQ exceeded]

11:20 paulk-leonov has joined #panfrost

11:23 paulk-leonov has quit [Max SendQ exceeded]

11:23 * narmstrong Time to clean and resync with tomeu’s branch...

11:24 <narmstrong> Would be simpler if the winsys was in panfrost repo !

11:25 paulk-leonov has joined #panfrost

11:29 paulk-leonov has quit [Max SendQ exceeded]

11:31 paulk-leonov has joined #panfrost

11:35 paulk-leonov has quit [Max SendQ exceeded]

11:37 paulk-leonov has joined #panfrost

11:40 paulk-leonov has quit [Max SendQ exceeded]

11:42 paulk-leonov has joined #panfrost

11:46 paulk-leonov has quit [Max SendQ exceeded]

11:47 paulk-leonov has joined #panfrost

11:47 <tomeu> yeah, maybe we should merge it all

11:50 paulk-leonov has quit [Max SendQ exceeded]

11:52 paulk-leonov has joined #panfrost

11:56 paulk-leonov has quit [Max SendQ exceeded]

11:57 paulk-leonov has joined #panfrost

12:01 paulk-leonov has quit [Max SendQ exceeded]

12:03 paulk-leonov has joined #panfrost

12:07 urjaman has quit [Ping timeout: 250 seconds]

12:07 paulk-leonov has quit [Max SendQ exceeded]

12:08 paulk-leonov has joined #panfrost

12:10 chewitt has joined #panfrost

12:10 urjaman has joined #panfrost

12:11 paulk-leonov has quit [Max SendQ exceeded]

12:12 paulk-leonov has joined #panfrost

12:16 paulk-leonov has quit [Max SendQ exceeded]

12:17 paulk-leonov has joined #panfrost

12:21 paulk-leonov has quit [Max SendQ exceeded]

12:23 paulk-leonov has joined #panfrost

12:26 paulk-leonov has quit [Max SendQ exceeded]

12:28 paulk-leonov has joined #panfrost

12:32 paulk-leonov has quit [Max SendQ exceeded]

12:34 paulk-leonov has joined #panfrost

12:35 tomeu has left #panfrost [#panfrost]

12:37 paulk-leonov has quit [Max SendQ exceeded]

12:39 paulk-leonov has joined #panfrost

12:43 paulk-leonov has quit [Max SendQ exceeded]

12:45 paulk-leonov has joined #panfrost

12:48 paulk-leonov has quit [Max SendQ exceeded]

12:50 <chewitt> please merge down .. it's impossible to track stuff when there are many repos involved

12:50 paulk-leonov has joined #panfrost

12:53 <narmstrong> chewitt: ok I have a 18.3 branch on top of lima

12:53 <narmstrong> that works

12:54 <narmstrong> no more locks

12:54 paulk-leonov has quit [Max SendQ exceeded]

12:55 <narmstrong> mesa: https://gitlab.freedesktop.org/narmstrong/panfrost-mesa/commits/lima-panfrost-18.3

12:55 <narmstrong> driver: https://gitlab.freedesktop.org/narmstrong/mali_kbase/tree/TX041-SW-99002-r27p0-01rel0_panfrost

12:55 paulk-leonov has joined #panfrost

12:55 <narmstrong> driver build: make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- KDIR=/home/narmstrong/projects/amlogic/linux-upstream/ CONFIG_NAME=config.meson-gxm

12:56 <narmstrong> mesa build: meson -Dgbm=true -Dglx=disabled -Dgallium-drivers=lima,panfrost,meson -Dplatforms=drm build

12:58 <chewitt> I'll have a poke

12:58 paulk-leonov has quit [Max SendQ exceeded]

13:01 paulk-leonov has joined #panfrost

13:05 paulk-leonov has quit [Max SendQ exceeded]

13:05 tomeu has joined #panfrost

13:07 paulk-leonov has joined #panfrost

13:11 paulk-leonov has quit [Max SendQ exceeded]

13:13 paulk-leonov has joined #panfrost

13:16 paulk-leonov has quit [Max SendQ exceeded]

13:18 paulk-leonov has joined #panfrost

13:21 paulk-leonov has quit [Max SendQ exceeded]

13:24 paulk-leonov has joined #panfrost

13:28 paulk-leonov has quit [Max SendQ exceeded]

13:30 paulk-leonov has joined #panfrost

13:34 paulk-leonov has quit [Max SendQ exceeded]

13:36 paulk-leonov has joined #panfrost

13:40 paulk-leonov has quit [Max SendQ exceeded]

13:42 paulk-leonov has joined #panfrost

13:45 chewitt has quit [Quit: Zzz..]

13:45 paulk-leonov has quit [Max SendQ exceeded]

13:47 paulk-leonov has joined #panfrost

13:51 paulk-leonov has quit [Max SendQ exceeded]

13:53 paulk-leonov has joined #panfrost

13:55 paulk-leonov has quit [Remote host closed the connection]

14:11 chewitt has joined #panfrost

14:22 chewitt has quit [Quit: Zzz..]

14:25 chewitt has joined #panfrost

14:31 chewitt has quit [Quit: Zzz..]

14:41 raster has joined #panfrost

15:22 chewitt has joined #panfrost

15:24 <narmstrong> tomeu: ok, your fix was badly applied ;-) next time I cherry-pick your patches...

15:24 <narmstrong> alyssa: src/gallium/drivers/panfrost/include/meson.build breaks cross-compilation !

15:28 <narmstrong> damn I feel stupid...

15:28 chewitt has quit [Quit: Zzz..]

15:29 chewitt has joined #panfrost

15:31 <Lyude> narmstrong: that would be my fault

15:31 <Lyude> but also I only use cross compilation

15:31 <Lyude> what exactly is breaking?

15:32 <narmstrong> `src/gallium/drivers/panfrost/include/meson.build:3:16: ERROR: Can not run test applications in this cross environment.`

15:32 <Lyude> narmstrong: gah, it's supposed to print a notice that you can work around it by setting page_size manually in your toolchain file. tbh though setting it up so you can run test applications with cross compilation isn't terribly difficult

15:32 <Lyude> (that's how I have it setup here, lemme grab my toolchain file)

15:33 <narmstrong> I did a horrible hack https://gitlab.freedesktop.org/narmstrong/panfrost-mesa/commit/a6400f743b3355df4d934161dad8a151043b2d07

15:33 <Lyude> narmstrong: https://paste.fedoraproject.org/paste/LocNymLWyqN~yjCkAtUPqQ

15:33 <Lyude> in that instance /mnt/amethyst is an nfs mount of my vim2's rootfs

15:34 <narmstrong> interesting, but it's to cross-build in a build-system (LibreELEC for Kodi) so this kind of workaround is hard to implement

15:36 <Lyude> narmstrong: ahhh, just set page_size = 12 in your toolchain properties then

15:36 <Lyude> or page_shift rather

15:37 <Lyude> unfortunately without that defined mali_kbase doesn't really compile

15:38 <narmstrong> bad !

15:38 <narmstrong> did someone fix the `mali_ptr ptr : 64 - 10;` error on 32bit builds ?

15:40 <Lyude> I haven't tried yet-should probably get my T6xx system setup for that

15:40 <Lyude> narmstrong: yeah it's annoying

15:41 <chewitt> LE has 64bit kernel and 32bit userspace for compatibility with 32bit only libwidevine for Netflix/Amazon support

15:41 <Lyude> almost considered just adding a constructor to our mesa driver to just grab the pagesize/pageshift at runtime, then point the macro that mali_kbase uses at the global variable

15:42 <narmstrong> Lyude: the macro could also store it once at first call and use a global for the rest of the runtime (dirty, but won't impact performance)

15:42 <Lyude> narmstrong: mhm, true

15:42 <Lyude> either works
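
A minimal sketch of the "query once, then reuse a global" idea discussed above, assuming a POSIX system where `sysconf(_SC_PAGESIZE)` is available. The names `runtime_page_shift`, `PANFROST_PAGE_SHIFT` and `PANFROST_PAGE_SIZE` are made up for this example and are not the macros mali_kbase actually uses.

```c
#include <assert.h>
#include <unistd.h>

/* Cached page shift; 0 means "not queried yet". Hypothetical name. */
static unsigned cached_page_shift;

static unsigned runtime_page_shift(void)
{
    if (cached_page_shift == 0) {
        long size = sysconf(_SC_PAGESIZE); /* page size in bytes */
        assert(size > 0);

        /* Page sizes are powers of two, so find the matching shift. */
        unsigned shift = 0;
        while ((1L << shift) < size)
            shift++;

        cached_page_shift = shift;
    }
    return cached_page_shift;
}

/* Illustrative macros; the real headers would point their own
 * PAGE_SHIFT / PAGE_SIZE style macros at something like this. */
#define PANFROST_PAGE_SHIFT (runtime_page_shift())
#define PANFROST_PAGE_SIZE  (1UL << PANFROST_PAGE_SHIFT)
```

The lazy lookup is not thread-safe as written; the constructor variant Lyude mentions would avoid that by filling in the global before any other code runs.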

15:43 <narmstrong> So the ARM Mali blob had this hardcoded ?

15:43 <Lyude> narmstrong: I think? iirc android defines PAGE_SIZE

15:43 <Lyude> *PAGE_SHIFT

15:44 <Lyude> it's hard to tell

15:51 <alyssa> tomeu: Soooort of? In mali_payload_fragment, we have a pair for min_tile_coord/max_tile_coord, which controls (up to tile granularity) how much we writeout. (TODO: Combine with scissoring for a substantial perf optimization, probably). If you do a partial writeout, I'm assuming you need to set those appropriately to only "select" the window you care about. If the partial writeout doesn't divide nicely along tile lines (8x8 pixel tiles),

15:51 <alyssa> I'm assuming you also need to do a driver-side glScissor.
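
The tile-window arithmetic alyssa outlines might look roughly like the sketch below. Everything here is an assumption for illustration: the struct and function names are invented, the maximum tile coordinate is treated as inclusive, and the tile size is left as a parameter rather than hard-coded.

```c
#include <assert.h>
#include <stdbool.h>

/* Pixel rectangle to write out; x1/y1 are exclusive. Hypothetical type. */
struct rect {
    unsigned x0, y0, x1, y1;
};

/* Tile window covering the rectangle, plus whether a scissor is still
 * needed because the rectangle is not aligned to tile boundaries. */
struct tile_window {
    unsigned min_x, min_y;
    unsigned max_x, max_y; /* assumed inclusive */
    bool needs_scissor;
};

static struct tile_window tiles_for_rect(struct rect r, unsigned tile)
{
    assert(tile > 0 && r.x1 > r.x0 && r.y1 > r.y0);

    struct tile_window w;

    /* Round the start down and the end up to whole tiles. */
    w.min_x = r.x0 / tile;
    w.min_y = r.y0 / tile;
    w.max_x = (r.x1 - 1) / tile;
    w.max_y = (r.y1 - 1) / tile;

    /* Any edge off a tile boundary means extra pixels get selected. */
    w.needs_scissor = (r.x0 % tile) || (r.y0 % tile) ||
                      (r.x1 % tile) || (r.y1 % tile);
    return w;
}
```

For a rectangle that already sits on tile boundaries the window selects exactly the requested pixels; otherwise `needs_scissor` flags the case where the driver-side scissor would have to trim the result.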

15:51 <alyssa> That said, I don't know how the above scheme translates into the corresponding OpenGL extension

15:51 <alyssa> I could look into it if you'd like (I'm itching to do some real work again :P)

15:52 <tomeu> alyssa: well, that would be great!

15:52 <tomeu> even if you only get far enough to give some more detailed directions, that would be helpful

15:52 <tomeu> my panfrost work keeps being trampled by other stuff, but hopefully next week it will be calmer

15:53 <alyssa> Sure :)

15:54 <tomeu> if partial updates worked, then it would be much more comfortable to work on egl wayland clients and fixing things such as the gallium HUD

15:54 <alyssa> Alright

15:56 <Lyude> btw tomeu: I will make sure to create branches and stuff on the main panfrost page this weekend

15:56 <alyssa> tomeu: The problem with partial updates, fwiw, is that there's some intense pipelining going on and tilers have to clear everything every time, so unless you use the partial update egl extensions, the driver has to incur a serious performance hit loading everything back into memory first

15:56 <narmstrong> Lyude: for the mali_kbase driver, I did a clean rebase and clean support for the Amlogic S912, did you have a look at it ? https://gitlab.freedesktop.org/narmstrong/mali_kbase/tree/TX041-SW-99002-r27p0-01rel0_panfrost

15:56 <narmstrong> Lyude:

15:57 <narmstrong> Lyude: you should not have the initial fault anymore

15:57 <narmstrong> Lyude: seems Amlogic failed to integrate the T820 correctly and we need to manually enable/reset the cores

15:58 <alyssa> Rockchip <3

15:58 chewitt has quit [Quit: Zzz..]

15:59 <narmstrong> alyssa: does the mali_ptr need to be 64bit ?

15:59 <urjaman> yes

16:00 <urjaman> mali has 64bit pointers even if the ARM core has 32bit

16:00 <alyssa> ^^

16:00 <Lyude> narmstrong: a-ha!

16:00 <Lyude> I guessed that might be the case :)

16:00 <narmstrong> ok so mali_ptr should not be uintptr_t then...

16:00 <alyssa> Definitely not, no.

16:01 <Lyude> I may have done that by accident, whoops :p

16:01 <Lyude> narmstrong: anyway-I'll take a look next chance I get

16:01 <alyssa> No worries :)

16:01 <alyssa> (On my old branch, we have u64 mali_ptr. No idea what happened with the kbase stuff)
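
A small sketch of why `mali_ptr` wants to be a fixed 64-bit type rather than `uintptr_t`: on a 32-bit userspace `uintptr_t` is only 32 bits wide, so a bit-field such as `mali_ptr ptr : 64 - 10;` is wider than its type and fails to compile. The struct below is hypothetical (the field name `flags` and the layout are invented), and 64-bit bit-fields rely on implementation-defined behaviour that GCC and Clang accept.

```c
#include <assert.h>
#include <stdint.h>

/* GPU addresses are 64 bits wide regardless of the CPU's pointer size. */
typedef uint64_t mali_ptr;

_Static_assert(sizeof(mali_ptr) == 8, "GPU pointers must be 64 bits wide");

/* Hypothetical descriptor word: 10 low bits of flags, 54 bits of address. */
struct tagged_gpu_ptr {
    uint64_t flags : 10;
    mali_ptr ptr : 64 - 10;
};

/* Store an address in the bit-field and read it back. */
static mali_ptr roundtrip(mali_ptr value)
{
    struct tagged_gpu_ptr t = { .flags = 0, .ptr = value };
    return t.ptr;
}
```

With `typedef uintptr_t mali_ptr;` the same struct would stop compiling on a 32-bit build, which matches the error narmstrong asks about.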

16:01 <narmstrong> No problem, it's still very WiP ;-) just trying to make it go further !

16:13 chewitt has joined #panfrost

16:29 chewitt has quit [Quit: Zzz..]

16:56 klaxa has joined #panfrost

18:13 anarsoul|2 has quit [Ping timeout: 272 seconds]

18:14 raster has quit [Remote host closed the connection]

18:17 jernej has joined #panfrost

18:25 <hanetzer> finally got distcc setup :)

18:36 chewitt has joined #panfrost

18:43 chewitt has quit [Quit: Zzz..]

18:54 chewitt has joined #panfrost

18:59 anarsoul|2 has joined #panfrost

19:00 chewitt has quit [Quit: Zzz..]

19:25 belgin has joined #panfrost

19:35 NeuroScr has quit [Quit: NeuroScr]

21:09 belgin has quit [Quit: Leaving]

21:54 urjaman has quit [Ping timeout: 250 seconds]

21:57 belgin has joined #panfrost

22:01 urjaman has joined #panfrost

22:08 pH5 has quit [Quit: -_-]

22:56 AntonioND has joined #panfrost

23:46 AntonioND has quit [Quit: Quit]