#panfrost on 2019-07-09 — irc logs at freenode.irclog.whitequark.org

2019-02-15 17:52 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:06 rhyskidd has quit [Ping timeout: 268 seconds]

00:11 yann has joined #panfrost

00:14 rhyskidd has joined #panfrost

00:22 calcprogrammer1 has joined #panfrost

00:26 NeuroScr has quit [Quit: NeuroScr]

00:26 <calcprogrammer1> I'm trying to test Panfrost on my Rock Pi 4B. I'm using the official Debian image, dist-upgraded to buster. Built 5.2 kernel in a debian arm64 chroot and it's booting fine. Panfrost is loaded, /sys/class/drm/card0 and card1, renderD128 all there. Built several versions of Mesa several different ways (debian packaging, ninja install, with git master, 19.1, 19.1.1). Nothing seems to work though. I've tried

00:26 <calcprogrammer1> startx/lightdm, kmscube, and weston. Every time I start something that should use Panfrost, I get kernel errors:

00:26 <calcprogrammer1> [ 235.934202] panfrost ff9a0000.gpu: js fault, js=1, status=DATA_INVALID_FAULT, head=0x2400f00, tail=0x2400f00

00:26 <calcprogrammer1> [ 235.935146] panfrost ff9a0000.gpu: gpu sched timeout, js=1, status=0x58, head=0x2400f00, tail=0x2400f00, sched_job=000000005114aee9
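
The two kernel messages above share a common shape, so they can be pulled apart mechanically when triaging a long dmesg. The following is a minimal sketch, not part of the original conversation: the regular expression is an assumption modelled only on the two quoted lines (not on the kernel source), and `parse_panfrost_faults` is a hypothetical helper name.

```python
import re

# Pattern modelled on the two dmesg lines quoted above. The field layout is
# an assumption drawn from those lines, not from the kernel source.
FAULT_RE = re.compile(
    r"panfrost (?P<dev>\S+): (?P<kind>js fault|gpu sched timeout), "
    r"js=(?P<js>\d+), status=(?P<status>\w+), "
    r"head=(?P<head>0x[0-9a-fA-F]+), tail=(?P<tail>0x[0-9a-fA-F]+)"
)


def parse_panfrost_faults(text):
    """Return one dict per matching panfrost fault/timeout line in text."""
    events = []
    for line in text.splitlines():
        match = FAULT_RE.search(line)
        if match is None:
            continue  # unrelated kernel message
        event = match.groupdict()
        event["js"] = int(event["js"])          # job slot index
        event["head"] = int(event["head"], 16)  # job chain head address
        event["tail"] = int(event["tail"], 16)  # job chain tail address
        events.append(event)
    return events
```

Feeding it the output of `dmesg` gives a list of events that can be grouped by `status` or `js` to see whether the same fault keeps recurring.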

00:30 <HdkR> Ah, there was a bug about that just recently

00:34 <calcprogrammer1> I think I saw it: https://bugs.freedesktop.org/show_bug.cgi?id=111036?

00:34 <calcprogrammer1> OP said cloning master fixed it for him, but it didn't for me

00:42 <HdkR> Maybe you need a newer kernel

00:50 <calcprogrammer1> I'm using 5.2 final

01:05 vstehle has quit [Ping timeout: 244 seconds]

01:12 rhyskidd has quit [Quit: rhyskidd]

01:56 NeuroScr has joined #panfrost

02:27 Elpaulo has joined #panfrost

03:13 indy has quit [Ping timeout: 272 seconds]

03:53 marvs has joined #panfrost

03:59 davidlt has joined #panfrost

04:19 davidlt has quit [Ping timeout: 245 seconds]

04:47 NeuroScr has quit [Quit: NeuroScr]

05:00 vstehle has joined #panfrost

07:44 yann has quit [Ping timeout: 272 seconds]

08:21 tgall_foo has quit [Ping timeout: 272 seconds]

08:27 cwabbott has quit [Quit: cwabbott]

08:27 cwabbott has joined #panfrost

08:31 cwabbott has quit [Client Quit]

08:31 cwabbott has joined #panfrost

08:42 cwabbott has quit [Quit: cwabbott]

08:42 cwabbott has joined #panfrost

08:48 yann has joined #panfrost

08:51 cwabbott has quit [Quit: cwabbott]

08:51 cwabbott has joined #panfrost

09:17 raster has joined #panfrost

09:23 raster has quit [Remote host closed the connection]

09:24 raster has joined #panfrost

09:31 rhyskidd has joined #panfrost

10:10 adjtm has quit [Ping timeout: 248 seconds]

10:14 davidlt has joined #panfrost

10:37 davidlt has quit [Ping timeout: 272 seconds]

10:41 davidlt has joined #panfrost

10:46 adjtm has joined #panfrost

10:47 davidlt has quit [Ping timeout: 272 seconds]

10:50 davidlt has joined #panfrost

10:59 davidlt has quit [Ping timeout: 246 seconds]

11:20 afaerber has quit [Quit: Leaving]

11:36 afaerber has joined #panfrost

11:50 davidlt has joined #panfrost

11:55 xHire has quit [*.net *.split]

11:55 empty_string has quit [*.net *.split]

11:57 xHire has joined #panfrost

12:01 empty_string has joined #panfrost

12:08 yann has quit [Remote host closed the connection]

13:33 <alyssa> calcprogrammer1: "Rock Pi 4B" is RK3399, yes?

13:35 <calcprogrammer1> yes

13:35 <alyssa> Alright, that's supported.

13:36 <alyssa> Keep in mind, 19.1 is not supported.

13:37 <alyssa> I don't know why people are trying to use it; Panfrost is specifically disabled in 19.1 since it wasn't ready for end-users and then people reenabled it and suddenly bugs happen (I wonder why)...

13:37 <alyssa> 19.2, on the other hand, I hope will be great!

13:46 jernej has quit [Quit: Free ZNC ~ Powered by LunarBNC: https://LunarBNC.net]

13:51 <davidlt> alyssa, are you planning to get a Pinebook Pro?

13:54 <alyssa> davidlt: I mean, I already have more RK3399 devices than I know what to do with ;)

13:55 <davidlt> true, but it's a portable test bench with screen, keyboard and battery ;)

13:55 <alyssa> So is Kevin

13:56 <davidlt> silicon wise -- yes

13:56 <davidlt> I don't think Pinebook Pro uses USB PD for charging

13:56 <davidlt> There is BT 5.0 support in Pinebook Pro, ability to use NVMe M.2, removable MMC storage

13:57 <davidlt> and new addition: privacy switches

13:57 herbmilleriw has joined #panfrost

13:59 <daniels> alyssa: is 19.1 still going to see another point release? if so, might be good to add a fprintf(stderr, "literally don't use this please\n"); to screen init

14:08 <tomeu> alyssa: wonder what could be done about AFBC and modifiers, if we don't know how to drive the display and GPU HW for each of the supported AFBC variants

14:09 <tomeu> guess AFBC_FORMAT_MOD_ROCKCHIP refers to a specific variant

14:13 <daniels> yeah, it does

14:15 <daniels> the definition is down in a vendor tree somewhere

14:15 <tomeu> and guess we don't know which?

14:17 <daniels> annoyingly I can't find the thread where AFBC_FORMAT_MOD_ROCKCHIP was actually submitted - I guess we'd have to figure out what that actually maps to in terms of the existing modifiers

14:17 <daniels> probably the best way would be to ask the ChromeOS team to have a look and document it?

14:18 <tomeu> yeah, we could use some help here

14:46 jernej has joined #panfrost

14:48 <alyssa> davidlt: Hey, I *like* type-C ;)

14:52 <alyssa> tomeu: https://lkml.org/lkml/2016/9/9/882

14:53 <tomeu> yeah, I have checked the chromeos and rockchip kernels, and they are equally unhelpful

14:53 <alyssa> tomeu: https://lkml.org/lkml/2016/9/13/423

14:53 <alyssa> RK never replied to that..

14:54 <tomeu> yeah, i guess only the HW people know the details

14:55 <alyssa> Granted, I'm not sure what exactly our hw does natively (on the GPU) and, assuming there are multiple configurations, how to poke at the other ones

14:55 <alyssa> We know it's 16x16

14:55 <alyssa> We don't know which of the other modifiers it does

14:57 * alyssa grabs notes

15:00 <raster> argh

15:00 <raster> why doesnt ayufan upstream his dts's for the rockpro64?

15:00 * raster hates ending up with kernels that dont boot because they cant find the emmc...

15:01 <alyssa> Arrr

15:05 <raster> aye

15:05 <raster> 2 gl clients in a wl compositor does not make panfrost happy :)

15:06 <tomeu> that works here

15:06 <raster> let me get my mesa totally up to date

15:06 <raster> tomeu: things ended up frozen for me

15:06 <raster> userspace - couldnt ssh in :)

15:06 <raster> using linux-next

15:07 <raster> i had to fix linux-next - build was broken. amdgpu drivers and new HMM stuff was broken

15:07 <raster> didnt even compile

15:09 <raster> # of build targets for mesa keeps going up. its 1008 now for me :)

15:09 <raster> i remember it was like ~700 not long ago

15:09 <raster> then 900 odd

15:09 <raster> now over 1k

15:10 <alyssa> tomeu: /me grumbles about code bloat

15:10 <raster> code always bloats

15:10 <raster> or it is dead

15:10 <raster> :)

15:18 <rcf> raster: on boring 5.2 I get similar results, though it didn't completely lock up.

15:19 <raster> well i waited a min or 2 to ssh login

15:19 <raster> gave up and hit reset :)

15:19 <raster> it may have been working... slowly :)

15:20 raster has quit [Remote host closed the connection]

15:20 <tomeu> OOM?

15:20 <rcf> If I weren't focused in the terminal and able to hit Ctrl+C quickly it may not have ended so easily.

15:20 raster has joined #panfrost

15:20 <rcf> That would seem to be the case, actually.

15:21 <alyssa> OOM is very likely

15:23 <raster> i have this weird thing still- this is newish in panfrost

15:23 <raster> in a tty

15:24 <raster> ELM_ACCEL=none elementary_test -to animation

15:24 <raster> ok

15:24 <raster> works - sw rendering. drm/kms ok

15:25 <raster> enlightenment - doesnt work - software or gl init

15:25 <raster> now

15:25 <raster> ELM_ACCEL=gl elementary_test -to animation

15:25 <raster> fails

15:25 <raster> _evas_outbuf_egl_setup() eglCreateWindowSurface() fail for 0xaaab11ad74f0. code=0x3003

15:25 <raster> havent quite dug into what config/params it is

15:25 <raster> BUT

15:26 <raster> the interesting thing is after this

15:26 <raster> enlightenment does start

15:26 <raster> something is up/screwy in kernel context somewhere

15:26 <raster> and gl works

15:26 <raster> this failed init then triggers future processes to work

15:26 <raster> well elm test wont work

15:26 <raster> no matter what

15:29 <raster> but

15:29 <raster> fbos seem to function now

15:29 <raster> yay

15:30 <raster> stuff not totally broken

15:30 <raster> woot

15:30 <daniels> 0x3003 is EGL_BAD_ALLOC, which is usually only returned if you try to create two EGL surfaces from the same native window

15:30 <daniels> if you run with EGL_LOG_LEVEL=debug, it should tell you where/why the error is being generated

15:31 <daniels> given FBOs are working, it's not going to be down in the kernel, since the path to create an FBO and the path to create a native surface are exactly the same as far as the driver/kernel are concerned; the only difference is in higher API layers
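
The code in the failure message is one of the standard EGL error values. As a minimal sketch that is not part of the original conversation, the lookup below maps the numeric codes defined by the EGL specification to their names; `egl_error_name` is a hypothetical helper name.

```python
# EGL error codes as defined by the EGL specification (0x3000..0x300E).
EGL_ERRORS = {
    0x3000: "EGL_SUCCESS",
    0x3001: "EGL_NOT_INITIALIZED",
    0x3002: "EGL_BAD_ACCESS",
    0x3003: "EGL_BAD_ALLOC",
    0x3004: "EGL_BAD_ATTRIBUTE",
    0x3005: "EGL_BAD_CONFIG",
    0x3006: "EGL_BAD_CONTEXT",
    0x3007: "EGL_BAD_CURRENT_SURFACE",
    0x3008: "EGL_BAD_DISPLAY",
    0x3009: "EGL_BAD_MATCH",
    0x300A: "EGL_BAD_NATIVE_PIXMAP",
    0x300B: "EGL_BAD_NATIVE_WINDOW",
    0x300C: "EGL_BAD_PARAMETER",
    0x300D: "EGL_BAD_SURFACE",
    0x300E: "EGL_CONTEXT_LOST",
}


def egl_error_name(code):
    """Translate a numeric eglGetError() value into its symbolic name."""
    return EGL_ERRORS.get(code, f"unknown EGL error 0x{code:04X}")


print(egl_error_name(0x3003))
```

For the `code=0x3003` reported by `eglCreateWindowSurface()` above, this yields `EGL_BAD_ALLOC`, matching what daniels says.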

15:32 <raster> oh fbo's are parallel

15:32 <raster> this is long pre any fbos

15:32 <daniels> right, but I mean, if you can allocate FBOs and not window surfaces, then I would definitely be looking at EGL usage rather than 'something is up/screwy in kernel context'

15:33 <raster> well more one process changes the behavior of a future process

15:33 <raster> 2 unrelated processes

15:33 <raster> 1st always fails to init

15:33 <raster> run another that fails too

15:33 <raster> then first succeeds after that

15:34 <raster> obviously some out-of-process state changed permanently

15:34 <raster> my guess its kernel state

15:34 <raster> well there we go

15:34 <raster> its going super slow again

15:35 <raster> :)

15:35 <raster> 2 gl using wl clients hammering away - panfrost is not happy

15:35 <raster> and i'm pretty up to date now

15:36 <alyssa> Probably still OOM..

15:36 <alyssa> How much memory is on the board?

15:36 <raster> 4gb

15:36 <raster> :)

15:36 <alyssa> Hmm :/

15:37 <raster> ok

15:37 <raster> i got a shell

15:37 <raster> just

15:37 <raster> free -m

15:37 <raster> waiting for the results

15:37 <raster> :)

15:37 <raster> not sure its oom

15:37 <raster> nope

15:37 <raster> 247m used

15:37 <alyssa> Hrm

15:37 <raster> i think its gpu resets

15:37 <alyssa> That would do it, yes

15:37 <raster> i see a lot of those

15:37 <alyssa> raster: If you're able, could you launch with:

15:38 <alyssa> PAN_MESA_DEBUG=trace

15:38 <alyssa> That will spam a *lot* to stdout

15:38 <alyssa> So redirect that to a file, compress it, and send it to me
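
The capture alyssa describes (set `PAN_MESA_DEBUG=trace`, redirect stdout to a file, compress it) can be scripted. The following is a rough sketch that is not part of the original conversation: `capture_trace` is a hypothetical helper, the xz format is chosen only as one reasonable compressor, and only stdout is captured because that is where the trace is said to go.

```python
import lzma
import os
import subprocess


def capture_trace(cmd, out_path, env_var="PAN_MESA_DEBUG", value="trace"):
    """Run cmd with the debug variable set and xz-compress its stdout.

    Returns the exit status of the traced process.
    """
    env = dict(os.environ)
    env[env_var] = value  # same effect as PAN_MESA_DEBUG=trace in a shell

    proc = subprocess.Popen(cmd, env=env, stdout=subprocess.PIPE)
    # Stream in chunks so a very large trace never has to fit in memory.
    with lzma.open(out_path, "wb") as out:
        for chunk in iter(lambda: proc.stdout.read(65536), b""):
            out.write(chunk)
    return proc.wait()
```

A call such as `capture_trace(["kmscube"], "panfrost-trace.xz")` would leave a compressed trace ready to upload; the client name and output path here are placeholders.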

15:38 <raster> oh dear that will be spam from multiple processes

15:38 <raster> remember this is gl wl compositor with 4 gl wl clients

15:38 <alyssa> Presumably you'll have GPU faults even with 1 client..?

15:39 <raster> 2 of those are hammering away at 60fps with animation (simple stuff tho - like 10 triangles each)

15:39 <raster> oh i see it often - yeah

15:39 <raster> oh god

15:39 <raster> dmesg is taking forever

15:39 <raster> lots of unhandles pagefaults

15:39 <raster> err unhandled pagefaults

15:39 <alyssa> Yeah, I need a trace then

15:40 <alyssa> And preferably also a snippet of dmesg (just enough to see some representative failing accesses)

15:40 davidlt has quit [Ping timeout: 245 seconds]

15:40 <raster> also drm atomic wait for vblank timeout too

15:40 <raster> i need to get something simpler/cleaner for u than this mess

15:40 <raster> :)

15:41 <alyssa> That would be appreciated! :P

15:41 <alyssa> If you run E alone without any clients, does it fault too?

15:41 <raster> just letting u know the rough kind of things i am seeing

15:41 <raster> well getting e to start is the above adventure

15:41 <raster> it fails to init at all unless i get elementary_test to fail to init egl first

15:42 <raster> something is up there with 1 process changing some kernel state so another then succeeds

15:42 <raster> well i smell a rat :)

15:42 <raster> i'll start with that

15:42 <alyssa> Hm?

15:42 <raster> let me reboto

15:42 <raster> this is unusable :)

15:42 <raster> reboot :)

15:44 <raster> ok

15:44 <raster> 1st run

15:44 <raster> oh wait

15:44 <raster> it works now

15:44 <raster> wtf?

15:44 <raster> this is random

15:46 <raster> ok

15:46 <raster> e is up

15:46 <raster> no dmesg complaints

15:46 <raster> all good

15:47 <raster> 1 terminology up - gl rendering with htop (or should be gl)

15:48 <raster> ok

15:48 <raster> first reset

15:48 <raster> clean

15:49 <raster> https://phab.enlightenment.org/P309

15:50 <daniels> raster: wait timeouts are to be expected when the GPU's hung - we're asking KMS to display something which will never be ready

15:50 <raster> yeah

15:50 <raster> tho pagefaults come with it

15:54 <raster> hmmm

15:55 <raster> oh crap

15:55 <raster> its gone into slow mode

15:55 <raster> i havent enabled any debug env vars yet

15:55 <raster> but...

15:56 <raster> DRM_IOCTL_PANFROST_CREATE_BO failed: -1

15:56 <raster> mmap failed: 0xffffffffffffffff

15:56 <raster> DRM_IOCTL_PANFROST_MMAP_BO failed: -1

15:56 <raster> THAT doesnt sound good :)

15:56 <raster> someone is not checking mmap return :)
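
The address printed in the "mmap failed" message is what a failed mapping looks like: POSIX `mmap` returns `MAP_FAILED`, defined as `(void *) -1`, which is all ones when printed as a 64-bit pointer. The snippet below is a small sketch, not part of the original conversation, that makes the correspondence explicit; the helper names are hypothetical and a 64-bit pointer width is assumed.

```python
POINTER_BITS = 64  # assumption: 64-bit target such as arm64


def as_signed(value, bits=POINTER_BITS):
    """Reinterpret an unsigned pointer-sized integer as a signed one."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def is_map_failed(addr):
    """True if addr is the MAP_FAILED sentinel, i.e. (void *) -1."""
    return as_signed(addr) == -1


print(is_map_failed(0xFFFFFFFFFFFFFFFF))
```

Any caller of `mmap` needs an equivalent comparison against `MAP_FAILED` before using the returned address, which is the check raster suspects is missing.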

15:56 <raster> now lets get u some more

15:56 <raster> gah

15:57 <raster> system unusable again :(

15:57 <raster> have to roll now - allhandies :)

15:59 herbmilleriw has quit [Quit: Konversation terminated!]

16:42 herbmilleriw has joined #panfrost

16:47 herbmilleriw has quit [Quit: Konversation terminated!]

17:01 herbmilleriw has joined #panfrost

17:27 <raster> alyssa: http://www.rasterman.com/files/mesa-panlog.xz

17:27 <raster> funtimes

17:27 <raster> :)

17:30 <alyssa> raster: And a snippet of the MMU faults when that was taken..?

17:30 <raster> hmm

17:30 <raster> u mean dmesg?

17:30 <alyssa> Ye

17:30 <raster> oh

17:30 <raster> damn

17:30 <raster> lost that

17:30 * alyssa doesn't see any faults from that log

17:31 <raster> but they look very much like

17:31 <raster> https://phab.enlightenment.org/P309

17:31 <raster> timestamps will vary a bit but i see the same kind of thing a lot

17:31 <alyssa> OK, that helps

17:31 <raster> always pagefault in AS0 at VA 0...

17:32 <alyssa> Problem is, I don't see any faulting descriptors in that log

17:32 <alyssa> You did mention it was maybe sporadic?

17:32 <raster> i can get it to happen often if i have 2 gl clients continually rendering

17:33 <raster> i get this eventually with glmark too in kms rendering too (no compositor)

17:33 <raster> so its relatively common to see this stuff

17:33 <raster> just wondering

17:34 <raster> it's not common/easy to see for you?

17:36 <alyssa> Not like you're describing, no

17:37 <raster> hmm interesting

17:37 <raster> ok

17:37 <raster> i guess i shall have to make an effort to create lots of them for you :)

17:37 <raster> lots and lots and lots and lots.... :)

17:37 <raster> and more

17:38 <raster> and more.... and lots... and more and lots and way more and looooots :)

17:38 <alyssa> Just one is good if there's a good trace :p

17:38 <raster> but you need a good one :) - i made a script

17:38 <raster> i'll get the compositor log too

17:40 <alyssa> Hey, now I'm getting them on my end too

17:40 <alyssa> I think it's contagious

17:40 <raster> MUHAHAHHAHA

17:41 <alyssa> ...No, that's a different fault

17:45 <raster> hmmm

17:45 <raster> new fun

17:45 <raster> grrrr

17:46 <raster> stuff locked up with no dmesg log there

17:49 <raster> oh fantastic

17:49 <raster> e locked up at 100% cpu

17:49 <raster> hmmm

17:50 <raster> yeah

17:51 <raster> pandecode badness

17:52 <raster> https://phab.enlightenment.org/P310

17:52 <raster> stuck there in an infinite loop

17:53 <raster> seems to stay there

17:53 <raster> time to roll before it rains

17:53 raster has quit [Remote host closed the connection]

17:54 <alyssa> raster: okay, that was admittedly wacky code

17:54 <alyssa> Instancing support is veeery much experimental

17:55 <alyssa> Speaking of, is E using instancing? I thought we're only exposing GLES2

17:55 * alyssa has a branch to stop exposing the extension incorrectly

18:06 TheCycoTWO is now known as TheCycoONE

18:14 stikonas has joined #panfrost

18:42 Lyude has quit [Read error: Connection reset by peer]

18:44 Lyude has joined #panfrost

19:08 davidlt has joined #panfrost

19:21 afaerber has quit [Quit: Leaving]

19:25 TheKit has quit [Ping timeout: 258 seconds]

19:25 TheKit has joined #panfrost

19:32 BenG83 has joined #panfrost

19:33 afaerber has joined #panfrost

19:52 davidlt has quit [Ping timeout: 245 seconds]

20:17 stikonas has quit [Ping timeout: 276 seconds]

20:25 BenG83 has quit [Remote host closed the connection]

20:32 stikonas has joined #panfrost

20:54 stikonas_ has joined #panfrost

20:55 stikonas has quit [Read error: Connection reset by peer]

21:50 vstehle has quit [Ping timeout: 245 seconds]

22:17 vstehle has joined #panfrost

22:39 <hanetzer> kernel 5.2 hit my distro sometime recently :D

23:02 stikonas_ has quit [Remote host closed the connection]

23:05 <anarsoul> hanetzer: congrats :)

23:06 <anarsoul> now you run new shiny kernel with in-tree panfrost

23:09 <hanetzer> aye. gotta do some building :P

23:10 NeuroScr has joined #panfrost

23:38 Lyude has quit [Ping timeout: 246 seconds]

23:58 Lyude has joined #panfrost