#panfrost on 2020-07-20 — irc logs at freenode.irclog.whitequark.org

2019-09-06 11:20 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:18 macc24 has quit [Ping timeout: 260 seconds]

00:36 stikonas has quit [Ping timeout: 272 seconds]

01:05 cwabbott has quit [Quit: cwabbott]

01:06 cwabbott has joined #panfrost

01:25 kaspter has quit [Ping timeout: 246 seconds]

01:25 kaspter has joined #panfrost

03:30 rhyskidd has quit [Remote host closed the connection]

03:32 rhyskidd has joined #panfrost

03:38 rhyskidd has quit [Remote host closed the connection]

03:41 davidlt has joined #panfrost

03:59 vstehle has quit [Ping timeout: 258 seconds]

04:06 kaspter has quit [Ping timeout: 256 seconds]

04:06 kaspter has joined #panfrost

04:20 icecream95 has joined #panfrost

04:58 icecrea105 has joined #panfrost

05:00 vstehle has joined #panfrost

05:01 icecream95 has quit [Ping timeout: 264 seconds]

05:12 macc24 has joined #panfrost

05:57 fysa has quit [Ping timeout: 256 seconds]

06:10 icecrea105 has quit [Quit: leaving]

06:10 icecream95 has joined #panfrost

06:31 anarsoul|2 has quit [Ping timeout: 256 seconds]

06:33 anarsoul has joined #panfrost

06:51 guillaume_g has joined #panfrost

06:54 robmur01_ has joined #panfrost

06:57 robmur01 has quit [Ping timeout: 240 seconds]

07:22 raster has joined #panfrost

09:21 <tomeu> narmstrong: chewitt: can you remind me what's the mali_kbase that should work on the odroid n2?

09:22 <chewitt> https://github.com/LibreELEC/mali-bifrost/commits/BX301A01B-SW-99002-r16p0-01rel0

09:22 <tomeu> thanks~

09:22 <tomeu> thanks!, even

09:23 <chewitt> also need this in the kernel https://github.com/chewitt/linux/commit/69fdb89dd930fbf72b8ac4ab3a842abdc324ede1

09:23 <chewitt> I didn't figure out what changes are needed in kbase to fix for that yet (not a high priority)

09:23 stikonas has joined #panfrost

09:25 rhyskidd has joined #panfrost

09:33 <tomeu> ok, I just commented those lines out in kbase

09:33 <tomeu> got it working now

09:41 <chewitt> btw, I did a bunch of testing after the amlogic email

09:42 <chewitt> T820 with Neil's patch for gxm/g12 mali devices .. all fine

09:42 <chewitt> G31 with the patch .. dmesg is clean

09:42 <chewitt> G52 with the patch .. shows some faults

09:43 <chewitt> T820 no patch .. shows some errors but Kodi still comes up and works fine

09:43 <chewitt> G31 no patch .. shows errors and Kodi shows black screen but does appear to start in the background (can hear audio noises from the GUI)

09:44 <chewitt> G52 no patch .. same as G31

09:44 <chewitt> bifrost devices .. lots of scheduler faults and such at the same time

09:44 maccraft has joined #panfrost

09:45 <maccraft> hello, is hardware video decoding supported?

09:45 <chewitt> so Sky Zhou's comment that bifrost doesn't need the special handling and T820 does .. is the opposite of my findings :)

09:46 <chewitt> maccraft panfrost has nothing to do with hardware video decoding.. that's another chip

09:46 <maccraft> chewitt: oh

09:47 <chewitt> panfrost is accelerated GL/GLED (GUI rendering) only

09:47 <chewitt> GL/GLES

09:47 <tomeu> chewitt: just did the same testing here, and with MALI_PLATFORM_NAME=devicetree and without MALI_PLATFORM_POWER_DOWN_ONLY, stuff works fine on the n2

09:48 <tomeu> so looks as if the GPU integration on the soc on the n2 is just normal

09:52 <l-as> what patch is this?

09:56 maccraft has quit [Quit: WeeChat 2.8]

09:58 ckeepax has joined #panfrost

10:04 <chewitt> https://github.com/chewitt/linux/commit/9f7dcd3199688c89a4b8ec15183d2fe1654169cf

10:05 <l-as> thanks!

10:06 <chewitt> tomeu with the patch added all GPUs start and reach "[drm] Initialized panfrost 1.1.0 20180908 for d00c0000.gpu on minor 1" with no errors

10:06 <chewitt> errors come afterwards

10:06 <tomeu> chewitt: yeah, but kbase seems to do fine without that

10:08 <l-as> chewitt: I assume this patch supersedes `drm/panfrost: add support for custom soft-reset on Amlogic GXM` ?

10:08 stikonas has quit [*.net *.split]

10:08 ckeepax has quit [*.net *.split]

10:08 TheKit has quit [*.net *.split]

10:08 nlhowell1 has quit [*.net *.split]

10:08 rtp_ has quit [*.net *.split]

10:09 gcl has quit [*.net *.split]

10:09 <chewitt> it's the same patch but also including the meson-g12-mali compatible

10:09 raster has quit [Quit: Gettin' stinky!]

10:09 <l-as> huh, thanks

10:10 ckeepax has joined #panfrost

10:10 TheKit has joined #panfrost

10:10 stikonas has joined #panfrost

10:10 nlhowell1 has joined #panfrost

10:10 rtp_ has joined #panfrost

10:10 gcl has joined #panfrost

10:10 <tomeu> robmur01_: robher: is there a branch somewhere collecting all panfrost patches?

10:13 raster has joined #panfrost

10:14 <robmur01_> tomeu: not that I'm aware of, beyond the ones that you and narmstrong were keeping a while back

10:14 icecream95 has quit [Ping timeout: 260 seconds]

10:14 robmur01_ is now known as robmur01

11:04 <tomeu> robmur01: ok, thanks

11:10 <chewitt> tomeu https://patchwork.kernel.org/project/dri-devel/list/?series=&submitter=&state=*&q=panfrost&archive=both&delegate=

11:11 <chewitt> that's the search I use for tracking patches

11:12 <tomeu> hrm, looks like we have some work to do

11:13 <chewitt> I was trying to pick stuff into my branch for testing but eventually I ran into bits that didn't apply clean or were part of other large series

11:14 <chewitt> most of it looked like cleanup/etiquette rather than a good old-fashioned fix

11:15 <chewitt> then the day-job intervened and I got behind, and didn't work up the enthusiasm to catch-up again yet

11:42 nlhowell1 has quit [Ping timeout: 240 seconds]

11:47 rhyskidd has quit [Read error: Connection reset by peer]

12:00 nlhowell1 has joined #panfrost

12:07 gtucker has joined #panfrost

12:09 stikonas_ has joined #panfrost

12:09 stikonas has quit [Ping timeout: 240 seconds]

12:22 nlhowell1 has quit [Ping timeout: 256 seconds]

12:38 <daniels> chewitt, robmur01: I find this more friendly FWIW https://patchwork.freedesktop.org/project/dri-devel/series/?ordering=-last_updated

13:18 nlhowell1 has joined #panfrost

13:18 Ntemis has joined #panfrost

13:19 Ntemis has quit [Read error: Connection reset by peer]

13:27 <alyssa> tomeu: "MALI_PLATFORM_NAME=devicetree and without MALI_PLATFORM_POWER_DOWN_ONLY, stuff works fine on the n2" that much is good to hear

13:31 <mchehab> hi, I'm trying to test the Panfrost driver on a SoC with a Mali T720... there is already a working drm/kms driver for such hardware, but when I'm trying to enable Panfrost, X doesn't start, returning:

13:31 <mchehab> [ 360.880] (II) modeset(G0): using drv /dev/dri/card1

13:31 <mchehab> [ 360.880] (EE) Cannot run in framebuffer mode. Please specify busIDs for all framebuffer devices

13:32 <mchehab> I'm using the version at Kernel 5.7

13:34 <tomeu> mchehab: don't know about X any more, but I don't see anything in that referring to panfrost

13:34 <tomeu> all seems to be display-related

13:35 <mchehab> tomeu, basically, card0 there is the drm/kms driver, while card1 is the Panfrost driver

13:35 <urjaman> you shouldn't be telling Xorg about panfrost

13:36 <urjaman> or pointing it to the panfrost DRM device, that is

13:36 <mchehab> I didn't make any changes to Xorg settings

13:36 <alyssa> mchehab: What SoC is this?

13:36 <alyssa> and specifically what drm/kms driver name?

13:36 <mchehab> it is a HiKey970

13:36 <urjaman> but yeah I dunno, it just works for my C201 so ... *shrug*

13:37 <tomeu> mchehab: guess when you enabled panfrost, then it probed before the kms driver and thus the device file name changed?

13:37 <alyssa> isn't H970 bifrost?

13:37 <mchehab> [ 3.475015] panfrost e82c0000.gpu: clock rate = 550000000

13:37 <mchehab> [ 3.636768] panfrost e82c0000.gpu: mali-g72 id 0x6221 major 0x0 minor 0x0 status 0x1

13:38 <mchehab> [ 3.662139] panfrost e82c0000.gpu: Features: L2:0x07130206 Shader:0x00000000 Tiler:0x00000809 Mem:0x101 MMU:0x00002830 AS:0xff JS:0x7

13:38 <mchehab> [ 3.681561] panfrost e82c0000.gpu: shader_present=0xfff l2_present=0x1

13:38 <mchehab> [ 3.694680] [drm] Initialized panfrost 1.1.0 20180908 for e82c0000.gpu on minor 0

13:38 <mchehab> [ 3.645063] panfrost e82c0000.gpu: features: 00000000,13de77ff, issues: 00000000,00000400

13:38 <alyssa> G72/X isn't supported yet

13:38 <mchehab> does it work with Wayland?

13:39 <alyssa> haven't tried

13:39 <alyssa> so a firm maybe

13:43 <mchehab> tomeu: both drivers were built with "Y"

13:44 <tomeu> alyssa: actually, I may have to take that claim back, as it's not working any more after some more kernel rebuilding

13:44 <tomeu> I'm surprised though, because I did add instrumentation to make sure that the expected code paths were being hit

13:49 <tomeu> hmm, I think it could have been a reboot

13:49 <tomeu> that's an interesting possibility

13:50 <robmur01> note that even with "PAN_MESA_DEBUG=bifrost" it's going to bail out on G72, without additional hacks

13:52 <tomeu> right, I think that there's some configuration that happens with the meson PM backend that persists across module reloads

13:52 <mchehab> are there too many things missing for G72?

13:52 <tomeu> so if you load kbase first with it, then later the devicetree PM backend is enough

13:53 <tomeu> narmstrong: any ideas?

13:54 <tomeu> right, MALI_PLATFORM_POWER_DOWN_ONLY isn't needed at all apparently

13:56 <tomeu> guess that on meson, because there isn't a regulator then the power is never lost

13:56 <tomeu> so registers keep their previous values

13:57 rhyskidd has joined #panfrost

14:17 <tomeu> ok, confirmed that setting GPU_PWR_KEY and GPU_PWR_OVERRIDE1 is enough to get panfrost probing the shader cores correctly, without any other changes

14:17 <tomeu> but we still have the same instability as before

14:30 <robmur01> tomeu: does suppressing RPM via sysfs help? (I forget how far we've gone down that path before...)

14:30 <tomeu> just tried it, and no :/

14:30 <tomeu> and there's no devfreq

14:30 <tomeu> but when running glmark, it appears as if the first one or two jobs might be problematic, but the rest are fine

14:31 <robmur01> so I guess the question is what exact state this "warm up" behaviour is relative to?

14:32 <robmur01> if we rule out every possible way of anything getting powered off

14:35 <tomeu> robmur01: yeah, that's what I'm wondering now

14:36 <tomeu> afaics, the problem is that the GPU doesn't correctly read the cmdstream

14:36 <tomeu> reading random stuff instead

14:37 <tomeu> DATA_INVALID_FAULT is the most common fault, followed by TILE_RANGE_FAULT and then INSTR_INVALID_ENC

14:37 <tomeu> and after that, faults trying to read from 0x0000000000000000

14:37 <chewitt> timing? .. hardware needs to settle before you can use it

14:37 <chewitt> (random guessing)

14:38 <tomeu> yeah, but what part of the hw?

14:38 <tomeu> daniels suggested poisoned caches that need to be flushed

14:38 <robmur01> I think we had a suggestion before that something might be wonky with the caches

14:39 <tomeu> yeah, played quite a bit with sending cache invalidation commands, without luck

14:42 <narmstrong> tomeu: honestly, I have no idea what's going on... I would like to help, really

14:43 <narmstrong> tomeu: I'll try to answer to sky.zhou, but reading the backlog you did try to use platform devicetree without MALI_PLATFORM_POWER_DOWN_ONLY and it worked ?

14:43 <tomeu> narmstrong: yes, but only after having used kbase with the meson platform

14:44 <tomeu> I'm looking at hacking the devicetree platform with PWR_KEY and PWR_OVERRIDE1 to see if it makes it work right away after boot

14:44 <narmstrong> ok, so as you say the GPU_PWR_KEY and GPU_PWR_OVERRIDE1 are needed somehow

14:47 <tomeu> yep

14:47 <narmstrong> trying also on my side

14:48 <tomeu> thanks, guess understanding what those values mean could help

14:48 <tomeu> though it could be unrelated

14:50 <narmstrong> tomeu: confirmed, with only devicetree it locks up, but with only GPU_PWR_KEY and GPU_PWR_OVERRIDE1 a single time works fine

14:51 <narmstrong> tomeu: I suspect the GPU_PWR_OVERRIDE1 changes the default HW-based power control

14:54 <narmstrong> tomeu: https://termbin.com/bljd on our android kernel (with kbase integrated)

14:54 <tomeu> so the HW-based power control might be automatically powering up and down components that may require some warm-up?

14:54 <narmstrong> tomeu: yes, or maybe it disables auto powering down of some internal components

14:55 <tomeu> but then, what could kbase be doing so it doesn't have this problem...

14:55 <narmstrong> but I don't understand why panfrost doesn't require this for T820, but does for bifrost

14:55 <tomeu> btw, without the override, panfrost failed to put the shader cores up during startup

14:56 <narmstrong> for t820 ? but it still works later on

14:56 <tomeu> ah no, here on the n2

14:56 <narmstrong> ok

14:58 <tomeu> SHADER_READY_LO never changed to the expected value

15:20 stikonas_ is now known as stikonas

15:37 * alyssa hugs her rk3399

15:37 macc24 has quit [Quit: WeeChat 2.8]

15:37 guillaume_g has quit [Quit: Konversation terminated!]

15:49 * robmur01 glares at rk3530 being about a year late and counting

15:53 macc24 has joined #panfrost

15:55 macc24 has quit [Remote host closed the connection]

15:57 macc24 has joined #panfrost

15:58 macc24_ has joined #panfrost

15:58 macc24 has quit [Client Quit]

16:00 macc24_ has quit [Remote host closed the connection]

16:03 macc24 has joined #panfrost

16:08 <stikonas> it will probably take a while before proper rk3530 support will be upstream. rk3399 has been out for years and still some features only just reached upstream

16:10 <stikonas> (I can finally boot from M.2 (pcie) SSD without using sd or mmc cards with Uboot 2020.07)

16:13 <alyssa> stikonas: audio is still borked for rk3399 mainline, unless it was fixed in the last few versions

16:13 <alyssa> headphones/mic

16:14 <stikonas> oh, I haven't tried audio... on rockpro64 there were some changes to split dtbs depending on board revision though... Not sure about which chip you use...

16:14 <alyssa> chromebook

16:14 <alyssa> (gru/kevin)

16:17 <robmur01> as far as the SoC is concerned, "audio" stops at the I2S output - the rest is the board's fault ;)

16:24 <alyssa> bbrezillon: if we submit two batches that depend on each other, we can queue them via NEXT, I think

16:24 <alyssa> but I'm not sure how drm_sched interacts with that

16:25 <alyssa> I guess the idea is to modify the `dependency` callback to drm_sched to skip over the fence for the job slot we're queueing to

16:25 <alyssa> regardless that's transparent to userspace provided userspace specifies the right fences

16:25 <alyssa> and we can still provide FIFO behaviour to userspace

16:25 <alyssa> (at least, the appearance of FIFO behaviour)

16:30 <alyssa> the operative question is, do we have to continue tracking deps on batches that are already submitted?

16:30 <alyssa> with the current uabi, that's a no

16:30 <alyssa> if we want to do entirely explicit deps with no ordering guarantees in kernel, that's a yes

16:30 <alyssa> but maybe there's a happy medium

16:32 <bbrezillon> I think that's what I was aiming for initially, which partly explains the way it's implemented

16:32 <alyssa> right, yeah

16:33 <alyssa> I'm just trying to figure out if that's needed the way our hw is structured, or if ensuring the order we submit batches is topologically sorted is enough

16:33 <bbrezillon> the question is, are we good with BO-based dep tracking, or are there use cases which need explicit inter-job deps

16:33 <alyssa> IIRC explicit deps via syncobjs are needed for vk

16:33 <alyssa> so it's not like the api is going anywhere

16:34 <alyssa> but for GL, I'm just really struggling to think of a case where BO-based tracking doesn't cut it, even once we throw concurrent vt/frag jobs and *_NEXT registers into the mix

16:34 <bbrezillon> I meant for the gallium driver

16:35 <alyssa> so I guess rip it out, and in the off chance we need something more, we'll fix it then because inevitably it'll look different anyway so not much is lost?

16:37 <bbrezillon> indeed

16:38 <alyssa> ack

16:39 * alyssa tries to figure out the better data struct then

16:40 <alyssa> bbrezillon: reading about Rust is making me paranoid about getting the lifetimes right for this ;v

16:42 <bbrezillon> yes, you'll still need an object that lasts longer than the batch, for BO access tracking

16:43 <bbrezillon> maybe not actually

16:44 <alyssa> Give panfrost_batch a set of parent and children nodes, entirely weak references

16:45 <alyssa> when adding a dep, add it as a child and add yourself as parent to it

16:45 <bbrezillon> if we rely on BO to enforce inter-job dep, we can just remove the batch from the bo_access object when it's destroyed

16:45 <alyssa> when destroying a batch, remove yourself as parent from each of your children
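
The parent/children scheme alyssa describes above can be sketched roughly as follows. This is only an illustration in Python, not Mesa code: the `Batch` class and its `add_dep` and `destroy` methods are hypothetical names standing in for panfrost_batch and its helpers, and `weakref.WeakSet` is used here to model "entirely weak references" (the real driver is C and would manage this by hand).

```python
import weakref


class Batch:
    """Hypothetical stand-in for panfrost_batch (assumption, not the Mesa struct)."""

    def __init__(self, name):
        self.name = name
        # Both edge sets hold weak references only, so the dependency
        # graph never keeps another batch alive on its own.
        self.parents = weakref.WeakSet()
        self.children = weakref.WeakSet()

    def add_dep(self, dep):
        """Add `dep` as a child and register ourselves as its parent."""
        self.children.add(dep)
        dep.parents.add(self)

    def destroy(self):
        """Remove ourselves as parent from each of our children."""
        for child in list(self.children):
            child.parents.discard(self)
        self.children.clear()


# Example: a fragment batch that depends on a vertex/tiler batch.
frag = Batch("frag")
vertex = Batch("vertex")
frag.add_dep(vertex)
frag.destroy()  # vertex no longer lists frag as a parent
```

In C there is no automatic cleanup of dangling weak references, which is presumably why the explicit unlink step in destroy matters for getting the lifetimes right.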

17:43 mixfix41 has quit [Ping timeout: 258 seconds]

17:53 stikonas has quit [Remote host closed the connection]

18:09 <alyssa> bbrezillon: Might be artificial, but on an apitrace of chromium, I'm seeing a 10% reduction in frametime with the syncobj overhaul.

18:09 <alyssa> 7 files changed, 66 insertions(+), 195 deletions(-)

18:10 <alyssa> So no complaints here I think

18:46 <alyssa> bbrezillon: Annoyingly, drmSyncobjWait itself is eating up a lot of cpu time

18:46 <alyssa> like, even when it's only called when strictly necessary in panfrost_fence_finish

18:46 remexre has quit [Read error: Connection reset by peer]

18:47 <alyssa> drm_syncobj_wait_ioctl is not a speed demon

18:47 remexre has joined #panfrost

19:09 davidlt has quit [Ping timeout: 264 seconds]

19:13 stikonas has joined #panfrost

19:25 TheMojoMan has joined #panfrost

19:29 TheMojoMan has quit [Ping timeout: 240 seconds]

20:15 buzzmarshall has joined #panfrost

20:16 raster has quit [Quit: Gettin' stinky!]

20:19 cwabbott has quit [Quit: cwabbott]

20:19 cwabbott has joined #panfrost

20:44 raster has joined #panfrost

20:54 karolherbst has quit [Quit: duh 🐧]

20:58 karolherbst has joined #panfrost

21:12 cwabbott_ has joined #panfrost

21:14 cwabbott has quit [Ping timeout: 260 seconds]

21:14 cwabbott_ is now known as cwabbott

22:00 raster has quit [Quit: Gettin' stinky!]

22:49 cwabbott has quit [Quit: cwabbott]

22:49 cwabbott has joined #panfrost