#panfrost on 2019-11-27 — irc logs at freenode.irclog.whitequark.org

2019-09-06 11:20 alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature

00:17 warpme_ has quit [Quit: Connection closed for inactivity]

00:35 stikonas has quit [Ping timeout: 252 seconds]

01:24 NeuroScr has quit [Quit: NeuroScr]

01:37 NeuroScr has joined #panfrost

01:45 megi has quit [Ping timeout: 250 seconds]

01:51 marcodiego has joined #panfrost

02:01 NeuroScr has quit [Ping timeout: 268 seconds]

03:00 kaspter has quit [Ping timeout: 250 seconds]

03:00 kaspter has joined #panfrost

03:47 nerdboy has quit [Ping timeout: 265 seconds]

04:12 Lyude has quit [Read error: Connection reset by peer]

04:13 Lyude has joined #panfrost

04:32 davidlt has joined #panfrost

04:44 marcodiego has quit [Quit: Leaving]

05:52 NeuroScr has joined #panfrost

05:53 nerdboy has joined #panfrost

06:24 cowsay has joined #panfrost

06:26 cowsay_ has quit [Ping timeout: 245 seconds]

07:26 guillaume_g has joined #panfrost

07:35 yann|work has quit [Ping timeout: 276 seconds]

08:25 warpme_ has joined #panfrost

08:55 camus has joined #panfrost

08:55 kaspter has quit [Ping timeout: 240 seconds]

08:55 camus is now known as kaspter

09:03 stikonas has joined #panfrost

09:27 EmilKarlson has quit [Read error: Connection reset by peer]

09:27 TheCycoONE2 has quit [Write error: Connection reset by peer]

09:27 flacks has quit [Write error: Connection reset by peer]

09:27 thefloweringash has quit [Write error: Connection reset by peer]

09:27 eballetbo[m] has quit [Remote host closed the connection]

09:41 <tomeu> alyssa: looks awesome!

09:51 yann|work has joined #panfrost

09:55 ckeepax has quit [Quit: WeeChat 2.2]

09:58 megi has joined #panfrost

09:59 ckeepax has joined #panfrost

10:13 chewitt has quit [Quit: Zzz..]

10:18 flacks has joined #panfrost

10:18 eballetbo[m] has joined #panfrost

10:18 thefloweringash has joined #panfrost

10:18 EmilKarlson has joined #panfrost

10:18 TheCycoONE1 has joined #panfrost

10:26 raster has joined #panfrost

10:33 maccraft123 has joined #panfrost

11:09 maccraft has joined #panfrost

11:11 maccraft123 has quit [Ping timeout: 268 seconds]

11:25 chewitt has joined #panfrost

11:46 chewitt has quit [Quit: Zzz..]

12:04 chewitt has joined #panfrost

12:10 chewitt has quit [Quit: Zzz..]

12:28 chewitt has joined #panfrost

12:42 <alyssa> tomeu: ...and let's give it a swing.

12:56 <tomeu> alyssa: a few surprising faults in this job: https://lava.collabora.co.uk/scheduler/job/2071800

13:05 <alyssa> tomeu: Surprising how?

13:05 <alyssa> Very often tests that fail are tests that fault

13:06 <alyssa> e.g. the FBO tests

13:17 chewitt has quit [Quit: Zzz..]

13:18 chewitt has joined #panfrost

13:23 <tomeu> alyssa: when I enabled kernel messages the other day, no faults were visible

13:23 <tomeu> and I think an earlier run today showed just one gpu timeout

13:27 <alyssa> tomeu: We're running new tests because of the skip list, no..?

13:28 <tomeu> but I should have been running with the same skip list already

13:28 <alyssa> :|

13:29 <bbrezillon> robher: I think we have a problem with the gem_close() logic

13:30 <bbrezillon> userspace might close GEMs that are still referenced by the GPU

13:30 <bbrezillon> (because the job has been queued but not yet executed or is being executed)

13:31 <bbrezillon> in the panfrost_gem_close() function, we tear down the MMU mapping and release the mm_node

13:31 <bbrezillon> which leads to pagefaults
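
The lifetime problem described above can be illustrated with a minimal reference-counting sketch. This is not the panfrost kernel code and not a proposed fix: `example_mapping` and its helpers are hypothetical names, and the `mapped` flag merely stands in for the MMU mapping and mm_node. The assumption is that a queued job takes its own reference, so that closing the handle from userspace does not tear the mapping down while the GPU may still use it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a GPU mapping of a buffer object. */
struct example_mapping {
    int refcount; /* one for the userspace handle, plus one per in-flight job */
    bool mapped;  /* stands in for the MMU mapping and address-space node */
};

/* A freshly created mapping is owned by the userspace handle. */
static void example_mapping_init(struct example_mapping *m)
{
    m->refcount = 1;
    m->mapped = true;
}

/* Taken when a job that references the buffer is queued. */
static void example_mapping_get(struct example_mapping *m)
{
    assert(m->refcount > 0);
    m->refcount++;
}

/* Dropped on handle close and on job completion; the mapping is
 * torn down only when the last reference goes away. */
static void example_mapping_put(struct example_mapping *m)
{
    assert(m->refcount > 0);
    if (--m->refcount == 0)
        m->mapped = false;
}
```

With this scheme, a close that happens between job submission and job completion leaves `mapped` set until the job drops its reference.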

13:31 <bbrezillon> 2nd problem we have is with the GEM shrinker

13:32 <alyssa> tomeu: the good news is I have some code for you to debug ;)

13:32 <alyssa> lava-ci-small-tiling

13:32 <alyssa> It's rather crude but it should implement all the cases I know of.

13:33 <alyssa> No heuristic but should bring to parity with T860 tiler code.

13:33 <bbrezillon> AFAICT, drm_gem_shmem_is_purgeable() does not take into account the fact that the BO might still be used by the GPU, even though userspace marked it as purgeable

13:35 <tomeu> alyssa: nice! will integrate

13:36 <tomeu> alyssa: if we run tests that cause faults, then we should rerun failed tests individually at the end, otherwise random tests will randomly fail at random

13:37 <tomeu> because in the skips file we don't only have flip-flops, but more importantly tests that cause otherwise stable tests to flip-flop

13:37 <tomeu> with the number of failures that we have atm, rerunning tests should cost us only a couple of seconds

13:38 <tomeu> when we start running on gles3 it will be a different matter, but maybe we'll want to start with a massive skips file there

13:42 <alyssa> Meep.

13:44 <alyssa> tomeu: lmk if that branch, like, breaks everything

13:44 <alyssa> it is 100% untested as far as t720 goes ;p

13:44 <tomeu> alyssa: regarding "...We really need a quirks framework...", I thought we would be going with something ala MIDGARD_ADVANCED_TILING_UNIT for 720, 820 and 830

13:44 <alyssa> but I trust you'll figure out how to debug it :)

13:46 <tomeu> alyssa: want me to take the skips and tiling branches into my next MR?

13:46 <tomeu> as it's all interdependent

14:03 NeuroScr has quit [Ping timeout: 276 seconds]

14:14 fysa has joined #panfrost

14:15 fysa has quit [Remote host closed the connection]

14:17 <robmur01> bbrezillon: yeah, raster and I have been noticing that too - one thought was that the job might need to hold a reference on the AS, to prevent that being pulled out from underneath still-referenced BOs

14:21 <bbrezillon> robmur01: AFAICT you'd need more than that

14:25 <alyssa> tomeu: Sure

14:26 <alyssa> tomeu: Besides those, is there any upstreaming left?

14:26 <alyssa> (and CI?)

14:34 <tomeu> alyssa: I think that's all

14:35 <tomeu> alyssa: want me to add a MIDGARD_ADVANCED_TILING_UNIT quirk?

14:35 <tomeu> alyssa: looks like I should start debugging the tiling patch :)

14:38 <alyssa> tomeu: :)

14:39 <alyssa> If you want to add a screen->quirks field more broadly that covers both tiling and SFBD and eventually errata that could be done

14:39 <alyssa> Or if we want features/issue testing separately like the kernel

14:39 * alyssa shrugs

14:40 <tomeu> alyssa: I thought our quirks thing would cover all of that

14:41 <alyssa> tomeu: Sounds good
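
A single `quirks` field covering tiling, SFBD and errata, as discussed above, could look roughly like the sketch below. All names here (`example_screen`, `EXAMPLE_QUIRK_*`, `example_has_quirk`) are hypothetical and are not taken from Mesa; which GPUs would set which bit is deliberately left out, since the log does not settle it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical quirk bits, one per behaviour that differs between GPUs. */
enum example_quirk {
    EXAMPLE_QUIRK_TILING  = 1u << 0, /* tiler variant, cf. the tiling discussion */
    EXAMPLE_QUIRK_SFBD    = 1u << 1, /* framebuffer descriptor variant */
    EXAMPLE_QUIRK_ERRATUM = 1u << 2, /* placeholder for a hardware erratum */
};

/* Hypothetical screen object carrying the per-GPU quirk mask. */
struct example_screen {
    uint32_t quirks;
};

/* True only if every bit in `mask` is set on the screen. */
static bool example_has_quirk(const struct example_screen *screen, uint32_t mask)
{
    return (screen->quirks & mask) == mask;
}
```

Driver code would then branch on `example_has_quirk(screen, EXAMPLE_QUIRK_TILING)` rather than comparing GPU IDs at each call site.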

14:45 <tomeu> ack!

14:48 <alyssa> quack!

15:04 guillaume_g has quit [Quit: Konversation terminated!]

15:06 guillaume_g has joined #panfrost

15:14 chewitt has quit [Quit: Zzz..]

15:19 <tomeu> alyssa: is it expected that the polygon list size for 0x41 is just 0x200? the blob uses 0xff200

15:21 NeuroScr has joined #panfrost

15:23 <tomeu> hmm, maybe it was:

15:23 <tomeu> - unsigned raw = pan_tile_count(width, height, tw * 2, th * 2);

15:23 <tomeu> + unsigned raw = pan_tile_count(width, height, tw * 2, th * 2) * k;

15:23 <tomeu> this particular test works now, though we're using a size of 0x8200 instead of 0xff200

16:03 <tomeu> alyssa: this may pass CI: https://gitlab.freedesktop.org/tomeu/mesa/commit/d4e1debf59e171c3c9487d299bf96a731135f3a7

16:04 <tomeu> oops, not yet

16:04 <tomeu> we start getting this after a good while:

16:04 <tomeu> [ 148.068695] panfrost 1800000.gpu: js fault, js=1, status=OUT_OF_MEMORY, head=0x4860f80, tail=0x4860f80

16:04 <tomeu> maybe a too small polygon list? will continue tomorrow

16:28 maccraft123 has joined #panfrost

16:31 maccraft has quit [Ping timeout: 268 seconds]

16:33 maccraft123 has quit [Ping timeout: 250 seconds]

16:35 maccraft123 has joined #panfrost

16:53 guillaume_g has quit [Quit: Konversation terminated!]

17:16 maccraft123 has quit [Quit: WeeChat 2.6]

17:17 maccraft123 has joined #panfrost

17:19 fysa has joined #panfrost

17:43 ph5 has joined #panfrost

17:44 chewitt has joined #panfrost

17:58 raster has quit [Ping timeout: 250 seconds]

17:58 raster has joined #panfrost

18:00 maccraft123 has quit [Ping timeout: 250 seconds]

18:02 maccraft123 has joined #panfrost

19:24 fysa has quit [Remote host closed the connection]

19:43 BenG83 has joined #panfrost

19:49 anarsoul has quit [Ping timeout: 246 seconds]

19:49 BenG83 has quit [Remote host closed the connection]

19:49 kherbst is now known as karolherbst

19:55 BenG83 has joined #panfrost

19:58 anarsoul has joined #panfrost

20:09 phh has quit [*.net *.split]

20:11 davidlt has quit [Ping timeout: 246 seconds]

20:15 phh has joined #panfrost

20:17 ph5 has quit [Ping timeout: 265 seconds]

20:18 ph5 has joined #panfrost

20:25 BenG83 has quit [Remote host closed the connection]

20:30 <alyssa> tomeu: Let's see

20:31 <alyssa> tomeu: Ah yes, that was supposed to have a *k, typo! :(

20:58 stikonas has quit [Remote host closed the connection]

21:00 stikonas has joined #panfrost

21:04 raster has quit [Remote host closed the connection]

21:06 raster has joined #panfrost

21:07 raster has quit [Remote host closed the connection]

21:15 maccraft123 has quit [Quit: WeeChat 2.6]

21:29 <alyssa> tomeu: Definitely could be too small, yes

21:30 <alyssa> At 1920x1080, with 0x41, we expect 0xff200

21:30 <alyssa> (0x200 * ceil(1920 / (16 * 2)) * ceil(1080 / (16 * 2))) + 0x200 = 0xff200

21:31 <alyssa> If we're not calculating that it's a bug in pan_tiler.c
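
The arithmetic in the formula above can be checked with a small sketch. The helper names (`div_round_up`, `polygon_list_size`) and the parameter split are assumptions for illustration and are not the actual pan_tiler.c code; the constants 0x200 (per-tile factor and header) and the 16-pixel tile size are taken from the formula quoted in the log.

```c
#include <assert.h>

/* Ceiling division for unsigned integers. */
static unsigned div_round_up(unsigned n, unsigned d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/* Hypothetical helper mirroring the quoted formula:
 *   size = bytes_per_tile * ceil(w / (tw * 2)) * ceil(h / (th * 2)) + header
 */
static unsigned polygon_list_size(unsigned width, unsigned height,
                                  unsigned tile_w, unsigned tile_h,
                                  unsigned bytes_per_tile, unsigned header)
{
    unsigned tiles_x = div_round_up(width, tile_w * 2);
    unsigned tiles_y = div_round_up(height, tile_h * 2);

    return bytes_per_tile * tiles_x * tiles_y + header;
}
```

For 1920x1080 with 16x16 tiles, this gives 60 * 34 = 2040 tiles, and `polygon_list_size(1920, 1080, 16, 16, 0x200, 0x200)` evaluates to 0xff200, matching the size the blob uses.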

21:31 <alyssa> let's take a peek

21:35 <alyssa> With the *k fix we're generating 0xff200?

21:36 <alyssa> I feel much less confident about the body_offset formulas, though.

21:36 <alyssa> More data could be useful there.

21:36 maccraft123 has joined #panfrost

21:41 maccraft has joined #panfrost

21:45 maccraft123 has quit [Ping timeout: 268 seconds]

22:07 ph5 has quit [Quit: -_-]

22:21 cowsay has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]

23:31 megi has quit [Ping timeout: 252 seconds]

23:32 megi has joined #panfrost

23:56 avaf has joined #panfrost