#lima on 2020-02-25 — irc logs at freenode.irclog.whitequark.org

2019-07-03 10:24 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!

01:02 yuq825 has joined #lima

03:26 megi has quit [Ping timeout: 258 seconds]

03:51 niceplaces has quit [Ping timeout: 260 seconds]

03:51 minicom has quit [Ping timeout: 260 seconds]

03:51 minicom1 has joined #lima

03:51 niceplace has joined #lima

03:53 bshah has quit [Ping timeout: 260 seconds]

03:54 <anarsoul> yuq825: so looks like we're PP bound in most cases?

03:55 bshah has joined #lima

04:19 buzzmarshall has quit [Remote host closed the connection]

04:35 _whitelogger has joined #lima

04:39 chewitt has quit [Quit: Zzz..]

05:33 Barada has joined #lima

05:39 Barada has quit [Quit: Barada]

05:41 gcl_ has quit [Ping timeout: 240 seconds]

05:41 gcl has joined #lima

05:45 <MoeIcenowy> anarsoul: personally I think so

06:02 chewitt has joined #lima

06:03 chewitt has quit [Client Quit]

06:09 <yuq825> yeah, pp GPU time is obviously too long

06:09 chewitt has joined #lima

06:11 <yuq825> time between two gp task submits is ~70ms, gp+pp task GPU time is ~50ms

06:12 <anarsoul> yuq825: likely because it takes some time to generate cmd stream?

06:12 <yuq825> you mean the 20ms CPU time?

06:13 chewitt has quit [Client Quit]

06:13 <yuq825> 50ms is pure GPU time

06:14 <anarsoul> yuq825: yeah
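[editor's note] The arithmetic behind the numbers quoted above (~70ms between gp submits, ~50ms of gp+pp GPU time) can be sketched as below. The function names are hypothetical, and the figures are the approximate values from the chat, not new measurements.

```c
/* Hypothetical helpers; the inputs are the rough figures quoted in the log. */

/* Part of the submit-to-submit interval that is not covered by GPU work. */
static double cpu_gap_ms(double submit_interval_ms, double gpu_ms)
{
    return submit_interval_ms - gpu_ms;
}

/* Fraction of the submit-to-submit interval spent in GPU work. */
static double gpu_share(double submit_interval_ms, double gpu_ms)
{
    return gpu_ms / submit_interval_ms;
}
```

With 70 and 50 as inputs this gives the 20ms CPU gap mentioned in the chat and a GPU share of roughly 0.71, which is consistent with the "PP bound" reading.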

06:14 <anarsoul> flamegraph of ioq3 shows that lima_bo_wait() is pretty expensive

06:15 <anarsoul> it's 21% of time spent in lima_draw_vbo()

06:15 <anarsoul> also lima_job_add_bo() itself is expensive

06:17 <anarsoul> https://gitlab.freedesktop.org/snippets/892

06:20 <anarsoul> yuq825: I think using "util_dynarray_grow" in lima_job_add_bo() likely makes it slow

06:20 <yuq825> lima_bo_wait() waits for the task to finish; if the task is slow (GPU time), that's expected

06:21 <anarsoul> yuq825: it uses zero timeout, so it should return immediately and indicate whether BO is busy or not

06:21 <anarsoul> it's called from lima_bo_create()

06:21 <yuq825> ok
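[editor's note] The "zero timeout" behaviour described above (return immediately and report busy or not, instead of blocking) can be illustrated with an analogous pattern. This is only a sketch using POSIX poll() on a pipe as a stand-in; it is not lima's wait path, and fd_ready_now is a hypothetical name.

```c
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Returns true if fd is readable right now.
 * A timeout of 0 makes poll() return immediately, so this only
 * queries the current state and never blocks the caller.
 */
static bool fd_ready_now(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };

    return poll(&p, 1, 0) > 0 && (p.revents & POLLIN) != 0;
}
```

Even a non-blocking query like this is still a system call, so calling it very often has a cost of its own.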

06:23 <yuq825> giving the bo array a bigger init size, or building a cache for lima_job, may work

06:23 <anarsoul> yeah

06:24 <yuq825> maybe add some tracepoints to check lima_bo_wait()

06:28 <yuq825> oh, your flamegraph already has kernel time

06:30 <anarsoul> yes

06:31 <anarsoul> I'll send an MR that preallocates dynarrays for job BOs in a few mins

06:48 monstr has joined #lima

06:48 dddddd has quit [Ping timeout: 240 seconds]

06:52 <anarsoul> rellla: please collect tags for 3884 and merge it

07:18 Elpaulo has quit [Read error: Connection reset by peer]

07:19 Elpaulo has joined #lima

07:42 <anarsoul> yuq825: I think it spends most of time in util_dynarray_foreach()

07:43 <anarsoul> so preallocating dynarrays doesn't help

07:49 <anarsoul> guess we need another hash table to check whether BO is already in a job?
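[editor's note] The idea floated above, replacing a linear scan over the job's BO array with a hash lookup, can be sketched as a small open-addressed set keyed by BO handle. Everything here (struct bo_set, bo_set_add, the fixed capacity) is hypothetical and is not lima's code; it also assumes handle 0 is never a valid BO handle so that 0 can mark an empty slot. In-tree, Mesa's own util set and hash table helpers would be the more natural choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-job set of BO handles; 0 marks an empty slot. */
struct bo_set {
    uint32_t *slots;
    size_t capacity;   /* must be a power of two */
    size_t count;
};

static size_t bo_hash(uint32_t handle, size_t capacity)
{
    /* Multiplicative hash, masked down to the table size. */
    return (size_t)(handle * 2654435761u) & (capacity - 1);
}

static bool bo_set_init(struct bo_set *s, size_t capacity)
{
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    s->slots = calloc(capacity, sizeof(*s->slots));
    s->capacity = capacity;
    s->count = 0;
    return s->slots != NULL;
}

/*
 * Returns 1 if the handle was newly added, 0 if it was already in
 * the set, and -1 if the table is full (this sketch does not resize).
 * One slot is always kept empty so the probe loop terminates.
 */
static int bo_set_add(struct bo_set *s, uint32_t handle)
{
    size_t i = bo_hash(handle, s->capacity);

    assert(handle != 0);
    while (s->slots[i] != 0) {
        if (s->slots[i] == handle)
            return 0;
        i = (i + 1) & (s->capacity - 1);   /* linear probing */
    }
    if (s->count + 1 >= s->capacity)
        return -1;
    s->slots[i] = handle;
    s->count++;
    return 1;
}
```

A caller would append a BO to the job's array only when bo_set_add() returns 1, so the duplicate check no longer walks the whole array on every add.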

07:51 <rellla> anarsoul: done

07:52 <anarsoul> rellla: thanks!

07:52 <rellla> btw, i noticed that a full deqp run is terribly slow with lima compared to the blob :)

07:53 <rellla> i have not measured anything but i'm feeling blob is ~10 times faster ...

08:00 yuq825 has quit [Remote host closed the connection]

08:00 yuq825 has joined #lima

08:03 <anarsoul> rellla: do you compile lima with debug?

08:03 <anarsoul> if yes, it does a ton of extra validations

08:03 <rellla> anarsoul: :/

08:03 <rellla> probably :)

08:09 yann has quit [Ping timeout: 265 seconds]

08:41 _whitelogger has joined #lima

08:43 <anarsoul> beside that lima_update_textures() is expensive

08:44 <anarsoul> with lima_texture_desc_set_res() taking 1/3 time

09:00 Elpaulo has quit [Quit: Elpaulo]

09:04 yann has joined #lima

09:45 <MoeIcenowy> anarsoul: should I make an MR to expose derivatives?

09:50 <enunes> I have still been working in ppir scheduler and instruction combine optimizations fyi, hopefully that helps pp execution time

10:01 minicom1 is now known as minicom

10:21 <MoeIcenowy> WHAT? enabling derivatives doesn't lead to CI failure

10:21 <rellla> MoeIcenowy: should it?

10:22 <MoeIcenowy> I think it's exposing new feature

10:23 <rellla> "PIPE_CAP_TGSI_FS_FINE_DERIVATIVE: Whether the fragment shader supports

10:23 <rellla> the FINE versions of DDX/DDY."

10:23 <rellla> what are the "FINE" versions?

10:33 megi has joined #lima

10:41 <plaes> rellla: https://fgiesen.wordpress.com/2011/07/10/a-trip-through-the-graphics-pipeline-2011-part-8/#comment-1990

10:42 <rellla> plaes: yeah, found that. does mali support that?

10:42 <plaes> no clue :(

10:43 <plaes> but it should be possible to create a test for that? :P

10:53 <rellla> i think, we need piglit or a selfmade test to test this, because it's not in the dEQP-GLES2 mustpass series. dFdx and dFdy are shading language 3.00+

11:00 <plaes> piglit stuff - https://github.com/mesa3d/piglit/tree/master/tests/spec/arb_derivative_control/execution

11:02 <MoeIcenowy> rellla: it's an ext in 1.0

11:12 <rellla> MoeIcenowy: but we don't have a deqp test for it, do we?

11:40 <MoeIcenowy> I don't know

12:08 <rellla> anarsoul: i disabled debug build of deqp and now it's much faster :p

13:07 dddddd has joined #lima

13:37 gcl has quit [Ping timeout: 258 seconds]

13:39 gcl has joined #lima

13:49 yuq825 has quit [Quit: Leaving.]

15:01 monstr has quit [Remote host closed the connection]

15:41 <rellla> finally got ETC1 fixed

16:39 megi has quit [Ping timeout: 260 seconds]

16:45 <anarsoul> rellla: great!

16:45 <anarsoul> MoeIcenowy: yeah, go ahead

16:46 <anarsoul> I don't see why derivatives shouldn't be exposed

17:20 yann has quit [Ping timeout: 255 seconds]

17:53 yann has joined #lima

18:02 yann has quit [Ping timeout: 255 seconds]

18:09 megi has joined #lima

18:54 gcl_ has joined #lima

18:56 gcl has quit [Ping timeout: 240 seconds]

18:56 yann has joined #lima

21:34 buzzmarshall has joined #lima

22:08 <anarsoul> enunes: btw if you're doing PP rework please make several incremental MRs if possible

22:09 <anarsoul> otherwise it would be hard to review it

22:09 <enunes> yes definitely

23:36 enunes has quit [Ping timeout: 240 seconds]

23:49 enunes has joined #lima