#lima on 2019-10-28 — irc logs at freenode.irclog.whitequark.org

2019-07-03 10:24 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!

00:35 _whitelogger has joined #lima

00:48 warpme_ has quit [Quit: Connection closed for inactivity]

01:14 _whitelogger has joined #lima

03:52 dddddd has quit [Ping timeout: 240 seconds]

04:12 BenG83 has quit [Ping timeout: 240 seconds]

04:29 BenG83 has joined #lima

04:44 megi has quit [Ping timeout: 268 seconds]

04:53 Barada has joined #lima

04:55 Barada has quit [Client Quit]

05:02 Barada has joined #lima

05:08 BenG83 has quit [Ping timeout: 268 seconds]

06:05 BenG83 has joined #lima

06:24 Barada has quit [Quit: Barada]

06:42 Barada has joined #lima

06:46 Barada has quit [Client Quit]

06:53 maccraft123 has joined #lima

06:54 mastermart has joined #lima

06:57 <mastermart> now I have the shader skeletons with correct dependencies for both vertex and fragment programs for OpenGL 1.5. Since the spec does not always allow address registers in fragment programs, I made the fragment program slightly differently.

06:59 <mastermart> however, additions are needed, because I cannot come up with a way to communicate with command buffers from vertex programs; they have no method to communicate with them.

06:59 <mastermart> nothing exports to data cache it seems.

07:00 maccraft123 has quit [Ping timeout: 240 seconds]

07:04 BenG83 has quit [Read error: Connection reset by peer]

07:05 BenG83 has joined #lima

07:15 <mastermart> https://www.khronos.org/opengl/wiki/Post_Transform_Cache

07:15 <mastermart> actually there is something, however i have not looked into this.

07:28 wiewo has quit [Ping timeout: 276 seconds]

07:29 <mastermart> the rationale behind this is: when you have millions of calls in the CP, the CPU starts to bottleneck, even when scheduling on the CPU is good, because accessing the chip's VRAM is slow.

07:30 <mastermart> hence you want GPU to feed commands to the CP.

07:30 wiewo has joined #lima

07:30 <mastermart> and more particularly Shader to do that.

07:38 kaspter has quit [Quit: kaspter]

07:39 kaspter has joined #lima

07:57 camus has joined #lima

07:58 kaspter has quit [Ping timeout: 240 seconds]

07:58 camus is now known as kaspter

08:11 camus has joined #lima

08:12 kaspter has quit [Ping timeout: 268 seconds]

08:12 camus is now known as kaspter

08:12 <mastermart> so it is going to be slightly more work to add; command buffers are defined for physical device memory views that can be cached in the data cache as needed.

08:17 <mastermart> https://gitlab.freedesktop.org/mesa/mesa/blob/8ab111664a20ae8e833a1dee3eb02f3825627b15/src/intel/vulkan/anv_private.h example of the intel driver using data cache for physical device.

08:18 yann has quit [Ping timeout: 240 seconds]

08:23 <mastermart> I do not know how accessible T&L is as data cache yet, but yeah, normally all chips differentiate or separate the last vertex writes from the middle/preceding ones.

08:23 <mastermart> in the vertex shader or program.

08:23 <mastermart> i have to go now.

08:23 mastermart has quit [Remote host closed the connection]

08:24 warpme_ has joined #lima

08:27 BenG83 has quit [Read error: Connection reset by peer]

08:28 BenG83 has joined #lima

09:24 BenG83 has quit [Read error: Connection reset by peer]

09:24 yann|work has joined #lima

09:50 megi has joined #lima

10:36 maccraft123 has joined #lima

10:49 maccraft123 has quit [Quit: WeeChat 2.6]

11:04 yann|work is now known as yann

12:31 yann has quit [Ping timeout: 268 seconds]

12:45 yann has joined #lima

12:54 dddddd has joined #lima

13:04 megi has quit [Ping timeout: 245 seconds]

13:35 chewitt has joined #lima

13:49 maccraft123 has joined #lima

14:39 maccraft123 has quit [Quit: WeeChat 2.6]

15:22 Tofe has left #lima [#lima]

15:22 Tofe has joined #lima

15:33 maccraft123 has joined #lima

16:12 paulk-leonov has quit [Ping timeout: 245 seconds]

16:17 paulk-leonov has joined #lima

16:31 maccraft123 has quit [Ping timeout: 245 seconds]

16:55 monstr has joined #lima

16:56 megi has joined #lima

17:19 yann has quit [Ping timeout: 276 seconds]

17:27 maccraft123 has joined #lima

18:02 enunes has quit [Quit: ZNC 1.7.2 - https://znc.in]

18:22 monstr has quit [Remote host closed the connection]

18:30 enunes has joined #lima

18:39 <anarsoul> enunes: rellla: any ideas how to debug random failures when running q3a in loop?

18:40 <anarsoul> recording apitrace is not feasible since it can take 30-60mins for issue to reproduce

18:40 <enunes> anarsoul: I found two more leaks while debugging the glmark2 stuff

18:41 <enunes> will send MR

18:41 <enunes> but I think there is 1 more

18:41 <enunes> at least with refract

18:41 <anarsoul> great

18:41 <anarsoul> I suspect we also free some BOs early somewhere

18:42 <anarsoul> but I'm not sure how to debug it other than eyeballing the code

18:42 <enunes> are you monitoring memory consumption while you reproduce those, to see if you might be running out for some random allocation?

18:44 <anarsoul> nope

18:44 <anarsoul> but I don't get any allocation failures

18:51 yann has joined #lima

19:06 mastermart has joined #lima

19:16 <mastermart> mareko and nha once explained it. The rasterizer should post the new interpolated varyings every time the succeeding stage does its final exports; it then drives the bus with new ones. This is all logged on dri-devel.

19:18 <mastermart> and 400MP is specialized hardware: it has separate vertex and fragment cores, and even registers are given separately from the 1024 128-bit ones in the regfile.

19:19 <mastermart> which means the fragment processor can actually also be actively feeding the CP, or, so to speak, be made to do so.

19:22 <mastermart> but yeah, anyway, I already gave the VHDL code of the rasterizer, a full and lengthy one.

19:25 <mastermart> it was programmed in korea, by KAIST people...Korea Advanced Institute of Science and Technology.

19:30 <mastermart> I think people should bear in mind that, AFAIK, texture memory loads are always vector (as mandated by the ARB fragment program spec), so a dependent read on a vectored load is very easy to formulate.

19:36 armessia has joined #lima

19:42 <mastermart> You do very big round trips, coding very big hunks that never behave intelligently for performance; on top of that, you know nothing about the hw needed to run things properly, and aside from copying everything your code needs from the internet, you also lack any kind of saner creativity, which is to say no independent thinking capability in the form of a critical eye,

19:42 <mastermart> people who only follow all commands blindly are dumb in my book.

19:52 <mastermart> you seem to fall for every trolling and scam where the vendors deliberately expect you to. very stupid guys, so to speak, very arrogant on top, violative for a reason, cause maybe you are not that bad, you are just purest dummies.

19:56 <mastermart> some time ago google informed the computing enthusiasts, or the community of such, that they developed a quantum chip, and already on my facebook i get feeds of how to do machine learning on cortex-M processors, so because they were pushed into the corner, visually or virtually, by google, IBM engineers finally told that current computers are powerful too, more

19:56 <mastermart> powerful than a linux freak could naturally anticipate.

20:01 <mastermart> So one of the uses of tri-state signalling is in busy_decoder_table.v: the incoming data is of the SIPO kind, and the tri-state ternary will reset the assignment to 0 if the previous data gets old, and hence still be able to block the vector register that is current

20:12 afaerber has quit [Quit: Leaving]

20:18 <mastermart> which leads me to thinking: do you also know how circular buffers work, or at least linked lists, the other famous way alyssa seemed to have read about? what is the goal of linked lists?

20:19 <anarsoul> enunes: these are all regular memory leaks, not BO leaks

20:19 <enunes> anarsoul: I know, though I wondered if it was because of them that running in a loop was failing

20:19 <mastermart> cause wikipedia again has a good article about circular buffers and how they work; there is a channel some number of bytes long, sometimes also called a ring

20:19 afaerber has joined #lima

20:19 <enunes> and I was looking for bo leaks and found those anyway

20:20 <mastermart> and when you post data to that ring, you have to tell the tail pointer to increment; this is a producer-consumer technique

20:20 <mastermart> the producer is the CPU and the consumer is the GPU

20:21 <enunes> valgrind doesn't report anything other than those

20:21 <mastermart> when the channel is full it wraps the pointers around to the beginning of the ring, when the pointers are equal

20:22 <mastermart> wikipedia explains everything needed there also like in the case of linked lists
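
The producer/consumer ring described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from lima or any real command processor: `struct ring`, `RING_SIZE`, and the `ring_*` helpers are hypothetical names, and the convention (the producer advances `tail`, the consumer advances `head`, one slot stays free so that "full" and "empty" can be told apart) is one common choice among several.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size; one slot is always left unused */

struct ring {
    uint32_t slots[RING_SIZE];
    size_t head; /* next slot the consumer (e.g. the GPU) will read */
    size_t tail; /* next slot the producer (e.g. the CPU) will write */
};

/* Empty when both indices are equal: nothing left to consume. */
static bool ring_empty(const struct ring *r)
{
    return r->head == r->tail;
}

/* Full when advancing tail by one would collide with head. */
static bool ring_full(const struct ring *r)
{
    return (r->tail + 1) % RING_SIZE == r->head;
}

/* Producer side: store a command, then advance tail with wrap-around. */
static bool ring_push(struct ring *r, uint32_t cmd)
{
    if (ring_full(r))
        return false;
    r->slots[r->tail] = cmd;
    r->tail = (r->tail + 1) % RING_SIZE;
    return true;
}

/* Consumer side: fetch a command, then advance head with wrap-around. */
static bool ring_pop(struct ring *r, uint32_t *cmd)
{
    if (ring_empty(r))
        return false;
    *cmd = r->slots[r->head];
    r->head = (r->head + 1) % RING_SIZE;
    return true;
}
```

A zero-initialized `struct ring` starts out empty. Keeping one slot free trades a little capacity for a simple full/empty test; a separate element count would work as well.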

20:24 <anarsoul> enunes: I don't think that valgrind knows how we allocate BOs

20:25 <anarsoul> IIRC we need to do some annotations to teach it

20:26 <enunes> I mean, I want to get other leaks out of the way so we can be sure that we really have bo leaks

20:26 <anarsoul> see vc4_bufmgr.c
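
The kind of annotation being discussed can be sketched roughly as follows. Memcheck's leak checker only counts heap blocks it knows about, so memory obtained through `mmap` is not reported as leaked unless it is marked with client requests. `VALGRIND_MALLOCLIKE_BLOCK` and `VALGRIND_FREELIKE_BLOCK` are real macros from `valgrind.h`; everything else here is an assumption for illustration: `struct fake_bo` and its helpers are hypothetical, the anonymous `mmap` merely stands in for mapping a real buffer object, and the `HAVE_VALGRIND` guard assumes the build system defines that macro when the Valgrind headers are available. It is not the lima or vc4 code.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS in strict C modes */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#ifdef HAVE_VALGRIND
#include <valgrind/valgrind.h>
#define VG(x) x
#else
#define VG(x) /* compiles to nothing without Valgrind headers */
#endif

/* Hypothetical stand-in for a driver buffer object. */
struct fake_bo {
    void *map;
    size_t size;
};

static bool fake_bo_map(struct fake_bo *bo, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return false;
    bo->map = p;
    bo->size = size;
    /* Tell memcheck to treat the mapping like a malloc'ed block
     * (no red zone, contents not considered zero-initialized). */
    VG(VALGRIND_MALLOCLIKE_BLOCK(bo->map, bo->size, 0, 0));
    return true;
}

static void fake_bo_unmap(struct fake_bo *bo)
{
    assert(bo->map != NULL);
    /* Matching "free" so the block is not reported as leaked. */
    VG(VALGRIND_FREELIKE_BLOCK(bo->map, 0));
    munmap(bo->map, bo->size);
    bo->map = NULL;
    bo->size = 0;
}
```

With annotations like these in place, a mapping that is never released would show up in the leak summary together with the stack that created it.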

20:33 <mastermart> you want to ban me and ignore me and shit like this, but you gotta learn how to program first, stop this autist crap and think a bit about how to maintain a proper stack; if you are not going to learn this art, you will receive criticism in greater amounts soon.

20:34 <mastermart> so ringbuffer has start and end pointers and head and tail pointers, four pointers.

20:36 <mastermart> all of them can be programmable but if they are not, well on specialized fragment processor you can have the texture as the size of the ring.

20:36 <mastermart> and overlap it and feed the commands from there.

20:36 <mastermart> if the vertex buffers have limits with instancing enabled, then the ring can be larger by default and is somewhat fixed.

20:38 <enunes> anarsoul: yeah ok porting those annotations gives a lot more reports

20:38 <anarsoul> :)

20:40 <enunes> though it requires more time to go through all this, and I don't know how much cleanup we do at the end to classify these as real bugs

20:40 <enunes> I guess I can just look at whatever is the buffer buffer reported with a few runs

20:40 <enunes> bigger buffer

20:41 <anarsoul> I think the biggest for glmark2 would be ctx buffer

20:41 <anarsoul> it's 1mb

21:25 <mastermart> https://kernel.googlesource.com/pub/scm/linux/kernel/git/jic23/parrot/+/master%5E%21/

21:38 <mastermart> sunxi kernel drm kernel is still being processed, i have not found the sweetspots there

21:40 <anarsoul> enunes: can you review https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2164 ?

21:58 <mastermart> really that parrot code is not very good for scheduling jobs either.

21:59 <enunes> anarsoul: I can review it, but later

21:59 <mastermart> very bulky, highly pointless.

21:59 <anarsoul> OK, thanks

22:01 <mastermart> a lot of the kernel is nowadays bloated, all do it very wrong, basically a flood of shit.

22:03 <anarsoul> it looks good to me, just want to get second opinion on compiler change

22:05 maccraft123 has quit [Ping timeout: 245 seconds]

22:06 <mastermart> I am going to be pissed off; neither my web browser nor my eyes can tolerate such crap. Kernel plumbing, just like any other code, needs to be performant, and performant code is short.

22:07 <mastermart> so most of this crap spanning around millions of lines needs to be removed from the kernel.

22:28 mastermart has quit [Remote host closed the connection]

23:05 armessia has quit [Quit: Leaving]