#lima on 2019-10-04 — irc logs at freenode.irclog.whitequark.org

2019-07-03 10:24 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!

00:00 <mardikene193> maybe someone starts to criticize like me, it's a wasted talent and wasted time, different angle being as long as one lives it's ok, psychology of humans is hard, and strategy between us is sometimes cumbersome, how some will be let through while others suffer

00:09 <mardikene193> for the time being i've been injured for 20 years, receiving extra criticism for something i did not cause or arrange myself

00:09 <mardikene193> leak is a major one though my body can compensate for the injury enough to stay alive

00:11 <mardikene193> I have treated this as a handicap show-off, which is similar to a freak circus, where truth can not be easily found since those freaks start to mislead people strategically away from what really happened

00:15 <mardikene193> to play lots of opportunities to someone who is inherently incapable of taking them seems dumb to me, while torturing someone who could

00:20 Da_Coynul has joined #lima

00:23 <mardikene193> my uncle says my plan X being injured is better than what most have, subjective still, maybe some just lack the need to achieve anything. one guy said people generate thoughts and substances to stay lazy on purpose sometimes.

00:24 <mardikene193> fears and stuff like you have being afraid of something which in my case should not be even justified, i am not the one who has violated.

00:28 Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]

00:28 <mardikene193> libv: motivation you have talked about is a psychological term or symptom, i have learned psychology to some degree, for some reason you sometimes start to re-evaluate and evaluate more of your stuff when you get issues in some areas

00:28 <mardikene193> which you can not even resolve on your own, while you stay totally unmotivated when those issues were not yet present

00:30 Da_Coynul has joined #lima

00:30 <mardikene193> in other words, it's somewhat human nature tending to want something you can not have as opposed to appreciating the ones you have

00:31 <mardikene193> injury to me was kicking in some thoughts to appreciate what i have more

00:36 <mardikene193> and actually processors circuit seems even nicer :), you see probably cause those structures were blocked by patent instruction queues or windows indexed by program counter and branchable to

00:37 <mardikene193> then still the GPU way of doing it works too, they probably showed off that they do it conceptually somewhat differently which they also did, and that one works great too

00:39 <mardikene193> someone talked about RISC-V, this is a cross mutation between a VLIW gpu and an average processor

00:39 <mardikene193> some things work like on gpu while others like on CPU like instruction windows, as the name says it is RISC instead of elbrus CISC , but similar to the last the most

00:40 <mardikene193> yeah RISC-V seems pretty good , i would be capable of doing similar hw, but this one looks good too

00:45 <mardikene193> while on TeraScale each SIMD from a CU can execute out-of-order, elbrus cpu solved things within a bundle in a similar way

00:46 <mardikene193> I am fairly sure that on Hwacha and Xhwacha, the recent risc-v vector one

00:46 <mardikene193> does solve the VLIW stalls in some way too

00:48 <mardikene193> as it is pretty common practice on RISC machines to have loop buffers for different iterations of loop pipelining

00:53 Da_Coynul has quit [Quit: Textual IRC Client: www.textualapp.com]

01:23 <mardikene193> hmm

01:23 <mardikene193> Although RISC-V was not designed as a base for a pure VLIW machine,

01:23 <mardikene193> VLIW encodings can be added as extensions using several alternative

01:23 <mardikene193> approaches. In all cases, the base 32-bit encoding has to be supported

01:23 <mardikene193> to allow use of any standard software tools.

01:29 embed-3d_ has quit [Remote host closed the connection]

01:30 <mardikene193> MALI gpus should be out-of-order capable too, the core is what cwabbott calls barreled, and the patents too

01:30 embed-3d has joined #lima

01:30 <mardikene193> it is a very deeply pipelined machine as far as i understand

01:31 <mardikene193> pipelines are async taking 1/128 cycles each, i.e small cycles

01:31 <mardikene193> again one of the sane ways to do it.

01:38 <mardikene193> https://en.wikipedia.org/wiki/Barrel_processor

02:11 megi has quit [Ping timeout: 240 seconds]

03:16 jrmuizel has joined #lima

03:23 jrmuizel has quit [Remote host closed the connection]

03:28 niceplace has quit [Quit: ZNC 1.7.3 - https://znc.in]

03:28 niceplace has joined #lima

03:50 _whitelogger has joined #lima

03:57 dddddd has quit [Remote host closed the connection]

03:59 <MoeIcenowy> anarsoul: should we add a debug option to always flush after drawing?

03:59 <MoeIcenowy> I found vc4 has one

06:03 <mardikene193> http://www.cse.scu.edu/~mwang2/verification/SystemVerilog.pdf pseudo code of verilog scheduling continuous assignments are in active and reactive region

06:03 <mardikene193> https://www.verificationguide.com/p/systemverilog-scheduling-semantics.html

06:57 <mardikene193> so just in case, once again: https://github.com/VerticalResearchGroup/miaow/blob/master/src/verilog/rtl/wavepool/scbd_feeder.v valid_wf=0 issue_vacant=1 then feed_valid=0 since the bitwise AND of sb_candidates=valid_wf & hungry does not hit, next off issue_vacant=2; hungry=1 feed_valid=0, 011 & 001 in hungry & valid_wf will hit as 1

07:00 <mardikene193> so hence when you issued two first instructions or anywhere on the line two instructions , since the first never will be valid

07:00 <mardikene193> it changes to the last
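
The masking arithmetic in the 06:57 message can be sketched in Python. This is only an illustration of the bitwise AND, not the MIAOW RTL: the function names `candidates` and `feed_valid` are hypothetical helpers, the 3-bit width is an assumption, and `valid_wf` and `hungry` are treated as plain bit masks.

```python
def candidates(valid_wf: int, hungry: int) -> int:
    """Return the slots that are both valid and hungry (bitwise AND)."""
    return valid_wf & hungry


def feed_valid(valid_wf: int, hungry: int) -> bool:
    """Hypothetical helper: a feed is possible only if a candidate bit is set."""
    return candidates(valid_wf, hungry) != 0


# No valid wavefront: the AND is zero whatever `hungry` holds.
print(feed_valid(0b000, 0b001))       # False

# The 011 & 001 case from the message: only bit 0 survives the AND.
print(bin(candidates(0b011, 0b001)))  # 0b1
```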

07:07 <mardikene193> that is the queued computation, but on full length of the pipeline it always feeds a wfid but when the fetch queues are full, since it increments the fetch id with one when you have fetch buffer full and it momentarily matches the bitwise hence

07:12 <mardikene193> so hence you have gotta issue two instructions from the wfid row/line, but only if you issue them in program order any one of the pairs only then will the column be switched

07:14 <mardikene193> this is because whatever you do, the chip counts that on full length of the pipeline when fetch and decode stages are included

07:14 <mardikene193> the next instruction won't be there when you issue the first one

07:19 <mardikene193> it is because the fetch&decode pipe is so deep that even the longest latency operation had already finished before new instructions are fetched

07:22 <mardikene193> so to sum it up, even if you were to get incoherent replacements in the queues with long latency ops, once you call the loads clamped and divs with zero operands it has to work out, cause issue is a faster pipeline than fetch & decode

07:23 <mardikene193> so when pinning the functional unit to queues, make sure you run the variable latency ops with the smallest latency, determined by dummy operands which do not generate carries and borrows, and memory loads clamped

07:35 <mardikene193> you can be entirely sure that new instructions are not there when on startup you allocate an untiled buffer

07:40 <mardikene193> every instruction then takes so long to fetch, that all dependencies would be already issued and column switched to, hence you'd have fantastic round robin placement to the queues

08:47 <mardikene193> uuh, maybe i had read something wrong, couple of hours doing the scbd_feeder.v simulations and i can not get the behavior i described.

08:51 niceplace has quit [Read error: Connection reset by peer]

08:52 jonkerj has quit [Quit: Lost terminal]

08:53 niceplace has joined #lima

09:40 jonkerj has joined #lima

09:47 niceplace has quit [Quit: ZNC 1.7.3 - https://znc.in]

09:49 niceplace has joined #lima

09:49 marvs has quit [Ping timeout: 246 seconds]

09:50 marvs has joined #lima

09:55 megi has joined #lima

09:55 niceplaces has joined #lima

09:56 niceplace has quit [Ping timeout: 276 seconds]

09:58 niceplaces has quit [Read error: Connection reset by peer]

09:59 niceplace has joined #lima

10:00 niceplace has quit [Read error: Connection reset by peer]

10:02 niceplace has joined #lima

10:03 niceplace has quit [Read error: Connection reset by peer]

10:06 niceplace has joined #lima

10:10 niceplace has quit [Read error: Connection reset by peer]

10:11 niceplace has joined #lima

10:26 <mardikene193> i got the right results finally, it turned out that on the continuous assignment you need to click on the next_hungry assignment so it would not show the reset value

10:26 <mardikene193> https://www.edaplayground.com/x/3c6h

10:28 <mardikene193> when you click on this at the end of the timeline it will show the feed_wfid being 1 as it is supposed to be

10:37 <mardikene193> this looks about correct, based off this slight bisection from miaow you can for insurance do some other tests, I myself need to go, i managed to read it correctly from the beginning so to speak without even doing simulation on top of this file

10:40 <mardikene193> but be aware that in simulation they have truth tables for doing things over X and Z in the logic, however in synthesized hw you do not get those values at all, only 1 and zero, this is why i had to init all the variables to zero
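
The simulator truth tables mentioned above can be illustrated with a small four-state AND in Python. This is a sketch of the standard Verilog four-valued AND table only; the helper name `and4` and the string encoding of the states are assumptions made for illustration, and nothing here comes from the MIAOW sources.

```python
def and4(a: str, b: str) -> str:
    """Four-state AND over '0', '1', 'x', 'z' (an input z behaves like x)."""
    if a == "0" or b == "0":
        return "0"   # a known 0 forces the result to 0
    if a == "1" and b == "1":
        return "1"   # both inputs known and high
    return "x"       # any remaining x or z makes the result unknown


# Print the full table, one row per left-hand operand.
for a in "01xz":
    print(a, [and4(a, b) for b in "01xz"])
```

Initializing every variable to zero, as described in the message, keeps a simulation on the known `0`/`1` rows of such a table.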

10:45 <mardikene193> you can modify the issue_vacant to be something other than two , which i marked

10:45 <mardikene193> in that case you are going to get an X as feed_wfid

10:46 <mardikene193> this can be broadcast as a random value in hardware, or if it is latched you get the previous value

10:48 <mardikene193> since in that case the decoder_6_to_40.v has no default statement, the synthesis tool indeed infers a latch

10:54 <mardikene193> and now it is time boys to go fuck yourself, perhaps like MrPooper from AMD does in addition, slight fingering yourself to that complicated anal object, please fuckyourself off the planet outsiders!!

10:54 mardikene193 has quit [Quit: Leaving]

11:15 niceplace has quit [Read error: Connection reset by peer]

11:16 niceplace has joined #lima

12:00 <anarsoul> MoeIcenowy: yeah, why not

12:04 <MoeIcenowy> anarsoul: but we still have no clues for the strange misrender...

12:05 <anarsoul> MoeIcenowy: enunes and I spent some time on XDC (mostly enunes) trying different stuff. It's either something's wrong with CF or with spilling

12:05 <MoeIcenowy> anarsoul: so it's still shader issue?

12:05 <anarsoul> e.g. glmark2 -b ideas works fine if you manually unroll loop

12:06 <anarsoul> but it simplifies control flow (no loops anymore) and at the same time it doesn't spill

12:06 <MoeIcenowy> I don't think shader issue explains why unmerging task solves it

12:06 <anarsoul> and with loop it spills

12:06 <anarsoul> MoeIcenowy: maybe previous job corrupts something?

12:07 <MoeIcenowy> maybe we used sth that is not initialized

12:07 <anarsoul> s/job/draw

12:07 <MoeIcenowy> I think use-before-init looks more reasonable than corruption

12:08 <MoeIcenowy> but use-before-init should be easy to discover...

12:08 <MoeIcenowy> maybe some dependency problem we haven't discovered now

12:11 adjtm has quit [Ping timeout: 276 seconds]

12:11 <anarsoul> maybe

12:28 jrmuizel has joined #lima

12:38 jrmuizel has quit [Remote host closed the connection]

12:39 afaerber has quit [Quit: Leaving]

12:41 jrmuizel has joined #lima

12:42 jrmuizel has quit [Remote host closed the connection]

12:43 jrmuizel has joined #lima

12:44 jrmuizel has quit [Remote host closed the connection]

12:51 niceplaces has joined #lima

12:54 niceplace has quit [Ping timeout: 240 seconds]

12:55 niceplaces has quit [Ping timeout: 250 seconds]

13:15 jrmuizel has joined #lima

13:24 adjtm has joined #lima

13:27 niceplace has joined #lima

13:27 tlwoerner has quit [Ping timeout: 276 seconds]

13:35 tlwoerner has joined #lima

13:38 dddddd has joined #lima

13:50 <MoeIcenowy> anarsoul: strange... in this shader some meaningless code is generated

13:50 <MoeIcenowy> it loads $0.x with const 1, and then compare it with const 0

13:52 jrmuizel has quit [Remote host closed the connection]

13:52 <MoeIcenowy> oh the compare can be jumped to

13:52 <MoeIcenowy> so it's okay

13:55 <MoeIcenowy> anarsoul: when is sync needed?

14:04 jrmuizel has joined #lima

14:18 <MoeIcenowy> anarsoul: strange... a texture descriptor seems to start with 0x01800096 as its first word

14:19 <MoeIcenowy> this value should not exist -- it has some bits in unknown_0_1 set, but unknown_0_1 is never touched in the driver

14:24 <MoeIcenowy> oh sorry... failure in converting from hex to bin
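
A hex-to-binary slip like this is easy to rule out with a few lines of Python. The sketch below only decomposes the quoted word into bits; it makes no assumption about which bits belong to `unknown_0_1` or any other descriptor field.

```python
word = 0x01800096  # first descriptor word quoted in the log

# Zero-padded 32-bit binary, grouped in nibbles for easier reading.
bits = f"{word:032b}"
print(" ".join(bits[i:i + 4] for i in range(0, 32, 4)))

# Positions of the set bits, counting from bit 0 (least significant).
set_bits = [i for i in range(32) if (word >> i) & 1]
print(set_bits)  # [1, 2, 4, 7, 23, 24]
```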

14:31 niceplace has quit [Read error: Connection reset by peer]

14:34 niceplace has joined #lima

15:00 <MoeIcenowy> anarsoul: what's load.t and store.t ?

15:06 <anarsoul> load temporary and store temporary

15:06 <anarsoul> they're used for spilling

15:10 <MoeIcenowy> anarsoul: I think by tracing the shader's running

15:10 <MoeIcenowy> I found a load.t -1 without store.t -1

15:10 <MoeIcenowy> (there's a store.t -1 at omitted branch)

15:11 <anarsoul> that's weird

15:13 <anarsoul> can you show nir shader and disassembly?

15:13 <MoeIcenowy> okay

15:14 <MoeIcenowy> network speed is weird here

15:17 <MoeIcenowy> anarsoul: https://pastebin.aosc.io/paste/e-9oacWbLcEprLXOgR0XfA original shader log about this shader

15:17 <MoeIcenowy> https://pastebin.aosc.io/paste/9p8AHryRgkF2PNgIT7e~ow trimmed shader with all uniforms = 0

15:47 <MoeIcenowy> anarsoul: could you recheck this trim? I'm not sure whether I did it correctly

15:47 <MoeIcenowy> I make errors easily

15:48 jrmuizel has quit [Remote host closed the connection]

15:50 drod has joined #lima

15:54 jrmuizel has joined #lima

16:05 <anarsoul> MoeIcenowy: disassembly is valid, basically temporary can be updated only partially, so we have to load it first, then modify, then store
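
The load, modify, store pattern described above can be modelled in Python. This is a toy sketch, not the lima compiler or the hardware: it assumes a temporary holds four components, models temporaries as a dict, and the helper name `store_masked` and the write mask tuple are hypothetical.

```python
def store_masked(temps, index, value, mask):
    """Update only the masked components of a 4-component temporary."""
    current = list(temps[index])       # load.t: fetch the whole temporary
    for c in range(4):
        if mask[c]:
            current[c] = value[c]      # modify only the written components
    temps[index] = tuple(current)      # store.t: write the whole temporary back


# Temporary -1 (the index seen in the log) already holds four values.
temps = {-1: (1.0, 2.0, 3.0, 4.0)}

# Write only the first component; the other three must survive.
store_masked(temps, -1, (9.0, 9.0, 9.0, 9.0), (True, False, False, False))
print(temps[-1])  # (9.0, 2.0, 3.0, 4.0)
```

Without the initial load, the unwritten components would be lost on the store, which is why a `load.t` can appear before any `store.t` on a given path.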

16:18 <MoeIcenowy> anarsoul: so the code is reasonable?

16:41 jrmuizel has quit [Remote host closed the connection]

17:00 drod has quit [Ping timeout: 240 seconds]

17:02 adjtm has quit [Ping timeout: 240 seconds]

17:14 drod has joined #lima

17:21 <MoeIcenowy> anarsoul: BTW finally how does the disassembly set gl_FragColor?

18:19 afaerber has joined #lima

21:39 adjtm has joined #lima

22:04 jrmuizel has joined #lima

22:05 jrmuizel has quit [Remote host closed the connection]

22:17 jrmuizel has joined #lima