ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!
<mardikene193> maybe someone starts to criticize like me, it's a wasted talent and wasted time, different angle being as long as one lives it's ok, psychology of humans is hard, and strategy between us is sometimes cumbersome, how ones will be let through while others suffer
<mardikene193> at time being i've been injured for 20years receiving extra critics for something i did not cause or arrange myself
<mardikene193> leak is a major one though my body can compensate for the injury enough to stay alive
<mardikene193> I have treated this as a handicap show-off, similar to a freak circus, where the truth cannot easily be found since those freaks start to strategically mislead people away from what really happened
<mardikene193> giving lots of opportunities to someone who is inherently incapable of taking them seems dumb to me, while torturing someone who could
Da_Coynul has joined #lima
<mardikene193> my uncle says my plan X being injured is better than what most have, subjective still, maybe some just lack the need to achieve anything. one guy said people generate thoughts and substances to stay lazy on purpose sometimes.
<mardikene193> fears and stuff like you have being afraid of something which in my case should not be even justified, i am not the one who has violated.
Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
<mardikene193> libv: motivation you have talked about is a psychological term or symptom, i have learned psychology to some degree, for some reason you start to sometimes revaluate and evaluate more of your stuff when you get issues in some areas
<mardikene193> which you can not even resolve your own, however while you stay totally unmotivated when those issues were not yet present
Da_Coynul has joined #lima
<mardikene193> in other words, it's somewhat human nature tending to want something you can not have as opposed to appreciating the ones you have
<mardikene193> injury to me was kicking in some thoughts to appreciate what i have more
<mardikene193> and actually processors circuit seems even nicer :), you see probably cause those structures were blocked by patent instruction queues or windows indexed by program counter and branchable to
<mardikene193> then still the GPU way of doing it works too, they probably showed off that they do it conceptually somewhat differently which they also did, and that one works great too
<mardikene193> someone talked about RISC-V, this is a cross mutation between a VLIW GPU and an average processor
<mardikene193> some things work like on a GPU while others like on a CPU, like instruction windows; as the name says it is RISC instead of Elbrus CISC, but most similar to the latter
<mardikene193> yeah RISC-V seems pretty good , i would be capable of doing similar hw, but this one looks good too
<mardikene193> while on TeraScale each of the SIMDs in a CU can execute out-of-order, the Elbrus CPU solved things within a bundle in a similar way
<mardikene193> I am fairly sure that on hwacha and Xwache the recent risc-v vector one
<mardikene193> does solve the VLIW stalls in some way too
<mardikene193> as it is pretty common practice on RISC machines to have loop buffers for different iterations of loop pipelining
Da_Coynul has quit [Quit: Textual IRC Client: www.textualapp.com]
<mardikene193> hmm
<mardikene193> Although RISC-V was not designed as a base for a pure VLIW machine,
<mardikene193> VLIW encodings can be added as extensions using several alternative
<mardikene193> approaches. In all cases, the base 32-bit encoding has to be supported
<mardikene193> to allow use of any standard software tools.
embed-3d_ has quit [Remote host closed the connection]
<mardikene193> Mali GPUs should be out-of-order capable too, the core is what cwabbott calls barreled, and patents too
embed-3d has joined #lima
<mardikene193> it is a very deeply pipelined machine as far as I understand
<mardikene193> pipelines are async taking 1/128 cycles each, i.e small cycles
<mardikene193> again one of the sane ways to do it.
megi has quit [Ping timeout: 240 seconds]
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
niceplace has quit [Quit: ZNC 1.7.3 - https://znc.in]
niceplace has joined #lima
_whitelogger has joined #lima
dddddd has quit [Remote host closed the connection]
<MoeIcenowy> anarsoul: should we add a debug option to always flush after drawing?
<MoeIcenowy> I found vc4 has one
<mardikene193> http://www.cse.scu.edu/~mwang2/verification/SystemVerilog.pdf pseudo code of verilog scheduling continuous assignments are in active and reactive region
<mardikene193> so just in case, once again: https://github.com/VerticalResearchGroup/miaow/blob/master/src/verilog/rtl/wavepool/scbd_feeder.v valid_wf=0 issue_vacant=1 then feed_valid=0 since the bitwise of sb_candidates=valid_wf & hungry does not hit, next off issue_vacant=2; hungry=1 feed_valid=0, 011 & 001 in hungry & valid_wf will hit as 1
<mardikene193> so hence when you issued two first instructions or anywhere on the line two instructions , since the first never will be valid
<mardikene193> it changes to the last
<mardikene193> that is the queued computation, but on full length of the pipeline it always feeds a wfid but when the fetch queues are full, since it increments the fetch id with one when you have fetch buffer full and it momentarily matches the bitwise hence
<mardikene193> so hence you have gotta issue two instructions from the wfid row/line, but only if you issue them in program order any one of the pairs only then will the column be switched
<mardikene193> this is because whatever you do, the chip counts that on full length of the pipeline when fetch and decode stages are included
<mardikene193> the next instruction won't be there when you issue the first one
<mardikene193> it is because the fetch&decode pipe is so deep that even the longest latency operation has already finished before new instructions are fetched
<mardikene193> so to sum it up, even if you were to get incoherent replacements in the queues with long latency ops, once you call the loads clamped and divs with zero operands it has to work out, cause issue is a faster pipeline than fetch & decode
<mardikene193> so when pinning the functional unit to queues, make sure you run the variable latency ops with the smallest latency, determined by dummy operands which do not generate carries and borrows, and with memory loads clamped
<mardikene193> you can be entirely sure that new instructions are not there when on startup you allocate an untiled buffer
<mardikene193> every instruction then takes so long to fetch that all dependencies would be already issued and the column switched to, hence you'd have fantastic round robin placement to the queues
<mardikene193> uuh maybe i read something wrong, couple of hours doing the scbd_feeder.v simulations and i can not get the behavior i described.
niceplace has quit [Read error: Connection reset by peer]
jonkerj has quit [Quit: Lost terminal]
niceplace has joined #lima
jonkerj has joined #lima
niceplace has quit [Quit: ZNC 1.7.3 - https://znc.in]
niceplace has joined #lima
marvs has quit [Ping timeout: 246 seconds]
marvs has joined #lima
megi has joined #lima
niceplaces has joined #lima
niceplace has quit [Ping timeout: 276 seconds]
niceplaces has quit [Read error: Connection reset by peer]
niceplace has joined #lima
niceplace has quit [Read error: Connection reset by peer]
niceplace has joined #lima
niceplace has quit [Read error: Connection reset by peer]
niceplace has joined #lima
niceplace has quit [Read error: Connection reset by peer]
niceplace has joined #lima
<mardikene193> i got the right results finally, it turned out that on the continuous assignment you need to click on the next_hungry assignment so it would not show the reset value
<mardikene193> when you click on this at the end of the timeline it will show feed_wfid being 1 as it is supposed to be
<mardikene193> this looks about correct, based on this slight bisection from miaow you can do some other tests for insurance, I myself need to go, i managed to read it correctly from the beginning so to speak without even doing a simulation on top of this file
<mardikene193> but be aware that in simulation they have truth tables for doing things over X and Z in the logic, however in synthesized hw you do not get those values at all, only 1 and 0, this is why i had to init all the variables to zero
<mardikene193> you can modify the issue_vacant to be something other than two , which i marked
<mardikene193> in that case you are going to get an X as feed_wfid
<mardikene193> this can be broadcasted as random value in hardware or if it is latched you get the previous value
<mardikene193> since in that case the decoder_6_to_40.v has no default statement, the synthesis tool indeed infers a latch
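The latch behaviour mentioned in the last message can be modelled roughly as follows. This is only a Python sketch of the general rule that a combinational case statement without a default branch keeps its previous output for unmatched inputs. The class name, the table entries and the use of None as a stand-in for X are all assumptions for illustration and are not taken from decoder_6_to_40.v.

```python
class IncompleteDecoder:
    """Toy model of a case statement that has no default branch."""

    def __init__(self):
        # None stands in for the simulator's X: nothing has driven the output yet.
        self.out = None

    def evaluate(self, sel):
        # Hypothetical decode table; only selectors 0..2 are covered.
        table = {0: 0b0001, 1: 0b0010, 2: 0b0100}
        if sel in table:
            self.out = table[sel]
        # No else branch: for an uncovered selector the output is left alone,
        # which is the "hold the previous value" behaviour of an inferred latch.
        return self.out


decoder = IncompleteDecoder()
print(decoder.evaluate(3))  # uncovered selector before any valid one: still undefined
print(decoder.evaluate(1))  # covered selector drives the output
print(decoder.evaluate(3))  # uncovered selector now returns the held value
```

Adding a default entry that always assigns the output would remove the held state in this model, which mirrors the usual way of avoiding an inferred latch.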
<mardikene193> and now it is time boys to go fuck yourself, perhaps like MrPooper from AMD does in addition, slight fingering yourself to that complicated anal object, please fuckyourself off the planet outsiders!!
mardikene193 has quit [Quit: Leaving]
niceplace has quit [Read error: Connection reset by peer]
niceplace has joined #lima
<anarsoul> MoeIcenowy: yeah, why not
<MoeIcenowy> anarsoul: but we still have no clues for the strange misrender...
<anarsoul> MoeIcenowy: enunes and I spent some time on XDC (mostly enunes) trying different stuff. It's either something's wrong with CF or with spilling
<MoeIcenowy> anarsoul: so it's still shader issue?
<anarsoul> e.g. glmark2 -b ideas works fine if you manually unroll loop
<anarsoul> but it simplifies control flow (no loops anymore) and at the same time it doesn't spill
<MoeIcenowy> I don't think shader issue explains why unmerging task solves it
<anarsoul> and with loop it spills
<anarsoul> MoeIcenowy: maybe previous job corrupts something?
<MoeIcenowy> maybe we used sth that is not initialized
<anarsoul> s/job/draw
<MoeIcenowy> I think use-before-init looks more reasonable than corruption
<MoeIcenowy> but use-before-init should be easy to discover...
<MoeIcenowy> maybe some dependency problem we haven't discovered now
adjtm has quit [Ping timeout: 276 seconds]
<anarsoul> maybe
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
afaerber has quit [Quit: Leaving]
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
niceplaces has joined #lima
niceplace has quit [Ping timeout: 240 seconds]
niceplaces has quit [Ping timeout: 250 seconds]
jrmuizel has joined #lima
adjtm has joined #lima
niceplace has joined #lima
tlwoerner has quit [Ping timeout: 276 seconds]
tlwoerner has joined #lima
dddddd has joined #lima
<MoeIcenowy> anarsoul: strange... in this shader some meaningless code is generated
<MoeIcenowy> it loads $0.x with const 1, and then compare it with const 0
jrmuizel has quit [Remote host closed the connection]
<MoeIcenowy> oh the compare can be jumped to
<MoeIcenowy> so it's okay
<MoeIcenowy> anarsoul: when is sync needed?
jrmuizel has joined #lima
<MoeIcenowy> anarsoul: strange... a texture descriptor seems to start with 0x01800096 as its first word
<MoeIcenowy> this value should not exist -- it has some bits in unknown_0_1 set, but unknown_0_1 is never touched in the driver
<MoeIcenowy> oh sorry... failure in converting from hex to bin
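A hex-to-binary slip like the one above is easy to avoid by letting a script do the conversion. The snippet below is a minimal sketch that only lists which bits are set in the quoted word. The helper name set_bits is made up for illustration, and the snippet makes no assumption about which bits belong to unknown_0_1 or any other descriptor field.

```python
def set_bits(word, width=32):
    """Return the positions (bit 0 = least significant) of the bits set in word."""
    return [bit for bit in range(width) if (word >> bit) & 1]


word = 0x01800096
print(f"{word:032b}")  # 00000001100000000000000010010110
print(set_bits(word))  # [1, 2, 4, 7, 23, 24]
```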
niceplace has quit [Read error: Connection reset by peer]
niceplace has joined #lima
<MoeIcenowy> anarsoul: what's load.t and store.t ?
<anarsoul> load temporary and store temporary
<anarsoul> they're used for spilling
<MoeIcenowy> anarsoul: I think by tracing the shader's running
<MoeIcenowy> I found a load.t -1 without store.t -1
<MoeIcenowy> (there's a store.t -1 at omitted branch)
<anarsoul> that's weird
<anarsoul> can you show nir shader and disassembly?
<MoeIcenowy> okay
<MoeIcenowy> network speed is weird here
<MoeIcenowy> anarsoul: https://pastebin.aosc.io/paste/e-9oacWbLcEprLXOgR0XfA original shader log about this shader
<MoeIcenowy> https://pastebin.aosc.io/paste/9p8AHryRgkF2PNgIT7e~ow trimmed shader with all uniforms = 0
<MoeIcenowy> anarsoul: could you recheck this trim? I'm not sure whether I did it correctly
<MoeIcenowy> I make errors easily
jrmuizel has quit [Remote host closed the connection]
drod has joined #lima
jrmuizel has joined #lima
<anarsoul> MoeIcenowy: disassembly is valid, basically temporary can be updated only partially, so we have to load it first, then modify, then store
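The load, modify, store pattern described above can be sketched like this. It is a simplified Python model, not the lima compiler or the hardware: the function name, the dictionary of temporaries and the use of None for never-written components are assumptions made for illustration.

```python
def store_partial(temps, index, values, mask):
    """Update only the masked components of a 4-component temporary."""
    # "load.t": fetch the whole temporary first. A slot that was never stored
    # is modelled as four undefined (None) components.
    vec = list(temps.get(index, [None] * 4))
    # Modify only the components selected by the write mask.
    for i, (value, enabled) in enumerate(zip(values, mask)):
        if enabled:
            vec[i] = value
    # "store.t": write the whole temporary back.
    temps[index] = vec
    return vec


temps = {}
# The first partial write to temporary -1 loads it before anything was stored.
print(store_partial(temps, -1, [1.0, 0.0, 0.0, 0.0], [True, False, False, False]))
# A later partial write keeps the component written earlier.
print(store_partial(temps, -1, [0.0, 0.0, 0.0, 2.0], [False, False, False, True]))
```

In this model a load with no preceding store is expected whenever the first write to a temporary is partial, and it is harmless as long as the untouched components are never read.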
<MoeIcenowy> anarsoul: so the code is reasonable?
jrmuizel has quit [Remote host closed the connection]
drod has quit [Ping timeout: 240 seconds]
adjtm has quit [Ping timeout: 240 seconds]
drod has joined #lima
<MoeIcenowy> anarsoul: BTW finally how does the disassembly set gl_FragColor?
afaerber has joined #lima
adjtm has joined #lima
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima