ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
drod has quit [Remote host closed the connection]
mardestan has quit [Ping timeout: 245 seconds]
jrmuizel has quit [Remote host closed the connection]
nerdboy has quit [Ping timeout: 245 seconds]
nerdboy has joined #lima
megi has quit [Ping timeout: 276 seconds]
nerdboy has quit [Ping timeout: 268 seconds]
paulk-leonov has quit [*.net *.split]
gtucker has quit [*.net *.split]
paulk-leonov has joined #lima
dllud has quit [Ping timeout: 240 seconds]
dllud has joined #lima
gtucker has joined #lima
_whitelogger has joined #lima
dddddd has quit [Remote host closed the connection]
_whitelogger has joined #lima
nerdboy has joined #lima
nerdboy has quit [Changing host]
nerdboy has joined #lima
nerdboy has quit [Ping timeout: 258 seconds]
nerdboy has joined #lima
<MoeIcenowy> anarsoul: here when trying to use lima to render plasma-mobile (w/ desktop GL)
<MoeIcenowy> the navigation bar is compressed to around 1/3 size vertically
<MoeIcenowy> do you know what happened?
nerdboy has quit [Ping timeout: 265 seconds]
nerdboy has joined #lima
<MoeIcenowy> oh it's the issue of the KWIN using desktop GL to composite
<bshah|matrix> MoeIcenowy: if qt is built with gles, it will use gles otherwise gl
<bshah|matrix> You can try forcing gles by setting KWIN_COMPOSE=O2ES
<MoeIcenowy> bshah|matrix: yes I tried
<MoeIcenowy> and found that navigation bar is now okay
<MoeIcenowy> however it looks like lima still cannot render applications correctly
nerdboy has quit [Excess Flood]
nerdboy has joined #lima
nerdboy has quit [Changing host]
nerdboy has joined #lima
nerdboy has quit [Ping timeout: 268 seconds]
nerdboy has joined #lima
<Tofe> it seems Qt has a little bug in its atlas texture format
<Tofe> it wants to use BGRA for them, even if we are using GLES < 3
<Tofe> and on the Mesa side, it checks that: "OpenGL ES 1.x and OpenGL ES 2.0 impose additional restrictions on the internalFormat"
<Tofe> it doesn't solve any of my issues, but well, it can't hurt to fix it...
<Tofe> or is it up to the Mesa driver not to advertise GL_EXT_texture_format_BGRA8888 when it only supports GLES 2.0 ? I don't know
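A minimal sketch of the kind of restriction Tofe is quoting, under the assumption that the relevant rule is the OpenGL ES 2.0 requirement that the internalformat passed to glTexImage2D must match format. This is not Mesa's actual code: the function name and the has_bgra_ext parameter are hypothetical, and only the GL enum values are real.

```python
# Hypothetical illustration of the ES 2.0 "internalformat must match format"
# rule. This is NOT Mesa's implementation; names are made up for this sketch.

# Real GL enum values.
GL_ALPHA = 0x1906
GL_RGB = 0x1907
GL_RGBA = 0x1908
GL_LUMINANCE = 0x1909
GL_LUMINANCE_ALPHA = 0x190A
GL_BGRA_EXT = 0x80E1  # from GL_EXT_texture_format_BGRA8888

# Unsized base formats accepted by core ES 2.0 glTexImage2D.
ES2_CORE_FORMATS = {GL_ALPHA, GL_RGB, GL_RGBA, GL_LUMINANCE, GL_LUMINANCE_ALPHA}


def es2_teximage_formats_ok(internal_format, fmt, has_bgra_ext):
    """Return True if the (internal_format, fmt) pair passes this sketch's check.

    has_bgra_ext is an assumed flag saying whether the driver advertises
    GL_EXT_texture_format_BGRA8888.
    """
    allowed = set(ES2_CORE_FORMATS)
    if has_bgra_ext:
        allowed.add(GL_BGRA_EXT)
    # ES 2.0 does no format conversion on upload, so the two must be equal.
    return fmt in allowed and internal_format == fmt


if __name__ == "__main__":
    print(es2_teximage_formats_ok(GL_BGRA_EXT, GL_BGRA_EXT, True))
    print(es2_teximage_formats_ok(GL_RGBA, GL_BGRA_EXT, True))
```

Under this reading, a caller that mixes an RGBA internalformat with a BGRA format is rejected even when the extension is advertised, which would fit Tofe's description of BGRA being advertised but refused.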
drod has joined #lima
mardestan has joined #lima
<mardestan> I was somewhat hazy and sleepy or tired, so I could not formulate very well how those methods would function. However, on r300, since the destination is not redirectable, the two consecutive instructions need to have a RAW dependency; when you combine them with in- or out-of-range behavior, you can control
<mardestan> whether you compute two or one ALU, depending on whether you redirect the dependent read register upfront.
<mardestan> when you think correctly about that method, in the end what it does is partition stuff as odd and even waves on the waveline
<mardestan> then yeah, you can save all the load instructions, so several methods are possible, but load instruction usage would not matter so much anyway
<mardestan> I randomly rather than regularly look onto channel logs too, and i try to comment if something goes too wrong for developers, like if on the IRC statements seem to be spottably wrong.
<mardestan> karol and mannerov and Kayden have been doing it sometimes, others are rather punctual, mareko makes a lot of statements that have technical accuracy in them, not to mention tstellar and such
<mardestan> generally AMD open source team is composed of pretty smart guys, but they get paid too, so this also is understandable, i think they all know how the performant path would function
<mardestan> it is just that they can not bite the hand which feeds them, it is not in AMD's interest to give the
<mardestan> overly performant code out
nerdboy has quit [Ping timeout: 276 seconds]
cp- has quit [Quit: Disappeared in a puff of smoke]
mardestan has quit [Remote host closed the connection]
mardestan has joined #lima
cp has joined #lima
mardestan has quit [Client Quit]
dddddd has joined #lima
cp has quit [Ping timeout: 240 seconds]
megi has joined #lima
drod has quit [Remote host closed the connection]
jrmuizel has joined #lima
jbrown has quit [Ping timeout: 246 seconds]
cp has joined #lima
<MoeIcenowy> Tofe: where is the additional restriction?
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
drod has joined #lima
jbrown has joined #lima
<MoeIcenowy> anarsoul: BTW could you find any possible reason for lima only rendering a part of UI items?
<MoeIcenowy> When I run a simple Qt QML program which contains a label and a textfield
<MoeIcenowy> If I touch the textfield to trigger the cursor, the label starts to flash (when the cursor is shown, the label is not)
<anarsoul> Tofe: mali4x0 supports both rgba and bgra
<anarsoul> MoeIcenowy: no idea, debug it?
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
jrmuizel has quit [Read error: Connection reset by peer]
<MoeIcenowy> anarsoul: any tips on debugging?
<anarsoul> not really
<anarsoul> try recording apitrace
<anarsoul> and then dissect it in qapitrace on your laptop
<anarsoul> to see what exactly Qt does
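A rough sketch of the workflow anarsoul describes: record a trace on the device, then open it in qapitrace on another machine. It only builds the command lines and does not run them. The program path ./qmlapp and the trace file name are hypothetical, and using the EGL API is an assumption that fits a GLES setup.

```python
import shlex


def apitrace_record_cmd(program, args=(), api="egl", output="app.trace"):
    """Build an `apitrace trace` command line for recording `program`.

    `program` and `output` are placeholders; api="egl" is an assumption
    for a GLES application, use "gl" for desktop GL via GLX.
    """
    return ["apitrace", "trace", f"--api={api}", f"--output={output}",
            program, *args]


def qapitrace_cmd(trace_file):
    """Build the command that opens a recorded trace in the qapitrace GUI."""
    return ["qapitrace", trace_file]


if __name__ == "__main__":
    # Print shell-ready command lines instead of executing them.
    print(shlex.join(apitrace_record_cmd("./qmlapp")))
    print(shlex.join(qapitrace_cmd("app.trace")))
```

The trace file has to be copied from the board to the laptop between the two steps.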
<anarsoul> MoeIcenowy: btw I just checked and looks like lima doesn't have stencil implemented yet. So check if Qt QML uses stencil - if it does rendering will be incorrect
<Tofe> MoeIcenowy: I also have some weird behavior when the cursor is involved
<Tofe> (on my QML compositor)
<Tofe> anarsoul: still, we end up with Mesa advertising BGRA support and forbidding its use, it's quite conflicting
<Tofe> but I don't care that much right now, it doesn't seem to have any impact on my rendering
<anarsoul> Tofe: well, lima supports either for rendering and sampling, I'm not sure why you can't use it
<Tofe> anarsoul: it's just because of Mesa having introduced some internal checks w.r.t. GLES version
jrmuizel has joined #lima
<MoeIcenowy> anarsoul: oops, no stencil?
<anarsoul> no, but now you have a nice task for lima! :)
<anarsoul> it needs to be REd since it's not in limare. Qiang started it but never completed
jrmuizel has quit [Remote host closed the connection]
<MoeIcenowy> anarsoul: seems not related to stencil
<MoeIcenowy> buggy frames have only glDisable(GL_STENCIL_TEST)
<MoeIcenowy> GL_BLEND and GL_SCISSOR_TEST are used
<anarsoul> scissors should work
<anarsoul> so should blend
<MoeIcenowy> BTW, seems that the disappeared item is rendered early
<MoeIcenowy> is it possible that a later glDrawElements erased it?
<MoeIcenowy> (I'm looking at GL_BACK
<MoeIcenowy> (newbie to qapitrace
jbrown has quit [Ping timeout: 268 seconds]
jbrown has joined #lima
jbrown has quit [Ping timeout: 240 seconds]
_whitelogger has joined #lima
mardestan has joined #lima
<mardestan> It's like I have capabilities to whitewash many of the world civilians, meaning I can overplay them, it's been some long time where i just do not want to do that regularly in a degree as in the past. Also maybe today i even can't, hardly matters.
<mardestan> Because it produced biggest problems in my life overall.
<mardestan> I read about elbrus CPU which is little bit similar to Itanium albeit still different, and less similar to VLIW based gpus, i do understand very well that it was the right arch to go on with, especially sure when all the patents read by me, it is just the technological competition where someone gets somewhat trashed and dirtied cause one does things correctly
<mardestan> So when all the world generates binaries after some decisions to AMD64 then elbrus inventors with less of a resources get somewhat left behind, even though their arch was largely more correct hw wise.
<mardestan> And how can a babies worth of a brain state that VLIW was a mistake, this looks like a retarded statement to me, cause i have functional brain.
<mardestan> You gave the hw authors at HP and intel to work out hardware in some way, to reach to get employed to develop hardware code is a long process of self-education
<mardestan> so it is somewhat careless and retarded to tell, that those did not know what they were doing at all.
<mardestan> So we have a german guy who tells kepler is a mistake and a reverse engineering austrian guy who made most the code for the shaders in nouveau codebase
<bshah|matrix> MoeIcenowy: about disappearing cursor. Do you know if it's hitting gpir cf error?
<mardestan> even this guy told, that kepler is the best arch he has done work on
<mardestan> and nvidia kepler series is a VLIW GPU
<mardestan> I'll make you an analogy: when you are ranked 60 in tonga, then you can't state that mark selby, ronnie o'sullivan do not know what they are doing on the baize, right?
jbrown has joined #lima
<mardestan> bshah|matrix: do you have mali hardware available arm gpu that is phalanx in the past it was?
<mardestan> instead of talking about such a bug , why not throwing some lines of code to the driver, do you know what control flow does on gpu?
<mardestan> do not answer me that it jumps to certain location or does true branching, i ask what this means in hw technically?
<mardestan> branches are either divergent or non-divergent on gpu hw, what it does is: some hw has fetch counter aka program counter in hw which is hidden wires in miaow
<mardestan> when decoder decodes a branch instruction, this is delegated to fetch module
<mardestan> and the fetch counter is changed to this PC or location in virtual address space, then this instruction is decoded
<mardestan> it is actually a worthless thing in my code, but it is just a bunch of opcodes that need to be reverse engineered with a decompiler
<mardestan> now since mali has offline assembler
<mardestan> this is rather easy task, but i won't do that, cause i do not ever need branches.
<mardestan> but branches work entirely differently on CPU hardware
nerdboy has joined #lima
<mardestan> bshah|matrix: it is because on CPU there is some branch prediction hw, plus a different kind of instruction queue structures
<mardestan> meaning that on CPU they are of use inside the hardware queue branching too
nerdboy has quit [Changing host]
nerdboy has joined #lima
<mardestan> when on CPU you branch in one of the in-order pipeline way to some instruction within a queue, cause instruction window on CPU is indexed after PC i.e program counter value which is no longer a fetch counter in this case
<mardestan> then yeah it branches to the entry without redecoding the instruction
<mardestan> this can not be done at all on a GPU that way
<mardestan> cause the structure of instr_tables is different on GPUs
<mardestan> i would have written the code for you if branching gave any value to the real code
<mardestan> or would give so to speak
<mardestan> technically it means branching is somewhat mistake even to call that way on gpu, and this is something that a sane man would tell you too
<mardestan> it can only screw up the in queue computation , without any benefits...things would be somewhat different if you expose jtag on GPUs
<mardestan> but currently at least in miaow branching would not function without changes done , yeah branching is in any case absolute crap on gpus
<mardestan> so once again, you implement branching in different ways, you load the opcode either from data cache or whatever memory
<mardestan> you seek into a queue entry, like i was saying 16x16 arbiter has 256 queue entries for instance
<mardestan> so you have 16 rows and 16 columns in the queue
<mardestan> queue nr1.
<mardestan> or line nr1: wfid1 wfid2 wfid3 ...... wfid16
<mardestan> those identifiers have corresponding instructions in their slots
<mardestan> simd arbiter arbitrates among them in a way
<mardestan> is the 1 instruction executable , yeah it always is
<mardestan> is the 2 executable etc. if 3 is not, on SIMD this will skip
<mardestan> until the scoreboard gives green light to it
<mardestan> now in queues if we executed 1 and 2
<mardestan> in a row
<mardestan> or in sequence
<mardestan> it jumps to line1/row1 and fetches the 3 from there instead
<anarsoul> MoeIcenowy: you can trim the trace
<anarsoul> i.e. you can delete later glDrawElements calls, replay trace on lima and see if it fixes the issue
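A small sketch of the trimming step anarsoul suggests, again only building command lines. It assumes the later glDrawElements calls are dropped by keeping a leading call range; the file names and the call number 1200 are hypothetical and would come from inspecting the trace in qapitrace.

```python
import shlex


def apitrace_trim_cmd(src, dst, last_call):
    """Build an `apitrace trim` command keeping calls 0..last_call of `src`.

    `last_call` is a placeholder: pick the number of the last call that
    should survive, e.g. the draw of the item that goes missing.
    """
    return ["apitrace", "trim", f"--calls=0-{last_call}",
            f"--output={dst}", src]


def apitrace_replay_cmd(trace_file):
    """Build the command that replays a trace on the current GL driver."""
    return ["apitrace", "replay", trace_file]


if __name__ == "__main__":
    print(shlex.join(apitrace_trim_cmd("app.trace", "app.trim.trace", 1200)))
    print(shlex.join(apitrace_replay_cmd("app.trim.trace")))
```

Replaying the trimmed trace on lima and comparing it with a replay on another driver shows whether the later draws are what erase the item.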
<mardestan> it is just that this kind of method can be expressed with matrices, but it is too complex to explain it that way
<mardestan> words or pseudo code are a lot easier
<mardestan> now, what i propose is not program order queues, since a shader can have more instructions than fit into the queues, right, and as they can not be replaced
<mardestan> then we need to every time seek to the FU of a certain ALU in queues following such base rules
<mardestan> since the instr_info_table is 16x16 queues 256 entries on some reference gpu
<mardestan> it has dual table one general multiplexer and one decoded in order stuff...
<mardestan> if you compute in program order in the queues, then it does not matter if the scheduler is greedy-then-oldest or round-robin
<mardestan> since the tables have their hidden verilog instances, and everything works transparently
<mardestan> but what i do is a bit more complex version, i pin into the queues 4movs 4muls 4adds etc.
<mardestan> and will seek to there following such base rules, that every line wraps around
<mardestan> and two instructions in sequence will change the column to the last simd arbiters wfid
<MoeIcenowy> bshah|matrix: I have some patches to work around the GPIR CF problem.
<MoeIcenowy> anarsoul: just trimming glDrawElements will work?
<anarsoul> try it?
<mardestan> and how do you pin FUs into the queues, this god damn it can not be patented, since there are gazillion of ways
<MoeIcenowy> anarsoul: looks like it doesn't work
<mardestan> this is where i agree with mannerov , filing a patent is waste of money there and playing a trollish clown ontop
<mardestan> provided that i have lost 5 court trials out of 6, just to mention
<mardestan> they invite some absolute monkey to talk absolute nonsense about my rights and are destined to violate them, being themselves around million years behind in evolution
<mardestan> losing probably in anywhere without talking absolute lies and crap
<mardestan> for a developed person or advanced mind, court of law is nothing useful to be dealing with
<mardestan> you see a handycap with a twisted mind 1000times wrong and never really have even come close to make sense, and you see yourself violated straight for hours and get your sentence because one is handycap born
<mardestan> just a john nash theory, talking to idiots is not useful
<MoeIcenowy> anarsoul: oh interesting... I can just analyze the apitrace with lima by using remote function of qapitrace
<mardestan> i know prefront or upfront, that entering into any type of relations with such crew i can only lose bad
<MoeIcenowy> The draw of the missing text label just failed on lima
<MoeIcenowy> oops, for debugging looks like !1415 is needed
<mardestan> well lets get back to queues
<mardestan> one has to be consistent on the two tables, one is the in-order table which comes from where the instruction was fetched
<mardestan> the other one is out-of-order table which is read only when no two instructions are issued in sequence
<mardestan> in other words, one uses verilog instances as index
<mardestan> the other one uses simd arbiters wfid number as multiplexed instances index
<MoeIcenowy> anarsoul: looks like BLEND issue
<MoeIcenowy> when rendering the missing text, GL_BLEND is glEnable'd
<mardestan> so hence: i think to put it another way
<mardestan> the fetch order queue, is in column order
<mardestan> and the simd queue is in row order
<mardestan> in instr_info_table.v
<mardestan> column order table controller is used like told any of the two instructions in line ordered queue in sequence
drod has quit [Remote host closed the connection]