ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
warpme__ has quit [Quit: Connection closed for inactivity]
jrmuizel has quit [Remote host closed the connection]
drod has quit [Remote host closed the connection]
megi has quit [Ping timeout: 265 seconds]
dddddd has quit [Remote host closed the connection]
_whitelogger has joined #lima
piggz has joined #lima
Elpaulo1 has joined #lima
Elpaulo has quit [Ping timeout: 240 seconds]
Elpaulo1 is now known as Elpaulo
yuq825 has joined #lima
Elpaulo has quit [Ping timeout: 246 seconds]
piggz has quit [Ping timeout: 240 seconds]
<rellla> anarsoul: oh yes. i missed the gitlab issue ...
piggz has joined #lima
piggz has quit [Ping timeout: 240 seconds]
piggz has joined #lima
mardikene193 has joined #lima
<mardikene193> now, dependency resolution is done manually by the compiler on utgard and midgard, and also on the later bifrost GPUs; on AMD, dependencies for memory operations are managed with s_waitcnt and the lgkm counters, which act as a kind of barrier/fence for memory in the shader.
<mardikene193> it appears the scoreboard blocks the source operands and destinations only when the write is performed, so the blocking is delayed unless fences like those are used on AMD
<mardikene193> however, if the scoreboard is fed manually, as in the MALI case, you can do whatever you want, blocking any of the operands as needed or not
<mardikene193> MALI gpus are actually all quite well designed, I am reasonably surprised in a good way.
<mardikene193> my friend has a brand new flagship smartphone which has some g7x gpu
<mardikene193> and he showed me the raw performance without many of the optimizations i have suggested for it -- but the binary driver was in use
<mardikene193> it seemed to work very well somehow
piggz has quit [Ping timeout: 268 seconds]
<mardikene193> So yeah AMD has the worst design
<mardikene193> on GCN, vliws are fine
<mardikene193> it is because the barriers do not have or consume issue queue entries; they need to be decoded, i.e. branched to
<mardikene193> however they also offer setgreg, which bypasses the scoreboard altogether
piggz has joined #lima
<mardikene193> but as GCN is an sm5.0 GPU they would use different code to do indirections
<mardikene193> instead of what i talked about for vliws anyway, so...
<mardikene193> so in my opinion SIMD is just an inherently bad arch.
<mardikene193> there is only a way to index the destination register out-of-bounds, so the sm2.0-sm3.0 true SIMD gpus are the ones that cannot do what is needed in any way.
<mardikene193> even though SIMD gpus are all unified architectures, constant indexing from out-of-bounds returns zeros
<mardikene193> now that i think about it, there probably is a way, but it's so perverse: you'd have to consume at least 2.5 warp lines to do LSU ops and redirect to lower rows/lines
<mardikene193> it is because texture operations on a second line can't run unless they are ready, and they will be ready only if the offset is adjusted again
Danct12_ has joined #lima
romainmahoux[m] has quit [Remote host closed the connection]
z3ntu has quit [Write error: Connection reset by peer]
Danct12 has quit [Read error: Connection reset by peer]
bshah|matrix has quit [Remote host closed the connection]
danqo has quit [Read error: Connection reset by peer]
dllud has quit [Ping timeout: 240 seconds]
dllud has joined #lima
bshah|matrix has joined #lima
<mardikene193> So it'd be more sane to make changes to the driver to allow more ALU instructions if fewer LSU/memory operations are in use, for instance 64 vs 32 respectively on sm2.0 -- if i were to use only 16 or 17 memory loads then 80 or 79 ALUs could be used! Hw should allow this.
<mardikene193> the limitations probably exist because cache sizes were not that big at the time, and falling back to memory, with the pretty big transistors and hence bigger FO4 parameters or and gate delays, would have made the pipeline far too slow
<mardikene193> however they should be fine trading LSU/mem loads in favor of more ALU ops.
Danct12_ has quit [Remote host closed the connection]
jrmuizel has joined #lima
<mardikene193> however yeah well, the mali 400mp one core version should come with 256 (which is out of any spec), and the two core version already 512 for shared mem/LSU and alu usage.
mardikene193 has quit [Quit: Leaving]
jrmuizel_ has joined #lima
jrmuizel has quit [Ping timeout: 264 seconds]
romainmahoux[m] has joined #lima
Danct12 has joined #lima
z3ntu has joined #lima
danqo has joined #lima
jrmuizel_ has quit [Remote host closed the connection]
megi has joined #lima
jrmuizel has joined #lima
dllud has quit [Ping timeout: 250 seconds]
dllud has joined #lima
dddddd has joined #lima
jrmuizel has quit [Remote host closed the connection]
drod has joined #lima
yuq825 has quit [Quit: Leaving.]
MartijnBraam has joined #lima
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
warpme__ has joined #lima
piggz has quit [Quit: Konversation terminated!]
piggz has joined #lima
piggz has quit [Ping timeout: 276 seconds]
jbrown has quit [Ping timeout: 252 seconds]
piggz has joined #lima
jbrown has joined #lima
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
piggz has quit [Quit: Konversation terminated!]
jbrown has quit [Ping timeout: 276 seconds]
jbrown has joined #lima
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
Wizzup has quit [Ping timeout: 240 seconds]
Wizzup has joined #lima
jrmuizel has joined #lima
drod has quit [Ping timeout: 276 seconds]
drod has joined #lima
jrmuizel has quit [Remote host closed the connection]
drod has quit [Remote host closed the connection]