ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at and - Contact ARM for binary driver support!
_whitelogger has joined #lima
Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
Da_Coynul has joined #lima
yuq825 has joined #lima
embed-3d has quit [Ping timeout: 258 seconds]
Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
jrmuizel has quit [Remote host closed the connection]
<wens> I wonder how different utgard is from midgard in terms of control flow stuff
<anarsoul> similar, but compilers are totally different :)
<anarsoul> perf doesn't work for me on pine64 :\
<anarsoul> it reports no events
<anarsoul> enunes: looks like pmu is broken in A64, so I have no profiler :(
dddddd has quit [Remote host closed the connection]
<bshah|matrix> anarsoul: got a chance to look at apitrace I had linked?
<anarsoul> bshah|matrix: I'm not sure where you linked it
<anarsoul> :)
<anarsoul> definitely not here
<bshah|matrix> Here? :P
<anarsoul> does replay work for you?
<bshah|matrix> Yes it does.. should have roughly 7 frames
<bshah|matrix> (it doesn't replay on Ubuntu machine for me due to apitrace version there being old)
<bshah|matrix> But works on arch
<anarsoul> 4 @0 eglGetPlatformDisplayEXT(platform = EGL_PLATFORM_WAYLAND_KHR, native_display = 0xffff8a3eaaa0, attrib_list = {}) = 0xaaaad28ba280
<anarsoul> 4: warning: unsupported eglGetPlatformDisplayEXT call
<bshah|matrix> Um weirdness.
<anarsoul> doesn't work on lima either
<anarsoul> error: unable to open display
<bshah|matrix> Is this Wayland session?
<bshah|matrix> (because I have used it on Wayland)
<anarsoul> it's not
<anarsoul> well, it doesn't work if I start it from sway either
<bshah|matrix> Hm quite weird
<bshah|matrix> I'll try capturing another one later today I guess.
Barada has joined #lima
<MoeIcenowy> enunes: got BootAnimation running on Lima after hacking the buffer allocation problem
<bshah> anarsoul: so I just tried on my end, again, and it seems I also get the unsupported call bit, but in the end it replays fine..?
<bshah> "Rendered 24 frames in 0.817661 secs, average of 29.352 fps"
<bshah> in either case I uploaded fresh trace at :
<bshah> (same URL)
smaeul has quit [Ping timeout: 276 seconds]
megi has quit [Ping timeout: 246 seconds]
<MoeIcenowy> oh strange
<MoeIcenowy> the gbm buffer allocated w/ Lima cannot get correct modifier
<MoeIcenowy> thus it fails to re-import
megi has joined #lima
adjtm has quit [Ping timeout: 272 seconds]
adjtm has joined #lima
yuq825 has quit [Remote host closed the connection]
Da_Coynul has joined #lima
Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
<MoeIcenowy> oh strange... the Gallium screen linked to the GBM instance seems to be softpipe
<MoeIcenowy> not lima
Da_Coynul has joined #lima
_whitelogger has joined #lima
Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
monstr has joined #lima
dddddd has joined #lima
chewitt has quit [Quit: Adios!]
<rellla> and select lowering seems to be broken btw ...
<rellla> mov.s0 $0.x, sel.v1 $0 ^const0.xyxy ^const0.yxxy, const0 0.000000 1.000000 0.000000 0.000000
<mmind00> rellla: I'd guess sharing in , similar to that tiling code that is already there
<rellla> mmind00: maybe we should move it to compiler/nir instead ...
afaerber has quit [Quit: Leaving]
monstr has quit [Quit: Leaving]
monstr has joined #lima
<enunes> rellla: yeah it's so generic that I would just propose it to be moved to compiler/nir instead
<rellla> enunes: ok, will prepare a patch. it solves the missing undef handling and fixes some more piglit tests
<rellla> ... the glsl-array-bounds-* ones..
<enunes> rellla: I've seen those, my vectorize patchset also indirectly fixed at least one of them
<enunes> but that pass seems like a good thing to have
<rellla> enunes: i'm not sure if it's entirely right to make a zero const out of the undef ssa, but all other drivers seem to do the same...
<enunes> yeah it would be nice to just not do anything rather than probably creating a mov to a field that is probably useless
<enunes> but it's not even that common and not sure how to solve it otherwise, so I am ok with assigning zero
<rellla> ok.
jrmuizel has joined #lima
<rellla> enunes: should i look into the write_mask/swizzle issue in lower_select (and probably others) or is this expected to be moved out of ppir lowering?
<enunes> rellla: one thing I'm about to do in the same vectorize patchset is turn selects into scalars because apparently lima can't support the vec4 selects the way nir intends
<enunes> are there write_mask/swizzle issues in other op lowerings?
<enunes> or... potential issues
<rellla> sin/cos is eliminated with the nir MR
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
<rellla> and for the other potential candidates ... i haven't had a look :)
<rellla> "alu->dest.write_mask = u_bit_consecutive(0, num_components);" or "alu->dest.write_mask = 1;" always is suspicious :p
<enunes> always confuses me as well
<rellla> i just stumbled across it in lower_sin and lower_select. the write_mask was effectively hardcoded to 0001, even if it should write to .y
jrmuizel has quit [Remote host closed the connection]
<rellla> this is a problem in piglit, where one of two consts (1.0, 0.0, 0.0, 1.0) and (0.0, 1.0, 0.0, 1.0) is probed.
yuq825 has joined #lima
<rellla> when sth is wacky with .xyzw we get wrong results. especially when we deal with a select...
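The write_mask point above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `bit_consecutive` is a simplified stand-in for Mesa's `u_bit_consecutive()` (it assumes `count < 32`), and `mask_for_component` is a hypothetical helper, not a function from ppir. It shows why a mask hardcoded to 1 always targets .x, whereas a scalar result meant for .y needs bit 1 set.

```c
#include <stdint.h>

/* Simplified stand-in for Mesa's u_bit_consecutive(): `count` set bits
 * starting at bit `start`. Assumes count < 32 (enough for vec4 masks). */
static inline uint32_t bit_consecutive(unsigned start, unsigned count)
{
    return ((1u << count) - 1u) << start;
}

/* Hypothetical helper: write mask for a scalar result that must land in a
 * single destination component (0 = x, 1 = y, 2 = z, 3 = w). */
static inline uint32_t mask_for_component(unsigned comp)
{
    return 1u << comp;
}
```

With these definitions, `bit_consecutive(0, 1)` and a literal `1` both select only .x, while `mask_for_component(1)` selects only .y.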
Unit193 has quit [Read error: Connection reset by peer]
Unit193 has joined #lima
<enunes> rellla: this is what I'm currently testing before sending the MR for, it creates many more assignments to .y and .z but those still seem to work
buzzmarshall has joined #lima
<rellla> enunes: for the lazy rellla... what does BITSET_SET macro do?
<enunes> we have a bitset called alu_lower which lists the ops that nir_lower_alu_to_scalar will convert to scalar, I'm telling it to also do that for select ops
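As a rough illustration of the bitset idea described above: one bit per opcode, and setting a bit marks that op for scalarization. The sketch below does not use Mesa's real `BITSET_SET` macro or NIR opcode enum; `demo_op`, `demo_opset`, `demo_opset_set` and `demo_opset_test` are hypothetical names invented for the example.

```c
#include <stdint.h>

#define DEMO_WORD_BITS 32u
#define DEMO_WORDS_FOR(n) (((n) + DEMO_WORD_BITS - 1u) / DEMO_WORD_BITS)

/* Hypothetical opcode list; the real driver uses NIR's opcode enum. */
enum demo_op { DEMO_OP_FSIN, DEMO_OP_FCOS, DEMO_OP_SELECT, DEMO_OP_FADD, DEMO_NUM_OPS };

/* One bit per opcode, packed into 32-bit words. */
typedef struct {
    uint32_t words[DEMO_WORDS_FOR(DEMO_NUM_OPS)];
} demo_opset;

/* Mark opcode `bit` as a member of the set. */
static inline void demo_opset_set(demo_opset *s, unsigned bit)
{
    s->words[bit / DEMO_WORD_BITS] |= 1u << (bit % DEMO_WORD_BITS);
}

/* Return 1 if opcode `bit` is in the set, 0 otherwise. */
static inline int demo_opset_test(const demo_opset *s, unsigned bit)
{
    return (int)((s->words[bit / DEMO_WORD_BITS] >> (bit % DEMO_WORD_BITS)) & 1u);
}
```

A pass that scalarizes ops would then consult `demo_opset_test()` for each instruction's opcode and only split those that are marked.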
<rellla> ah ok
<rellla> as it probably fixes some dmesg errors according to your comment, i should give it a quick piglit run :)
<enunes> vectorize affects many tests and optimizes a lot of them, this is the shaderdb report
<enunes> couldn't finish testing yesterday, hopefully today
<rellla> is there anywhere some doc how to create this shaderdb reports?
<enunes> this one with the test names is a bit of a hack, but if you run piglit one time with MESA_SHADER_CAPTURE_PATH to collect all shaders, you can just use them later with shaderdb itself
<enunes> after my MR on it gets merged
<enunes> just the shader names will be like 3-1212.shader_test 3-2203.shader_test etc
jrmuizel has joined #lima
Barada has quit [Quit: Barada]
jonkerj has quit [Quit: brb]
jonkerj has joined #lima
monstr has quit [Remote host closed the connection]
yuq825 has quit [Remote host closed the connection]
<anarsoul> enunes: I profiled mpv and looks like it spends 10% of time in _mesa_format_convert
<anarsoul> I think we need to add support for single-component textures
<enunes> anarsoul: how much work is that?
<enunes> anarsoul: I tried perf top on my pinebook with Fedora as well and I got nothing, but my perf version doesn't match my kernel so I wonder if that's a problem
<anarsoul> enunes: not a lot, I believe it's just a matter of adding corresponding enums
<rellla> fixes/regressions do still look weird within the last ppir instructions ...
<rellla> lower_select still is untouched
<enunes> rellla: I figured that when tests result in 0, 0, 0, 0 and that's an impossible result, it's just some random instability that happens for some reason
<enunes> manually running the test again passes
<anarsoul> rellla: you probably have to get used to select weirdness in assembler :)
<rellla> enunes: manual run passes ...
<rellla> anarsoul: yeah, probably :)
<anarsoul> e.g. mov.s0 $1.x, sel.s1 $0.w ^const0.y ^const0.y, const0 0.000000 1.000000 0.000000 0.000000
<anarsoul> two "const.y" look fishy
<enunes> I waste so much time with those random failures, and even running the same task in a loop never reproduces it when I want it to
<anarsoul> enunes: running the same task in the loop won't help if you have a read from uninitialized register somewhere
<anarsoul> looks like Mali4x0 preserves reg values
<anarsoul> at least PP
<anarsoul> so if you read from $0 but never write into it you'll be reading a value that's left over of some old shader
<anarsoul> so if you want to reproduce failure run previous shader and shader that failed
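A read from a register that the shader never wrote can in principle be spotted statically. The following is a minimal sketch only, assuming a straight-line program, whole-register tracking (no per-component masks, no control flow) and fewer than 32 registers; `struct demo_instr` and `find_uninitialized_read` are hypothetical and are not part of ppir or gpir.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified instruction: one destination register and
 * up to two source registers. -1 means "unused". Register indices < 32. */
struct demo_instr {
    int dest;
    int src[2];
};

/* Walk the program in order and return the index of the first instruction
 * that reads a register no earlier instruction has written, or -1 if every
 * read is preceded by a write. */
static int find_uninitialized_read(const struct demo_instr *prog, size_t n)
{
    uint32_t written = 0; /* bit r set once register r has been written */

    for (size_t i = 0; i < n; i++) {
        for (int s = 0; s < 2; s++) {
            int r = prog[i].src[s];
            if (r >= 0 && !(written & (1u << r)))
                return (int)i;
        }
        if (prog[i].dest >= 0)
            written |= 1u << prog[i].dest;
    }
    return -1;
}
```

A real check would also have to handle components, branches and inputs that are legitimately preloaded, which is part of why this is harder in the actual compiler.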
<enunes> I tried that too, running full piglit many times, different tests failed at random runs
<anarsoul> btw, I found that helps a lot to catch issues like uninitialized reg
<rellla> however, have to go now. this branch corresponds to the last piglit results. it contains the sin/cos nir commit, my fddxy commit, handling of undef ssa and the vectorize pass - based on bc61253
<enunes> anarsoul: can we use that and turn it into an assert or something so we can fix the uninitialized reg to see if that resolves the random failures?
<enunes> or just print some greppable debug
<anarsoul> enunes: it's hard
<anarsoul> and I'm not sure if it's worth the effort
<MoeIcenowy> strange...
<anarsoul> basically it happens when dest doesn't match src
<anarsoul> in ppir
<MoeIcenowy> EGL_ANDROID_native_fence_sync doesn't work
<MoeIcenowy> even EGL_KHR_fence_sync doesn't work
<enunes> anarsoul: yeah and then, it might also be something in gpir... gpir tests fail at random as well with trivial fragment shader
<anarsoul> :(
<MoeIcenowy> does anyone know how the out_sync of submit works?
<anarsoul> yes, it uses fences
<anarsoul> enunes: you should probably ask Connor to look into it
<anarsoul> I can read GP disassembly, but I barely understand the compiler
<enunes> anarsoul: well since examples with trivial vertex programs also fail, I fear it might be something different like kernel or command stream related
<enunes> I tried valgrind some time ago, enabled it for all shader_runner examples and ran it many times, some tests still randomly failed, but no valgrind diffs between the randomly failed and passed runs
<MoeIcenowy> anarsoul: how do fences work on Lima?
<MoeIcenowy> I think I got a total mess on fence on Lima when trying on Android
<anarsoul> MoeIcenowy: see kernel driver, it's not really lima-specific
<anarsoul> enunes: I'd check shader first though
<anarsoul> MoeIcenowy: btw, looks like I got rid of RCU stalls
<enunes> maybe disabling no-concurrency and running tests in parallel can reproduce it more easily
<enunes> I should try that
<enunes> with a limited set, and then hopefully be able to bisect it
<anarsoul> enunes: so maybe uninitialized reg read in GP?
<enunes> anarsoul: maybe, but I think it will also happen with the trivial passthrough vertex shader
jrmuizel has quit [Remote host closed the connection]
piggz_ has joined #lima
jrmuizel has joined #lima
Elpaulo has quit [Quit: Elpaulo]
<anarsoul> enunes: then we need a reproducer
piggz_ has quit [Quit: Konversation terminated!]
piggz_ has joined #lima
piggz_ has quit [Ping timeout: 258 seconds]
piggz_ has joined #lima
<anarsoul> enunes: you may want to fix this warning:
<enunes> anarsoul: I wonder why I didn't notice that, will do
Elpaulo has joined #lima
buzzmarshall has quit [Remote host closed the connection]
adjtm has quit [Ping timeout: 268 seconds]
<anarsoul> ouch, mipmapping code is definitely broken
<anarsoul> LIMA_MAX_MIP_LEVELS is 13 and it'll indeed try to attach all the levels in lima_texture_desc_set_res() if they're present
<anarsoul> the problem is that texture descriptor is 64 bytes and last 2 levels won't fit
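The arithmetic behind that observation can be sketched as a simple clamp. This is not the driver's actual fix: `demo_levels_to_attach` is a hypothetical helper, and the slot count of 11 is derived purely from the statement above that 13 levels are allowed but the last 2 do not fit in the 64-byte descriptor.

```c
/* Value of LIMA_MAX_MIP_LEVELS as quoted in the log. */
#define DEMO_MAX_MIP_LEVELS 13
/* Assumption taken from the log: the last 2 levels do not fit in the
 * 64-byte texture descriptor, leaving room for 11 level addresses. */
#define DEMO_DESC_MIP_SLOTS (DEMO_MAX_MIP_LEVELS - 2)

/* Hypothetical helper: how many mip levels may be written into the
 * descriptor for the inclusive range [first_level, last_level]. */
static inline unsigned demo_levels_to_attach(unsigned first_level, unsigned last_level)
{
    unsigned count = last_level - first_level + 1;
    return count > DEMO_DESC_MIP_SLOTS ? DEMO_DESC_MIP_SLOTS : count;
}
```

For a full chain of levels 0 through 12 the helper returns 11, so the two smallest levels would be left out rather than written past the end of the descriptor.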
jrmuizel has quit [Remote host closed the connection]
adjtm has joined #lima
Da_Coynul has joined #lima
Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]
jrmuizel has joined #lima
Da_Coynul has joined #lima
Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]