#lima on 2019-09-26 — irc logs at freenode.irclog.whitequark.org

2019-07-03 10:24 ChanServ changed the topic of #lima to: Development channel for open source lima driver for ARM Mali4** GPUs - Kernel has landed in mainline, userspace driver is part of mesa - Logs at https://people.freedesktop.org/~cbrill/dri-log/index.php?channel=lima and https://freenode.irclog.whitequark.org/lima - Contact ARM for binary driver support!

00:15 <anarsoul> MoeIcenowy: I may have found how to support rect textures

00:15 <anarsoul> see lima_pack_reload_plbu_cmd()

00:16 <anarsoul> td->unknown_1_1 = 0x80;

00:17 <anarsoul> and then see reload_varying() - this is actual FB size, not [0-1]

00:17 <anarsoul> so my guess is that this bit enables rect textures

00:19 <anarsoul> or it's unknown_2_2 which is set only here

01:03 <MoeIcenowy> anarsoul: 1_1 is the lowest bits of td, right?

01:03 <MoeIcenowy> of the 1st word of td

01:14 yuq825 has joined #lima

01:17 <MoeIcenowy> anarsoul: add a hack for GNOME to allow 1x1 RECT texture first?

01:39 kaspter has quit [Ping timeout: 265 seconds]

01:39 camus has joined #lima

01:41 camus is now known as kaspter

02:13 <MoeIcenowy> anarsoul: did a simple test -- commenting out unknown_2_2 does not hurt, but commenting out unknown_1_1 breaks reloading (tested with weston without partial damage)

02:13 <MoeIcenowy> the whole background becomes the same color when unknown_1_1 is commented out...

02:13 <MoeIcenowy> looks right?

02:15 <MoeIcenowy> anarsoul: is there any test program for proper rect textures?

02:21 <MoeIcenowy> anarsoul: BTW interestingly Allwinner seems to forget to strip symbols from libMali

02:36 jrmuizel has quit [Remote host closed the connection]

03:06 <MoeIcenowy> anarsoul: IT WORKS!!!!!!!!!!!!!!!!!!!!!!!

03:06 <MoeIcenowy> the 1_1 0x80 bit is really for rect texture

03:17 <yuq825> \o/

03:17 <anarsoul|c> Great

03:17 <MoeIcenowy> yuq825: does your gfx repo accept PR?

03:17 <yuq825> fine

03:17 <MoeIcenowy> I'm going to PR the rect texture test program

03:17 <yuq825> ok

03:18 <anarsoul|c> Are you going to make MR for mesa as well?

03:18 <MoeIcenowy> anarsoul: already sent

03:18 <MoeIcenowy> !2131

03:18 <anarsoul|c> I'll check it later tonight

03:22 <MoeIcenowy> anarsoul: btw what's the S and T in texture?

03:23 <MoeIcenowy> are they axis inside the texture?

03:23 <anarsoul> MoeIcenowy: I think you should check "sampler->base.normalized_coords" instead

03:23 <anarsoul> and set rect bit if it's false

03:23 <anarsoul> that's what other drivers do

03:23 <MoeIcenowy> oh thanks for this tip

03:29 <MoeIcenowy> oops it seems to be not working?!

03:29 <MoeIcenowy> oops it should be false when rect

03:29 <anarsoul|c> normalized coords is true for regular textures

03:30 <anarsoul|c> False for rect
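
The rule discussed above (set the rect bit when `normalized_coords` is false) could be sketched in C roughly as follows. This is a minimal illustration, not the driver's code: `struct tex_desc_sketch`, `TD_RECT_BIT` and `td_set_rect()` are hypothetical names, and the real lima texture descriptor layout is not reproduced. Only the value 0x80 in `unknown_1_1` and the meaning of `normalized_coords` come from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of the texture descriptor that
 * holds unknown_1_1. The real descriptor has many more fields. */
struct tex_desc_sketch {
   uint32_t unknown_1_1;
};

/* The name is an assumption; the value 0x80 is the one from the log. */
#define TD_RECT_BIT 0x80u

/* Set the rect bit for samplers with unnormalized coordinates (rect
 * textures) and clear it for regular, normalized textures. */
static void
td_set_rect(struct tex_desc_sketch *td, bool normalized_coords)
{
   assert(td);
   if (!normalized_coords)
      td->unknown_1_1 |= TD_RECT_BIT;
   else
      td->unknown_1_1 &= ~TD_RECT_BIT;
}
```

In a driver the `normalized_coords` argument would presumably come from the sampler state, as anarsoul suggests, rather than from the texture target.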

03:31 <MoeIcenowy> okay, now the next target is 3D texture...

03:31 <MoeIcenowy> but I need to learn what it is first

03:31 <anarsoul|c> That's gonna be tough one

03:32 <MoeIcenowy> learning OpenGL by hacking into the driver sounds... ridiculous

03:32 <anarsoul|c> Ha, that's the most fun way

03:34 <anarsoul|c> Btw, mipmapping is broken for linear textures

03:34 <anarsoul|c> Noticeable in q3a

03:34 <anarsoul|c> But it works fine for tiled

03:36 <anarsoul|c> I thought that it's wrong filtering settings, but now I think that we use wrong size for mipmap levels

03:36 <anarsoul|c> Since even with filtering disabled there's no such artifacts as with linear textures

03:37 <MoeIcenowy> BTW is rectangle texture a GL-only feature?

03:38 <anarsoul|c> I think there's extension for gles

03:43 <MoeIcenowy> looks like we implemented a feature that is not supported in blob

03:44 mardikene193 has joined #lima

03:45 <mardikene193> it much appears that sm4.0 first introduced register indirect addressing of temporaries for pixel shader.

03:47 <mardikene193> as 16 texture units and 4texture mapping units are for specialized 4pixel shader hw like r300, than the work is slightly more complex on such hw, adjusting offsets of the clamping unit, i would not know how to optimize vertex shaders since it does not use textures.

03:47 <anarsoul|c> Nope

03:48 <anarsoul|c> Qiang dumped reload from blob

03:48 dddddd has quit [Remote host closed the connection]

03:48 <anarsoul|c> So blob definitely does it, but probably they're not exposing it for some reason?

03:52 <yuq825> mipmap break? I remember I fixed it before and tested with glmark texture mipmap option

03:53 <anarsoul|c> Try q3a

03:53 <anarsoul|c> With opengl1 rendering

03:53 <anarsoul|c> I can show screenshot later tonight

03:54 <mardikene193> this might still be possible in vertex shader though afterall, since they allow constant indexing perhaps for uniforms also

03:54 <yuq825> I didn't try q3a, only glmark. If glmark does not work now, then it must be a regression

03:54 <mardikene193> in other words, instead of texture fetches , uniform fetches can be done

03:55 <mardikene193> and instead of clamping the offsets, one can do indirections

03:59 <anarsoul|c> I'm not sure how to check it with glmark

04:02 <anarsoul|c> I.e. I'm not sure how it should look

04:02 <MoeIcenowy> anarsoul: blob doesn't expose EXT_texture_rectangle

04:04 <MoeIcenowy> oh BTW GLES2 really doesn't have it

04:14 megi has quit [Ping timeout: 245 seconds]

04:17 mardikene193 has quit [Quit: Leaving]

04:25 <anarsoul|c> I see

04:48 <anarsoul|c> broken mipmap https://usercontent.irccloud-cdn.com/file/Jz1GzxwY/IMG_20190925_214811.jpg

04:49 <anarsoul|c> tiled textures, working mipmap https://usercontent.irccloud-cdn.com/file/qhTLizWN/IMG_20190925_214723.jpg

04:49 <anarsoul> yuq825: ^^

04:52 <anarsoul> it looks like one of levels is really off

04:52 <yuq825> glmark2-es2 -b texture:texture-filter=mipmap

04:53 <anarsoul> yuq825: so far q3a is the only way I know to reproduce the issue

04:54 <anarsoul> yuq825: maybe transfer is wrong?

04:55 <yuq825> no idea...

04:56 <anarsoul> see comment in setup_miptree()

04:56 <anarsoul> oh

04:57 <MoeIcenowy> anarsoul: is R the 3rd dimension of a 3D texture?

04:58 <anarsoul> MoeIcenowy: yes

04:58 <MoeIcenowy> how is it applied to a surface?

04:59 <anarsoul> it's 3d array, not 2d array now, so when you sample you have to supply 3 coordinates

04:59 <MoeIcenowy> how do I define the 3rd coordinate?

04:59 <MoeIcenowy> a surface itself is 2D

05:00 <anarsoul> texture is not a surface anymore

05:00 <anarsoul> it's volume :)

05:00 <MoeIcenowy> oh I just need to pass a vec3 in texture3D() ?

05:00 <anarsoul> yes

05:01 <anarsoul> MoeIcenowy: I have a feeling that one of the bits near texture_rect and texture_2d is responsible for enabling texture_3d

05:01 <anarsoul> but there's another issue

05:01 <anarsoul> we don't know format for sampler3D instruction

05:02 <MoeIcenowy> will it get interpolation on the z component of the texture3D function?

05:02 <MoeIcenowy> for example, when z=0 is full black and z=1 is full white, will z=0.5 make a gray?

05:02 <anarsoul> yuq825: setup_miptree() should probably use offset 0x400 and 0x800 for last 2 levels, not only for level 11 and level 12

05:03 <MoeIcenowy> anarsoul: any offline compiler doesn't support sampler3D?

05:03 <anarsoul> nope :(

05:03 <anarsoul> tried that

05:04 <anarsoul> it just crashes if you try to enable extension for 3D textures and use it in the shader

05:04 <MoeIcenowy> BTW I remember offline compiler is available for download for public, right?

05:04 <anarsoul> yes

05:04 <anarsoul> and you can disassemble mbs files with lima now, there's a standalone disassembler tool

05:05 <MoeIcenowy> oops, there's such a big distance between 2d and cube

05:05 <anarsoul> note that cubemap is not 3D texture

05:05 <anarsoul> cubemap is just 6 faces of cube

05:05 <anarsoul> 3D texture is actually 3D texture

05:06 <MoeIcenowy> BTW do we support 1D texture well now?

05:06 <anarsoul> no :)

05:06 <anarsoul> we can emulate it I guess

05:07 <MoeIcenowy> I think we hardcoded 2D somewhere

05:07 <anarsoul> unless you figure out descriptor bit to mark 1D textures

05:07 <anarsoul> (and probably sampler instruction format however I'm not sure here)

05:07 <MoeIcenowy> just before where I set rect

05:08 <MoeIcenowy> anarsoul: however, if 2D sampler instruction is 0x00, is it possible that 1D/2D/3D share the same sampler instruction?

05:08 <anarsoul> try it?

05:08 <anarsoul> I have no idea :)

05:09 <anarsoul> I just accidentally noticed that with reload we didn't set this bit in texture descriptor anywhere

05:09 <anarsoul> and that varyings had non-normalized coordinates

05:10 <MoeIcenowy> BTW if it really supports 3D texture

05:11 <MoeIcenowy> the 3 bits after S/T clamping configuration might be R clamping

05:11 <anarsoul> alyssa suggested that unknown_3_1 can be depth

05:11 <anarsoul> it goes right after height

05:12 Barada has joined #lima

05:17 <anarsoul> MoeIcenowy: btw, can you check whether setting should_tile to true at the beginning of _lima_resource_create_with_modifiers() fixes ppmmu fault for you?

05:58 <anarsoul> MoeIcenowy: btw, you can also try clearing texture_2d bit and check what it does

07:14 adjtm has quit [Ping timeout: 265 seconds]

07:38 <MoeIcenowy> anarsoul: setting should_tile doesn't fix the pp mmu fault

07:40 <MoeIcenowy> anarsoul: shouldn't the depth be 13bit?

07:40 jailbox has quit [Ping timeout: 276 seconds]

07:44 <MoeIcenowy> anarsoul: by unsetting texture_2d, we get texture_1d

07:46 <MoeIcenowy> the y of the texture2D() input seems to be just ignored

07:53 adjtm has joined #lima

07:54 <MoeIcenowy> this may indicate we can get the 3rd dimension?

08:01 jailbox has joined #lima

08:49 hellsenberg has quit [Quit: CPU triple-faulted.]

09:13 hellsenberg has joined #lima

10:04 <MoeIcenowy> 3rd dimension seems to be ignored...

10:15 yuq825 has quit [Quit: Leaving.]

10:39 Da_Coynul has joined #lima

10:42 <MoeIcenowy> anarsoul: I think maybe Mali-400 doesn't support 3D texture...

10:42 <MoeIcenowy> even VC4 doesn't support it

10:42 Da_Coynul has quit [Client Quit]

10:43 <MoeIcenowy> BTW the offline compiler only crashes for Mali-400 on sampler3D, it performs well for Midgard/Bifrost

10:48 dddddd has joined #lima

10:56 <rellla> MoeIcenowy: http://imkreisrum.de/piglit/mali450/0317527..3831bb5-lima-next/fixes.html :)

10:56 <rellla> with cubemap and 2drect

10:56 <MoeIcenowy> rellla: where's the cubemap patchset?

10:57 <rellla> https://gitlab.freedesktop.org/arnomessiaen/mesa/commits/lima-cubemaps

10:57 <rellla> no MR yet

10:57 <rellla> i did a comment on your MR about the word1 bits ...

11:03 <rellla> it's run with that https://gitlab.freedesktop.org/rellla/piglit/commits/gles piglit setup btw, which contains a hack to increase tolerance

11:04 <Tofe> rellla: interesting, as the cubemap is currently needed by the QtWebEngine component

11:13 megi has joined #lima

11:24 jrmuizel has joined #lima

11:29 jrmuizel has quit [Ping timeout: 276 seconds]

11:44 adjtm has quit [Ping timeout: 240 seconds]

12:00 yuq825 has joined #lima

12:35 adjtm has joined #lima

12:36 jrmuizel has joined #lima

12:39 mardikene193 has joined #lima

12:40 <mardikene193> The major difference between solutions that have register indirect addressing and the one that does not is:

12:41 <mardikene193> when clamping to schedule register based duplets or single instructions based of the results of in-line loads

12:42 <mardikene193> you gotta make writebacks and readbacks manually, since scoreboard on SIMD and bypass networks on VLIW obviously no longer can do that

12:42 <mardikene193> this complicates quite a lot of stuff, but still can be done.

12:43 Barada has quit [Remote host closed the connection]

12:44 Barada has joined #lima

12:46 <mardikene193> So actually that such code has not been materialized for Mali, freedreno, or r300 yet is totally understandable; i could not make it back then, no one else could, however after i have studied everything we can still attempt it!

12:47 <mardikene193> this is pretty difficult code, both to explain and make, and hence if you can not explain it to anyone, it is also very difficult to maintain.

13:02 niceplace has quit [Quit: ZNC 1.7.3 - https://znc.in]

13:02 yuq825 has quit [Ping timeout: 240 seconds]

13:02 niceplace has joined #lima

13:08 yuq825 has joined #lima

13:09 deesix has quit [Ping timeout: 240 seconds]

13:10 dddddd has quit [Ping timeout: 245 seconds]

13:10 yuq825 has quit [Client Quit]

13:11 dddddd has joined #lima

13:11 deesix has joined #lima

13:23 Elpaulo has joined #lima

13:36 jrmuizel has quit [Remote host closed the connection]

13:37 jrmuizel has joined #lima

13:40 yuq825 has joined #lima

13:47 jrmuizel has quit [Remote host closed the connection]

13:51 <MoeIcenowy> anarsoul: for the PP MMU fault, I commented out the code in the shader that calculates ratio and multiply to gl_FragColor, then it works

13:59 Barada has quit [Quit: Barada]

14:02 yuq825 has quit [Ping timeout: 246 seconds]

14:03 <mardikene193> The calculation for mali 400mp goes either like this, you have 128/4/4 = 8 bundles, or 128/4=32 bundles depending whether they include the vector length as separate stage

14:03 <mardikene193> now bundles needs to be multiplied with 2 to get threadgroup count

14:04 <mardikene193> since vliw works as with dual scheduler

14:06 <mardikene193> so the arbiter is probably according to the first calculation 64*64 would be too big

14:07 yuq825 has joined #lima

14:11 <mardikene193> 16*16=128 queue entries according to calculations, which makes it equal to 2pixel shaders r300 on only single core, wau

14:11 <mardikene193> something must be wrong, it can not be so powerful

14:15 <mardikene193> yeah well 16*16=128/2 actually is 64*4 is 256

14:15 <mardikene193> two core version should accommodate 512 entries and should be compatible with sm3.0 spec

14:20 <mardikene193> that means mali200 and mali300 utgard gpus can not be sm3.0

14:28 <mardikene193> minor peek at forgotten info... need to look at multi2sim to see how the arbitration is done, was it 8x8 in patches of four instructions fundamentally or 16*16 within queues

14:33 <MoeIcenowy> anarsoul: found the problem of the pp mmu fault

14:33 <MoeIcenowy> the texture list needs to be zeroed

14:33 jrmuizel has joined #lima

14:33 <MoeIcenowy> otherwise old data inside it might be recognized as VA

14:34 <anarsoul> MoeIcenowy: I see, send an MR?

14:34 <MoeIcenowy> oh I'm not sure now

14:37 <anarsoul> I'm not sure whether Mali4x0 supports 3D textures, I guess we'll never know

14:37 <MoeIcenowy> oh I think it's not the reason now...

14:37 <anarsoul> at least we have rect and 1d :)

14:37 <MoeIcenowy> anarsoul: fake 3D textures like vc4?

14:38 <cwabbott> I heard through the grapevine that it does support them, it was just never turned on in the driver

14:39 <cwabbott> who knows what the magic combination of descriptor bits + layout + shader bits is though

14:39 <MoeIcenowy> oh the old data in texture list seems to be not the reason

14:40 <anarsoul> cwabbott: that's unfortunate :(

14:41 yuq825 has quit [Quit: Leaving.]

14:44 <enunes> MoeIcenowy: yeah I mentioned this garbage data in the texture descriptor a couple days ago, it doesn't seem to cause any bug though

14:46 <anarsoul> MoeIcenowy: please do s/abnorm_coords/unnorm_coords and remove braces for a single line block

14:49 <MoeIcenowy> anarsoul: BTW, is v[0] equal to v.x when v is a vec2?

14:49 <anarsoul> yes

15:00 <MoeIcenowy> render->uniforms_address |= ((ctx->buffer_state[lima_ctx_buff_pp_uniform].size) / 4 - 1);

15:00 <MoeIcenowy> what's the use of this?

15:01 <anarsoul> sets size of uniforms?

15:03 <anarsoul> -1 does look suspicious though

15:05 <MoeIcenowy> however... isn't it offsetting the uniforms_address?

15:07 <anarsoul> no

15:07 <anarsoul> lower bits of this register is uniform address

15:07 <anarsoul> likely lower 6 bits

15:07 <anarsoul> everything seems to be aligned to 64 bytes on Mali4x0

15:08 <MoeIcenowy> is uniform count?

15:09 <anarsoul> ?

15:09 <MoeIcenowy> do you mean that the lowest 6 bits are reused for uniform count?

15:10 <anarsoul> yes

15:11 <anarsoul> didn't you notice that they use every single bit pretty much everywhere? :)
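
The packing described above (an aligned address with a small count in its low bits) could be sketched in C as follows. This is an illustration under assumptions: the 64-byte alignment and the 6-bit field are anarsoul's guesses from the log, the `size / 4 - 1` expression mirrors the line MoeIcenowy quoted, and all function and macro names are hypothetical. Later messages question what the count actually measures.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the discussion: buffers are 64-byte aligned, so the
 * low 6 bits of the address are free to carry a small count. */
#define UNIFORM_ADDR_ALIGN 64u
#define UNIFORM_COUNT_MASK (UNIFORM_ADDR_ALIGN - 1u)

/* Combine an aligned GPU virtual address with (size_bytes / 4 - 1),
 * mirroring the expression quoted from the driver. */
static uint32_t
pack_uniforms_address(uint32_t va, uint32_t size_bytes)
{
   assert((va & UNIFORM_COUNT_MASK) == 0); /* address must be aligned */
   assert(size_bytes >= 4);
   uint32_t count_minus_one = size_bytes / 4 - 1;
   assert(count_minus_one <= UNIFORM_COUNT_MASK); /* must fit in 6 bits */
   return va | count_minus_one;
}

/* Recover the two fields again, to show nothing is lost by the OR. */
static uint32_t
unpack_va(uint32_t packed)
{
   return packed & ~UNIFORM_COUNT_MASK;
}

static uint32_t
unpack_count(uint32_t packed)
{
   return (packed & UNIFORM_COUNT_MASK) + 1u;
}
```

Because the address is aligned, the OR does not offset it: the hardware would presumably mask the low bits off before using the value as an address.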

15:12 <MoeIcenowy> BTW the code seems to be directly sourced from limare

15:12 <MoeIcenowy> render->uniforms_address |=(ALIGN(plbu->uniform_size, 4) / 4) - 1;

15:12 <MoeIcenowy> this is what is in limare

15:14 <anarsoul> most of command stream generation was taken there

15:15 <MoeIcenowy> anarsoul: the whole pp uniform code looks strange

15:15 <anarsoul> why?

15:15 <MoeIcenowy> why is there a one-item array?

15:15 <MoeIcenowy> the array dumped with "add pp uniform info at va XXXXXXXX" is always only one item, containing the pointer to the real uniform array

15:15 <anarsoul> right

15:15 <anarsoul> ask ARM

15:16 <anarsoul> :)

15:16 <anarsoul> I'm not sure why they need double indirection for uniforms

15:16 <MoeIcenowy> but... as we have no size info for real uniform array length

15:17 <MoeIcenowy> maybe the lowest 6 bit of uniform address should be the length of the uniform address array (which is 1) ?

15:18 <anarsoul> MoeIcenowy: try setting it to 1?

15:19 <MoeIcenowy> ah, 0, because it's subtracted with 1
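
The experiment discussed here, treating the low bits as the length of the one-item address list rather than the size of the uniform data, could be modelled in C like this. It is only a sketch of the hypothesis being tested: `struct pp_uniform_layout` and `encode_list_register()` are hypothetical names, the 64-byte alignment is the assumption made earlier in the log, and nothing here confirms what the hardware expects.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the double indirection described in the log:
 * the register points at a short list, and the list's single entry
 * points at the real uniform data. */
struct pp_uniform_layout {
   uint32_t uniform_array_va; /* VA of the real uniform data */
   uint32_t address_list[1];  /* one-item list holding that VA */
   uint32_t address_list_va;  /* VA where the list itself lives */
};

/* Encoding under test: low bits = number of list entries minus one.
 * With a single entry the low bits are therefore 0. */
static uint32_t
encode_list_register(const struct pp_uniform_layout *layout, uint32_t entries)
{
   assert(layout);
   assert(entries >= 1 && entries <= 64);
   assert((layout->address_list_va & 63u) == 0); /* assumed alignment */
   return layout->address_list_va | (entries - 1u);
}
```

With one entry the register value equals the plain list address, which matches the "0, because it's subtracted with 1" remark.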

15:21 <MoeIcenowy> looks like the pp mmu fault is solved, however I don't know whether it affects the render result...

15:22 <mardikene193> yeah i remember now, it was sort of asymmetrical arbiter, 80x32 or 64x32 depending on width of bundle

15:26 <anarsoul> OK, so we know that something's wrong with uniform setup and ppmmu fault was coming from uniform read

15:26 <anarsoul> try making a test with a lot of uniforms, run it on blob and capture what it does?

15:28 <MoeIcenowy> I cannot capture the behavior of the blob now

15:35 mardikene193 has quit [Quit: Leaving]

15:45 <MoeIcenowy> enunes: could you run a piglit test on https://gitlab.freedesktop.org/icenowy/mesa/commits/lima-uniforms-address-size

15:45 <MoeIcenowy> and compare with mesa master to check regressions?

15:45 <MoeIcenowy> thanks

15:50 megi has quit [Ping timeout: 265 seconds]

15:58 deesix has quit [Ping timeout: 245 seconds]

15:59 dddddd has quit [Ping timeout: 245 seconds]

16:00 deesix has joined #lima

16:00 dddddd has joined #lima

16:18 <MoeIcenowy> anarsoul: could I add your R-b after changing these?

16:22 dddddd has quit [Ping timeout: 276 seconds]

16:22 deesix has quit [Ping timeout: 265 seconds]

16:23 <anarsoul> yes

16:23 deesix has joined #lima

16:33 megi has joined #lima

16:34 dddddd has joined #lima

16:34 adjtm has quit [Ping timeout: 240 seconds]

16:36 <MoeIcenowy> strange... some change in mesa makes the title bar of GNOME control center colorful

16:44 <MoeIcenowy> looks like some glamor-related issue again...

16:48 <anarsoul> :(

16:53 <anarsoul> MoeIcenowy: what about wayland session?

16:53 <anarsoul> does it work?

16:53 <anarsoul> it should have better performance for sure than X11

16:53 <MoeIcenowy> yes, wayland works here

16:54 <anarsoul> OK, so we have issue with PP uniforms

16:54 <MoeIcenowy> BTW faking 3D texture is still necessary for GNOME Shell

16:54 <anarsoul> fortunately it's easy enough to dump

16:54 <MoeIcenowy> although they seem to never really use it in shaders

16:55 <MoeIcenowy> and according to apitrace, it's still some 1x1 strange placeholder

16:55 <MoeIcenowy> what the hell is GNOME doing...

16:55 <anarsoul> MoeIcenowy: if you have a test for sampler3D, try clearing texture_2d in descriptor and setting 1 bit at time in unknown_1_2 and unknown_1_3

16:55 <MoeIcenowy> I tried it

16:56 <anarsoul> also let's assume that unknown_3_1 and part of unknown_3_2 is depth

16:56 <anarsoul> doesn't work?

16:56 <MoeIcenowy> yes, I assumed 13 bits

16:56 <MoeIcenowy> doesn't work.

16:56 <anarsoul> :(

16:56 <MoeIcenowy> ooooops

16:56 <anarsoul> it was such a nice hypothesis

16:56 <MoeIcenowy> I may have forgotten to set depth

16:57 <MoeIcenowy> however I killed all the test code by a `git reset --hard`

16:57 <anarsoul> I always keep experiments like these in separate branches

16:57 <anarsoul> branches are cheap in git, so why not

17:00 <anarsoul> MoeIcenowy: btw if you committed the code it's still there

17:00 <MoeIcenowy> I didn't commit it

17:00 <anarsoul> you'll just need to do some git archeology to extract it

17:00 <MoeIcenowy> I think reflog is enough

17:00 dddddd has quit [Ping timeout: 240 seconds]

17:01 deesix has quit [Ping timeout: 265 seconds]

17:02 <MoeIcenowy> anarsoul: interesting

17:02 <MoeIcenowy> after setting depth

17:02 <MoeIcenowy> things changed

17:02 adjtm has joined #lima

17:03 <anarsoul> so sampler3D works?

17:03 deesix has joined #lima

17:03 <anarsoul> I guess you can use texture with depth 2, and just use one image for 1st layer and another for 2nd

17:03 <MoeIcenowy> I used a 1x1 depth 3 texture

17:04 <MoeIcenowy> layer 1 is red, 2 is green and 3 is blue
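
Test data like the 1x1 depth-3 texture described above could be built in C as follows. This is a sketch under assumptions: the RGBA8 format, the tightly packed layer-after-layer layout, and every name in the block are illustrative and not taken from MoeIcenowy's test program.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { TEX_W = 1, TEX_H = 1, TEX_D = 3, TEX_BPP = 4 };

/* Fill a tightly packed RGBA8 volume with one solid color per layer:
 * red, then green, then blue. */
static void
fill_test_volume(uint8_t texels[TEX_D * TEX_H * TEX_W * TEX_BPP])
{
   static const uint8_t layer_colors[TEX_D][TEX_BPP] = {
      { 255, 0, 0, 255 }, /* first layer: red */
      { 0, 255, 0, 255 }, /* second layer: green */
      { 0, 0, 255, 255 }, /* third layer: blue */
   };

   for (int z = 0; z < TEX_D; z++)
      for (int y = 0; y < TEX_H; y++)
         for (int x = 0; x < TEX_W; x++)
            memcpy(&texels[((z * TEX_H + y) * TEX_W + x) * TEX_BPP],
                   layer_colors[z], TEX_BPP);
}

/* Normalized r coordinate of the center of a layer, which is where a
 * sampler should return that layer's color unblended. */
static float
layer_center_r(int layer)
{
   assert(layer >= 0 && layer < TEX_D);
   return (layer + 0.5f) / TEX_D;
}
```

Sampling at the layer centers and comparing against a reference implementation gives a quick check of whether the r coordinate is honored at all.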

17:04 <anarsoul> that'd also work

17:04 <MoeIcenowy> however the result is still not the same as the one on my PC

17:05 <MoeIcenowy> maybe texture descriptor needs tweak

17:05 <anarsoul> try setting different bits for marking it as 3d

17:05 <anarsoul> and try it with texture_2d set and cleared

17:08 <MoeIcenowy> anarsoul: can texture3D be mipmapped?

17:13 dddddd has joined #lima

17:20 <MoeIcenowy> anarsoul: now the problem is that we totally don't know how the remaining layers should be placed

17:25 <MoeIcenowy> at least now the r coordinate starts to be honored

17:37 gaulishcoin has joined #lima

17:44 mardikene193 has joined #lima

17:45 <mardikene193> OK so let's reach to the point, what is needed for you is to chill out a bit, and start thinking, this will be something that you would not regret.

17:46 <mardikene193> you know technically there are not many options to call 32warps on 16bundles which include 4-5 words...

17:47 <mardikene193> at least as the final result should be a constant output which is similar to any of the methods

17:48 <mardikene193> but instead of placing a single instruction into the queue, it reorders and into the single instruction spot it adds 4-5 and fetches also 4-5

17:48 <mardikene193> from queues

17:53 <anarsoul> MoeIcenowy: sure it can :)

17:53 <anarsoul> for remaining layers you'll have to experiment

17:53 <anarsoul> and yeah, we need to implement explicit LOD to debug mipmapping

17:54 <anarsoul> MoeIcenowy: what bit enabled 3d textures?

17:54 <anarsoul> please commit it and push it somewhere so your work isn't lost

17:54 <anarsoul> MoeIcenowy: btw you also need to consider tiling :)

17:55 <anarsoul> however I'd expect it just to work if you align it to 16x16 (and probably one more x16?) boundaries
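
The alignment idea above could be sketched in C as follows. The 16x16 tile size is the figure from anarsoul's message; whether depth also needs padding to 16 is exactly the open question in the log, so the sketch leaves it as a switch. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TILE_DIM 16u /* tile edge length mentioned in the log */

/* Round value up to the next multiple of a power-of-two alignment. */
static uint32_t
align_up(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1u)) == 0);
   return (value + alignment - 1u) & ~(alignment - 1u);
}

struct padded_size {
   uint32_t width, height, depth;
};

/* Pad width and height to tile boundaries. Padding depth as well is
 * only a guess ("probably one more x16?"), hence the flag. */
static struct padded_size
pad_for_tiling(uint32_t width, uint32_t height, uint32_t depth,
               bool align_depth)
{
   struct padded_size out;
   out.width = align_up(width, TILE_DIM);
   out.height = align_up(height, TILE_DIM);
   out.depth = align_depth ? align_up(depth, TILE_DIM) : depth;
   return out;
}
```

For the 1x1 depth-3 test texture this gives 16x16x3 without depth padding and 16x16x16 with it, so the two layouts are easy to tell apart when experimenting.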

17:58 <mardikene193> hmm, well sorta maybe yeah. Actually, tiling is calculated based of the cacheline size

17:59 <mardikene193> it also requires the memory to be physically contiguous

17:59 <anarsoul> MoeIcenowy: so after all it's the same instruction for sampler for 1d, 2d and 3d cases?

17:59 <mardikene193> For some reason you have chosen the most complex subject to work on, it requires the most changes too

18:01 <mardikene193> I thought i wanted to deal more like with other things, but this is inherently big allocators property relying issue

18:02 <mardikene193> this is very complex because, the more complex the allocator goes, the more delay it adds on CPU

18:02 armessia has joined #lima

18:04 <mardikene193> hence on the first subject i talked about, the queues are the easiest to be implemented as 32*32 arbiter, which quits earlier when program ends that is

18:05 <mardikene193> it quits then when program ends, rather logically more correct way to put it

18:08 drod has joined #lima

18:09 niceplace has quit [Quit: ZNC 1.7.3 - https://znc.in]

18:09 niceplace has joined #lima

18:25 <mardikene193> technically it starts to (And this is my opinion) restart the instruction queue sooner, but ends when the program ends, i.e it starts to replace queue entries sooner, and very technically this is achieved as:

18:26 <mardikene193> in verilog as in the port connection was bigger size of the vector

18:26 <mardikene193> than that of its nested instance

18:29 <mardikene193> so it should ultimately quit in 24th instance and start to wrap around

18:35 <mardikene193> i do not think i myself understand that kind of arbiter :D:D

18:35 drod has quit [Read error: Connection reset by peer]

18:52 drod has joined #lima

18:54 <rellla> armessia: imho it'd be worth to make a MR with the cubemaps branch... at least a WIP one...

18:57 <armessia> rellla: I'm still working on it, the different faces aren't properly aligned yet in all cases

18:57 <armessia> rellla: but making a WIP MR already won't hurt indeed, in this way some more feedback can come in early

18:58 <armessia> rellla: will create a WIP MR soon

18:58 <rellla> now that rect textures have landed and 3d is probably worked on ;)

18:58 <rellla> fine

19:01 <mardikene193> Latest Midgard GPU is T880. ▫ Maximum of 16 shader cores. ▫ Tile size 16x16 (4x4-32x32 internally).

19:02 <mardikene193> I absolutely do not understand what i am looking at.

19:02 <mardikene193> does that mean it is configurable from 4x4-32x32

19:02 <mardikene193> are those threads?

19:05 <armessia> rellla: things are moving fast on the texturing side lately :-)

19:21 <mardikene193> i remember reading some talonmies and stackoverflow articles, Mark Harris's, and also this Robert Crovella's from nvidia.

19:22 <mardikene193> They claimed something that on such gpus like nvidia ones, when CU registers get used, it should raise an interrupt

19:54 <mardikene193> more likely it is that, different contexts can be put to different compute units

20:36 <anarsoul> armessia: just some occasional discoveries :)

20:36 <anarsoul> blob exposes rect textures only for reload

20:36 <anarsoul> and it doesn't expose 3D textures at all, so I'm not convinced that we can make it work yet. MoeIcenowy is working on it though

20:40 <anarsoul> rellla: enunes: can you review https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2080 ?

20:41 <anarsoul> that's fairly simple one, it just replaces "vec4 ssa1 = load_input(varying); vec2 ssa2 = ssa1.xy; vec2 ssa3 = ssa1.zw" with "vec2 = load_input(varying.xy); vec2 = load_input(varying.zw)"

20:45 drod has quit [Ping timeout: 265 seconds]

20:58 drod has joined #lima

21:01 <armessia> anarsoul: some nice achievements for sure!

21:01 <armessia> we already have one feature more than the blob with rect textures, can as well have two :-)

21:50 jrmuizel has quit [Remote host closed the connection]

22:08 jrmuizel has joined #lima

22:14 jrmuizel has quit [Remote host closed the connection]

22:20 gaulishcoin has quit [Ping timeout: 240 seconds]

22:31 drod has quit [Remote host closed the connection]

22:39 Da_Coynul has joined #lima

22:45 Da_Coynul has quit [Quit: My MacBook Air has gone to sleep. ZZZzzz…]

22:46 Da_Coynul has joined #lima

23:22 <mardikene193> yeah sure, well wrapping around the 32x32 arbiter properly is pretty easy, totally unsure when and how it starts to use multiple COREs