jcureton has quit [Remote host closed the connection]
davidlt has joined #panfrost
cwabbott has quit [Quit: cwabbott]
afaerber has quit [Quit: Leaving]
cwabbott has joined #panfrost
buzzmarshall has quit [Quit: Leaving]
buzzmarshall has joined #panfrost
buzzmarshall has quit [Quit: Leaving]
buzzmarshall has joined #panfrost
tlwoerner has quit [Quit: Leaving]
yann has joined #panfrost
jcureton has joined #panfrost
<tomeu>
robher: what's the plan regarding the BO cache in userspace and madvise?
<tomeu>
ah, guess a BO could be marked as MADV_DONTNEED when placed in the cache, and MADV_NORMAL when taken out of it
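A rough sketch of the idea tomeu describes, as a pure userspace simulation. Nothing here calls a real kernel interface: `Madv`, `BO`, `BOCache` and `simulate_memory_pressure` are hypothetical names invented for illustration, and `WILLNEED` stands in for what the message calls MADV_NORMAL. The assumption is that a BO marked DONTNEED may lose its backing pages while idle, so the cache has to check whether the contents were retained when it takes the BO back out.

```python
from enum import Enum


class Madv(Enum):
    WILLNEED = 0  # pages must be kept (the "MADV_NORMAL" state in the chat)
    DONTNEED = 1  # pages may be reclaimed under memory pressure


class BO:
    """Hypothetical buffer object; `retained` turns False once purged."""

    def __init__(self, size):
        self.size = size
        self.madv = Madv.WILLNEED
        self.retained = True


class BOCache:
    """Hypothetical userspace cache of idle BOs, bucketed by size."""

    def __init__(self):
        self.free = {}

    def put(self, bo):
        # Entering the cache: tell the (simulated) kernel the pages are optional.
        bo.madv = Madv.DONTNEED
        self.free.setdefault(bo.size, []).append(bo)

    def get(self, size):
        bucket = self.free.get(size, [])
        while bucket:
            bo = bucket.pop()
            # Leaving the cache: pages are needed again.
            bo.madv = Madv.WILLNEED
            if bo.retained:
                return bo
            # Purged while idle: drop it and try the next candidate.
        return BO(size)


def simulate_memory_pressure(cache):
    """Stand-in for the kernel reclaiming every DONTNEED buffer."""
    for bucket in cache.free.values():
        for bo in bucket:
            if bo.madv is Madv.DONTNEED:
                bo.retained = False
```

A cache hit reuses the same object; after simulated reclaim, `get()` falls back to a fresh allocation instead of handing out a purged buffer.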
<alyssa>
wens: Is it..? I thought the issue was us exposing OES_standard_derivatives (an extension) incorrectly
<alyssa>
Or, wait, no, I'm confused with index_bias, mea culpa
<alyssa>
chewitt: Could you try to figure out how to start weston on your board? Having troubles over SSH
pH5 has quit [Quit: bye]
yann has quit [Ping timeout: 268 seconds]
<alyssa>
Never mind -- figured it out :)
<jcureton>
alyssa: i'm pulling Mesa up past 5a7688fdecd7 ("panfrost: Use 64-bit descriptors globally"), and i'm seeing the same thing on two T720 platforms that rtp saw on T628 last week. kmscube gives a grey background but no cube. this is true on an Allwinner H6 (aarch64+T720) as well as an armv7l+T720 SoC I can't disclose. the armv7l+T720 works well prior
<jcureton>
to the 64-bit descriptor change.
<jcureton>
any pointers on where to look? there's a lot of internal dependencies in that patch and all of the manual bit poking appears to only happen if ctx->is_t6xx
<EmilKarlson>
are you people using linux-5.3-rc*, did you notice slowdown in non-panfrost graphics
<EmilKarlson>
whatever designware controller
<alyssa>
EmilKarlson: We've received reports that Panfrost gfx is slower in 5.3 than 5.2, but we were unable to reproduce this. If this applies to non-Panfrost as well, that'd be good to know..
<jcureton>
i do have the issue on 32 and 64 bit. i'm actually building right now with something comparable to that and will report back!
<EmilKarlson>
alyssa: any tips on how one could profile this?
<alyssa>
EmilKarlson: Oh, I'm not great with profilers..
<alyssa>
jcureton: So, T720 support is still new to the tree, to preface this
<EmilKarlson>
other option would be bisect I guess, but everything is a bit harder, when you don't have a benchmark with numbers
<alyssa>
EmilKarlson: Try glmark2-es2-drm as a benchmark, and if that doesn't repro, glmark2-es2-wayland under weston
<EmilKarlson>
does it specifically measure the kms part?
<EmilKarlson>
glmark sounds a bit 3d'y
<alyssa>
jcureton: It's possible some magic bits marked as T6xx are also needed on T720, hence why rtp was able to get it working on T6xx
<alyssa>
EmilKarlson: Oh, no, glmark is specifically 3D..
<alyssa>
jcureton: rtp sent in 397f9ba69fcaef17de5c8f639957743890fa7805 to fix T6xx after that commit, not sure if you cherry-picked that into your tree / if it's active (is_t6xx may not be set on T720, depending on whether you have 83a1d5544a78b6f741523aa1689ab0c0941d549b in your tree, as rtp linked)
<jcureton>
alyssa: yeah, reverting 83a1d5544a78 where you change is_t6xx fixes 32-bit
<jcureton>
testing on 64 now
calcprogrammer1 has joined #panfrost
<alyssa>
Alright, that's good to know
<jcureton>
also fixes 64
<alyssa>
jcureton:
<alyssa>
*curses tab complete, didn't mean to ping*
<alyssa>
Hm, so we have a few options then
<alyssa>
1) Change is_t6xx to is_t72x_or_earlier (or something) and be satisfied with the pile of hacks
<alyssa>
2) Figure out specifically which is_t6xx-only changes are needed on T720 (it may be all or just a subset)
<alyssa>
3) Something else?
<jcureton>
i can experiment on (2) and see which ones appear to be explicitly needed on T720
<alyssa>
Go for it! :)
raster has quit [Read error: Connection reset by peer]
<jcureton>
alyssa: just realized that the only magic bit that needs set is the one Arnaud fixed in 397f9ba69fca.. should have been obvious at the outset since that's what I built with :)
<rtp>
I'm not sure that the is_t6xx is used somewhere else nowadays
<alyssa>
jcureton: \o/
<alyssa>
rtp: I suppose it isn't. Huh, right, a bunch of old is_t6xx properties turned out to be 32-bit hacks
<alyssa>
So the 64-bit descriptors removed all that
<alyssa>
I'd love to know what rtp's magic bit does
<rtp>
I'd like to know that too :)
<alyssa>
Maybe a chicken bit for some errata...?
<alyssa>
All in due time, I suppose! :)
<alyssa>
tomeu: daniels: Re your questions about SSBOs, they're handled identically to __global buffers in OpenCL
<alyssa>
So all the CL-related RE applies directly here
<alyssa>
So conceptually, allocate the SSBO as just any old BO
<alyssa>
Upload the GPU side address as a uniform (sysval) and pass it to the shader
<alyssa>
In the shader, use generic load/store ops with a direct address and move in that address from the uniform (with uint64 ops)
<alyssa>
Stores to SSBOs are slightly more complex than the CL case
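The three steps alyssa lists (plain BO, GPU address uploaded as a uniform, generic load/store from a 64-bit address) can be modelled with a toy address space. This is only an illustration: `GPUMemory`, `upload_sysval` and `shader_increment` are made-up names, the base address is arbitrary, and splitting the address into two 32-bit uniform words is an assumption about the layout, not something stated in the chat. The extra complexity of real SSBO stores mentioned above is not modelled.

```python
import struct


class GPUMemory:
    """Toy flat GPU address space; BOs are carved out of one bytearray."""

    def __init__(self, size, base=0x2_1000_0000):
        self.base = base
        self.data = bytearray(size)
        self.next = 0

    def alloc(self, size):
        # The SSBO is "just any old BO": return its GPU-side address.
        addr = self.base + self.next
        self.next += size
        assert self.next <= len(self.data), "out of toy GPU memory"
        return addr

    def load_u32(self, addr):
        return struct.unpack_from("<I", self.data, addr - self.base)[0]

    def store_u32(self, addr, value):
        struct.pack_into("<I", self.data, addr - self.base, value & 0xFFFFFFFF)


def upload_sysval(addr):
    """Pack the 64-bit address as (lo, hi) uniform words (assumed layout)."""
    return (addr & 0xFFFFFFFF, addr >> 32)


def shader_increment(mem, uniforms, index):
    """Model of a shader doing `ssbo[index]++` via generic load/store."""
    lo, hi = uniforms
    base = (hi << 32) | lo      # move the address in from the uniform
    addr = base + 4 * index     # uint64 address arithmetic
    old = mem.load_u32(addr)    # generic load with a direct address
    mem.store_u32(addr, old + 1)
    return old
```

For example, allocating a 16-byte buffer, writing 41 at element 2 and running `shader_increment(mem, upload_sysval(ssbo), 2)` leaves 42 in that slot.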
<alyssa>
HdkR: The test in question is doing derivatives in a loop.
<HdkR>
Fun :D
<alyssa>
The skip/kill flags Connor identified (cont/last in Panfrost)
<alyssa>
If we set the kill flag (so cyberpunk), helper thread goes poof
<alyssa>
So derivatives are wrong for iterations > 1
<HdkR>
Makes sense
<calcprogrammer1>
I tried to run OpenJK (open source Jedi Academy, Quake 3 based engine) and I can move around in spectate mode in an empty map but as soon as a player/bot joins it crashes and spams PIPE_FORMAT_R16G16B16A16_UNORM. Just wanted to let you know if you haven't tested this game, was just experimenting with the latest build of Mesa.
<HdkR>
Time to take control flow into account while determining if kill should be used :P
<alyssa>
calcprogrammer1: Hm, I haven't tested that game
<alyssa>
RGBA16_UNORM should theoretically be supported.. wonder what it's doing with the format (vertex? texture? render target?)
<alyssa>
calcprogrammer1: Could you grab a backtrace, please? :)
<calcprogrammer1>
mind walking me through how to do that?
<alyssa>
$ gdb [program]
<alyssa>
r
<calcprogrammer1>
going to try openarena as well since it's a similar engine
<alyssa>
(do whatever to make it crash)
<alyssa>
bt
<alyssa>
(Paste output)
<calcprogrammer1>
ok
<HdkR>
Looks like it is using texrect + rgba16
<alyssa>
HdkR: Added a dumb heuristic of "always keep helpers alive if loops are used", problem solved .. :P
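The heuristic alyssa mentions could look roughly like the sketch below. `TexOp` and `assign_kill_flags` are hypothetical, not Mesa code, and the straight-line rule (set the kill flag only on the final op that needs helper threads) is an assumption about how the flag would otherwise be placed. The loop rule is the one from the chat: if the shader has any loop, never kill helpers.

```python
from dataclasses import dataclass


@dataclass
class TexOp:
    """Hypothetical texture/derivative op; `kill` models the kill ("last") flag."""
    name: str
    needs_helpers: bool = True
    kill: bool = False


def assign_kill_flags(ops, has_loops):
    """Decide which ops may terminate helper threads.

    With loops, a later iteration may still need helpers for derivatives,
    so the heuristic keeps them alive everywhere.
    """
    for op in ops:
        op.kill = False
    if has_loops:
        return ops
    helper_ops = [op for op in ops if op.needs_helpers]
    if helper_ops:
        # Assumed rule for straight-line code: only the last user kills.
        helper_ops[-1].kill = True
    return ops
```

It is deliberately conservative: a shader whose loop contains no derivative at all still keeps its helpers alive, which is the cost HdkR's follow-up question points at.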
<HdkR>
alyssa: pfft
<HdkR>
How does that behave in a world with SSBOs and image stores? :P
<alyssa>
HdkR: As I discovered this morning, SSBO stores are wrapped in "if(!gl_HelperInvocation) { ... }"
<alyssa>
I would assume image stores work the same way
<HdkR>
Ah, you wrap stores in a helper invocation check yourself?
<alyssa>
HdkR: Well, I haven't implemented SSBOs or images yet, but yes, that's required on mdg
<HdkR>
:D
<alyssa>
Oh, and then there's this weirdo unknown4 flag
<alyssa>
"unknown4 = 0x1" when the results of texturing are fed back into derivatives, I guess
<alyssa>
(Set on the derivative, I mean)
<HdkR>
flag on texture op?
<alyssa>
Yeah
<alyssa>
(Derivatives are special texture ops)
<HdkR>
Does behaviour change when it is set to zero?