alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
nlhowell has quit [Ping timeout: 246 seconds]
nlhowell has joined #panfrost
macc24 has quit [Ping timeout: 265 seconds]
stikonas has quit [Ping timeout: 265 seconds]
macc24 has joined #panfrost
mixfix41_ has joined #panfrost
vstehle has quit [Ping timeout: 246 seconds]
NeuroScr has quit [Quit: NeuroScr]
nerdboy has quit [Ping timeout: 256 seconds]
davidlt has joined #panfrost
robink has quit [Ping timeout: 265 seconds]
robink has joined #panfrost
buzzmarshall has quit [Remote host closed the connection]
vstehle has joined #panfrost
robert_ancell has quit [Ping timeout: 272 seconds]
macc24 has quit [Ping timeout: 265 seconds]
icecream95 has joined #panfrost
kaspter has quit [Quit: kaspter]
davidlt has quit [Remote host closed the connection]
mixfix41_ has quit [Ping timeout: 256 seconds]
davidlt has joined #panfrost
raster has joined #panfrost
cwabbott has joined #panfrost
cwabbott has quit [Quit: cwabbott]
cwabbott has joined #panfrost
cwabbott has quit [Quit: cwabbott]
cwabbott has joined #panfrost
kaspter has joined #panfrost
stikonas has joined #panfrost
stikonas_ has joined #panfrost
stikonas has quit [Ping timeout: 240 seconds]
<robmur01> HdkR: FWIW we've got Radeons in the eMAG and TX2 desktops at work, although that's not overly helpful just now :)
<robmur01> (I don't have the patience for remote desktop shenanigans)
<robmur01> mmind00: since the switch to generic OPP code we fail to actually change the regulator voltage ever, so for boards with kernel-controlled regulators it depends on how close the initial default is to that of the max OPP as to how wonky things get - Chromebooks seem worse off than most
<mmind00> robmur01: so it's a matter of devfreq acting up ... I don't remember seeing cpufreq-related reports though
<robmur01> there have been at least 3 attempts to fix it, but they all seem to get stuck in the mess of clock/regulator/OPP/devfreq optionality
<robmur01> why would panfrost driver changes affect cpufreq? :P
<mmind00> :-P ... I only read tidbits here yesterday, so was wondering whether it was an OPP problem
<mmind00> [aka the TL;DR I deduced from the backlog yesterday was "panfrost broken on Kevin due to frequency scaling" ;-) ]
<robmur01> nope, it's a "panfrost fails to attach regulators to its own OPP table" problem ;)
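[For context on the bug robmur01 describes: the OPP core only scales the regulator together with the clock if the driver registers its supply before adding the OPP table. A minimal sketch of the missing step, using the kernel's dev_pm_opp_set_regulators()/dev_pm_opp_of_add_table() helpers as they exist in kernels of this era; the error handling is simplified and this is not the actual panfrost patch:]

    #include <linux/pm_opp.h>

    /* Register the GPU's "mali" supply with the OPP core *before* adding
     * the OPP table, so dev_pm_opp_set_rate() adjusts regulator voltage
     * together with the clock. Skipping this step leaves the regulator
     * at its boot-time default, which is the wonkiness described above.
     */
    static int panfrost_opp_init_sketch(struct device *dev)
    {
            static const char * const reg_names[] = { "mali" };
            struct opp_table *opp_table;
            int err;

            opp_table = dev_pm_opp_set_regulators(dev, reg_names,
                                                   ARRAY_SIZE(reg_names));
            if (IS_ERR(opp_table))
                    return PTR_ERR(opp_table);

            err = dev_pm_opp_of_add_table(dev);
            if (err) {
                    dev_pm_opp_put_regulators(opp_table);
                    return err;
            }

            return 0;
    }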
macc24 has joined #panfrost
Elpaulo has quit [Quit: Elpaulo]
<shadeslayer> austriancoder: did you manage to get a trace visualized ?
<shadeslayer> austriancoder: https://ui.perfetto.dev/#!/ has a sample chrome/android trace, but if you want something more specific to Panfrost, here you go https://people.collabora.com/~shadeslayer/trace.protobuf
<shadeslayer> It's a little outdated though :)
<icecream95> shadeslayer: I have uploaded a Perfetto trace of glmark2-es2 at https://gitlab.freedesktop.org/snippets/1023
<shadeslayer> icecream95: amazing :)
<daniels> icecream95: thanks! is it useful to you at all?
<HdkR> robmur01: Dang, no Xavier to get ARMv8.1 I guess?
<robmur01> HdkR: pff, once N1SDP boards start turning up in numbers to replace the Junos it'll be v8.2 all the way... and then we wait and hope for Altra (and possibly KunPeng) :D
<HdkR> haha sure, I'm not saying Xavier is a good choice :P
<HdkR> Is the N1SDP board even something that will be available to purchase?
<robmur01> On the more affordable side, I believe Macchiatobins are a popular "stick a GPU card in it" board
<icecream95> daniels: I have been too busy hacking the Midgard instruction scheduler to use it much so far...
<HdkR> Dang, only A72 on those though
<daniels> icecream95: heh, that's cool :) what are you doing in the scheduler ooi?
<HdkR> Alternatively, just need Panfrost to support GL 3.3 :p
<icecream95> HdkR: MESA_GL_VERSION_OVERRIDE=3.3 MESA_GLSL_VERSION_OVERRIDE=330 PAN_MESA_DEBUG=gles3
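[For anyone replaying this later: those are standard Mesa environment-variable overrides, so they simply prefix the client's command line, e.g. (glmark2 chosen purely as an example client):]

    MESA_GL_VERSION_OVERRIDE=3.3 MESA_GLSL_VERSION_OVERRIDE=330 PAN_MESA_DEBUG=gles3 glmark2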
<HdkR> er, Bifrost GL 3.3 for SoCs that support ARMv8.1*
<icecream95> HdkR: I'm sure you could very carefully remove the Bifrost GPU and glue in a Midgard one and everything would still work. :P
<HdkR> Those atomics are too good to live without :)
<robmur01> N1SDP> dunno - as far as I'm aware the original intent was a very-limited-scope CCIX demonstration platform, but I've since heard mumblings that there *might* be some shift to productise it at some point
<HdkR> dang
<robmur01> I wouldn't worry - if you don't care about CCIX then it's basically just 4 2-and-a-bit GHz cores plus a handful of PCIe lanes for ~$n000 ;)
<HdkR> hm
<robmur01> if you want a cheap v8.2 platform right now, consider bashing your head against S905X3
<HdkR> I have a couple of the ODROID-C4 boards, just can't use that for sticking a dGPU on it :P
<HdkR> And A55 isn't a great perf target...
nlhowell has quit [Ping timeout: 256 seconds]
<HdkR> Really I'm probably going to be waiting for an Nvidia Orin dev board, which is sad to say
<robmur01> yeah, Cortex-A55 + usable PCIe is probably an unlikely combination, except perhaps for high-core-count networking stuff
<HdkR> Especially with Orin being slated for 2022 and being Nvidia GPU. So no Bifrost/Valhall fun :P
<daniels> icecream95: heh! that's neat!
<HdkR> Looks like my best bet over the next few months is buying another Xavier and cringing at performance numbers though
* HdkR stacks JITs
<daniels> you can take the boy out of NVIDIA ...
nlhowell has joined #panfrost
<HdkR> haha
<HdkR> Sadly nobody makes Exynos devboards anymore, which would have been fun targets :)
<HdkR> I guess I just have unreasonable performance desires
<daniels> not interested in Snapdragon for perf?
<robmur01> what is the "performance" you speak of? This is 2020, where 'hello world' is 300MB of packaged standalone JavaScript environment...
<daniels> you can actually get pretty reasonably-priced devboards for those now
<HdkR> I'd like the Snapdragon 865 dev board if it was a reasonable Linux target :P
<daniels> we gave up on Exynos long before they gave up on actually selling the silicon to anyone else - a few iterations of us fixing Exynos for mainline in one kernel release, then Samsung breaking it for everyone apart from Tizen in the very next release, was pretty demoralising
<HdkR> yea, I saw that over the years. Such a pain
<robmur01> FWIW, I can vouch for the performance of SDM835 running x86 Thunderbird being somewhat less than "reasonable" (cue stall for ~10s in the middle of typing this...)
<daniels> HdkR: not to try to talk you out of Panfrost or anything, but :P there are patches out there atm for 865 display & GPU
<HdkR> My Snapdragon 850 device destroys my Snapdragon 8cx device in unit test run time. But that's just because of WSL being terrible on the 8cx and running real linux on the 850 device
<HdkR> daniels: Oh yea, I saw that! Going to be a good time soon there
<HdkR> I've already confirmed that Freedreno runs fine in an x86-64 environment; going to need to ensure Panfrost userspace also works in the same environment at some point :)
nlhowell has quit [Ping timeout: 256 seconds]
<HdkR> (Just need to test radv/radeonsi as well)
<robmur01> Just need to track down one of these... (according to wikipedia) https://ark.intel.com/content/www/us/en/ark/products/codename/80013/sofia-lte.html
<HdkR> no, gods no
<robmur01> I shudder to think what the integration of T720 into "everything is PCI" world looks like
nlhowell has joined #panfrost
<HdkR> Main thing is testing the kernel API (In AArch64) can communicate with the userspace (In x86-64), shouldn't be a big deal? :)
<daniels> robmur01: cursed
icecream95 has quit [Quit: leaving]
<HdkR> Sofia, super cursed
nlhowell has quit [Ping timeout: 256 seconds]
<robmur01> whoop-de-do, another year, another GPU tick... how very unexciting :D
<HdkR> Mali-G78, sounds like a good time for more Valhall
<HdkR> Can't tell if the 24-core unit can get near Adreno top-end perf
buzzmarshall has joined #panfrost
<alyssa> "up to some more radical changes such as a complete redesign of its FMA units."
<alyssa> robmur01: "The one key change of the Mali-G78 that Arm had talked about the most, was the change from a single global frequency domain for the whole GPU to a new two-tier hierarchy, with decoupled frequency domains between the top-level shared GPU blocks, and the actual shader cores."
<alyssa> I just read this as more devfreq bugs for us down the line.
<alyssa> Bugs are O(N^2) to complexity IME ;)
stikonas_ has quit [Remote host closed the connection]
macc24 has quit [Read error: Connection reset by peer]
macc24 has joined #panfrost
raster has quit [Quit: Gettin' stinky!]
nlhowell has joined #panfrost
raster has joined #panfrost
<robmur01> yeah, I can't even imagine off-hand how you'd present the OPP tables for that, and I do wonder whether software is expected to forecast shader vs. tiler load for itself :/
<alyssa> bbrezillon: what's the idea for BO_ACCESS_VERTEX/FRAGMENT flags?
<alyssa> oh, for the dep graph later. got it.
<bbrezillon> alyssa: yep, knowing which one is used in the frag job
<bbrezillon> and which ones are used in the !frag job
* alyssa is looking into refactoring away the hash tables so
<bbrezillon> alyssa: what's the key of this hashtab?
<alyssa> bbrezillon: currently, we have a lot indexed by panfrost_bo * with a hash table
<alyssa> when we could get away with using bo->gem_handle as an index into an array/bitset/etc
<alyssa> (see discussion with jekstrand in dri-devel yesterday - this is how it's handled in anv)
<bbrezillon> yep, I saw that one
<bbrezillon> and that sounds like a good idea, indeed
<bbrezillon> sounds similar to the xarray concept we have in the kernel https://www.kernel.org/doc/html/latest/core-api/xarray.html
<bbrezillon> alyssa: maybe something that should be made generic so others can easily re-use the same concept (or is it already the case)
<alyssa> mayhaps
<cwabbott> bbrezillon: there's already a lockless sparse array implementation in mesa
<alyssa> (my branch uses that)
<bbrezillon> cool
<alyssa> So many corner cases though
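[For reference, the implementation cwabbott is pointing at is util_sparse_array in Mesa's src/util. A minimal sketch of the refactor being discussed, indexing by bo->gem_handle instead of hashing the panfrost_bo pointer; struct bo_access and the surrounding names are illustrative, not the actual branch:]

    #include "util/sparse_array.h"

    /* Per-BO tracking slot; GEM handles are small, densely-allocated
     * integers, which is what makes direct array indexing viable here.
     */
    struct bo_access {
            uint32_t flags; /* e.g. read/write, vertex-tiler vs. fragment */
    };

    static struct util_sparse_array bo_accesses;

    static void
    bo_tracking_init(void)
    {
            /* 256 elements per node is an arbitrary illustrative choice */
            util_sparse_array_init(&bo_accesses, sizeof(struct bo_access), 256);
    }

    static struct bo_access *
    bo_access_for(const struct panfrost_bo *bo)
    {
            /* Replaces _mesa_hash_table_search(ht, bo); lockless, and the
             * slot is allocated on first use.
             */
            return util_sparse_array_get(&bo_accesses, bo->gem_handle);
    }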
cwabbott has quit [Quit: cwabbott]
cwabbott has joined #panfrost
<alyssa> 3 files changed, 33 insertions(+), 52 deletions(-)
<alyssa> So far so good :-)
<alyssa> bbrezillon: I don't see where PAN_BO_ACCESS_FRAGMENT is read, though
<alyssa> (It looks like we only track deps on a per-batch level)
<alyssa> and batch_submit_ioctl only uses the deps for the v/t side
* alyssa wonders if we're losing perf there
<bbrezillon> alyssa: yep, it's a per-batch thing
<bbrezillon> inter-batch dep is handled through FBs
<bbrezillon> alyssa: the dep of a fragment job, is the V/T job
<bbrezillon> which already has deps on other jobs defined
<bbrezillon> so the frag job indirectly depends on the V/T deps
<bbrezillon> is that wrong?
<bbrezillon> note that BO_ACCESS flags are here for resource refcounting, not deps
raster has quit [Quit: Gettin' stinky!]
marex-cloud has quit [Ping timeout: 256 seconds]
<bbrezillon> well, they also act as implicit deps, since the kernel driver waits for all referenced BOs to be idle before scheduling a job
<alyssa> bbrezillon: Suppose batch A renders a cat to FBO #1.
<alyssa> Then batch B renders a fullscreen quad (so no deps in vertex/tiler) which in the fragment shader textures from FBO #1 to do some post-processing to make the cat rainbow and bounce and say nyan.
<alyssa> Ideally we would have:
<alyssa> VERTEX: [ A ][ B ]
<alyssa> FRAGME:       [ A ][ B ]
<alyssa> since the vertex job of B does not depend on the fragment job of A, they can run concurrently
<alyssa> If I understand the code right, though, it would actually end up being
<alyssa> VERTEX: [ A ]       [ B ]
<alyssa> FRAGME:       [ A ]       [ B ]
<alyssa> which is slower due to the unnecessary dep.
<alyssa> The ACCESS flags would signal that that's unnecessary, but I don't see how the kernel would know since it just sees a dep of B on A, and it just sees B accesses a BO written from A (the FBO)
<bbrezillon> right, I forgot that the tiler job was not responsible for texture sampling
<bbrezillon> so we could indeed remove this dep
<alyssa> (it's a bit confusing -- TILER jobs specify all the fragment shaders but they don't actually run until FRAGMENT)
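[To make the dependency being discussed concrete, here is a rough sketch of the filtering the driver could do when building the vertex/tiler submit's BO list, using the PAN_BO_ACCESS_* flags already tracked per batch; the helper itself is illustrative, not existing Mesa code:]

    /* A BO touched only by the fragment job (e.g. a texture sampled in
     * the fragment shader) imposes no dependency on the vertex/tiler
     * submit, so batch B's vertex work need not wait for batch A's
     * fragment work.
     */
    static bool
    bo_needed_for_vertex_tiler(uint32_t access_flags)
    {
            if ((access_flags & PAN_BO_ACCESS_FRAGMENT) &&
                !(access_flags & PAN_BO_ACCESS_VERTEX_TILER))
                    return false;

            return true;
    }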
<Lyude> alyssa: you working on midgard perf stuff?
<alyssa> Lyude: Yeah :-)
<bbrezillon> yep, I think last time we discussed that you said tiler jobs were referencing textures, which is why I thought there was a hard dep here
<alyssa> yeah, it's tricky. the TILER job does reference it in the sense that the job has the pointer, but it doesn't actually access it
<bbrezillon> if that's not the case, then we should remove the explicit dep on BOs flagged with BO_ACCESS_FRAGMENT only
<bbrezillon> we'd still pass the BO to the BO list, that's not a problem
<bbrezillon> we can just get rid of the dep
<alyssa> would that work if the kernel does implicit deps from the BO list..?
<bbrezillon> anyway, none of that will help perf if the kernel is not patched to support skipping the implicit waits on BOs
<bbrezillon> which was in the pipe when I submitted the batch pipelining stuff
<alyssa> I'm not convinced we need to specify that texture in the vertex/tiler BO list at all, though
<alyssa> When I say it's a pointer, I literally just mean it's a pointer. It shouldn't ever get dereferenced by the GPU until the corresponding frag job executes.
<alyssa> Not sure if that's a kosher use of the BO list, but it should work at this point
<alyssa> and then it becomes UABI by default or something
<alyssa> robher: *ducks*
<bbrezillon> alyssa: I wouldn't worry about that, the BO is still referenced by the frag job
<bbrezillon> which is executed after the tiler job is done
<bbrezillon> so omitting the BO in the tiler BO list shouldn't be a problem
<alyssa> agreed, just not sure it's totally intended :)
<bbrezillon> probably not
<bbrezillon> but adding a flag to skip the implicit deps would also be a good thing
<bbrezillon> I mean, etnaviv has that too
<alyssa> Mm
<bbrezillon> don't you have cases where 2 jobs read from the same BO but never write it?
<bbrezillon> clearly we don't want things to be serialized in this case
<bbrezillon> but that's what happens
<alyssa> ah, right. good point
<bbrezillon> we have all the pieces to skip this unneccessary serialization already
<bbrezillon> we just need this flag (and a lot of testing to make sure it doesn't regress things :))
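[The etnaviv precedent mentioned above is the ETNA_SUBMIT_NO_IMPLICIT submit flag. A hypothetical Panfrost equivalent could look roughly like this; the flag name and placement are a sketch, only the struct layout matches the existing panfrost_drm.h UABI:]

    /* Hypothetical flag: userspace promises all dependencies are expressed
     * explicitly via in_syncs, so the kernel may skip the implicit wait on
     * the reservation objects of every BO in bo_handles.
     */
    #define PANFROST_SUBMIT_NO_IMPLICIT     (1 << 0)        /* not real UABI */

    struct drm_panfrost_submit {
            __u64 jc;               /* GPU address of the job chain */
            __u64 in_syncs;         /* explicit dependencies */
            __u32 in_sync_count;
            __u32 out_sync;         /* completion fence */
            __u64 bo_handles;       /* still listed for residency/refcounting */
            __u32 bo_handle_count;
            __u32 requirements;     /* PANFROST_JD_REQ_*; the new flag could live here */
    };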
<alyssa> testing? don't you mean pushing to master and waiting for the bug reports?
<alyssa> (thanks icecream95 ;P)
<bbrezillon> :D
<alyssa> bbrezillon: As an aside, I notice we spend serious CPU time in the SUBMIT ioctl.. wonder what's up with that
<alyssa> 9.21% on this trace in panfrost_ioctl_submit
<alyssa> within that 4.01% in panfrost_job_push, 1.23% in drm_gem..lookup, 1.18% in gem_mapping_get
<alyssa> 1.26% waiting on the wake up lock in drm_sched_wakeup
nerdboy has joined #panfrost
<alyssa> Maybe we're using way too many BOs
stikonas has joined #panfrost
<robher> I seem to recall some discussion on multiple readers. Related to resv_obj's I think.
<bbrezillon> but I can't find it
<bbrezillon> oh, no actually it was about flagging access types on BO
<bbrezillon> robher, alyssa: ^
<alyssa> DRM_IOCTL_PANFROST_CREATE_BO failed: No space left on device
<alyssa> ^ this seems bad
<bbrezillon> BO leak :)
<alyssa> yeah, but... why..
<bbrezillon> that's where shadeslayer's BO labeling could help :)
<alyssa> indeed
<alyssa> (^ the patch)
<alyssa> Oh wait
<alyssa> er no
<alyssa> also drm_syncobj_wait_ioctl is eating tremendous CPU uhh
<alyssa> am I doing something silly
NeuroScr has joined #panfrost
* alyssa is simplifying along..
<alyssa> TBD if it helps perf but shouldn't hurt, and should make things easier to follow.
<alyssa> and thus easier to fix for real perf things
nlhowell has quit [Ping timeout: 240 seconds]
<bbrezillon> alyssa: looks good to me (s/gaurantees/guarantees/)
<alyssa> bbrezillon: Thats the patch that's breaking the world :-)
davidlt has quit [Ping timeout: 256 seconds]
<bbrezillon> alyssa: well, it looked good :)
<alyssa> bbrezillon: :D
<alyssa> (`bo-v4` is what I'm working on. Nothing dramatic yet.)
<bbrezillon> alyssa: you probably want to release the BOs as soon as the fence is signalled
<alyssa> i'll try that
<alyssa> bbrezillon: nope, still not happy..
<alyssa> I'm suspicious of u_blitter interactions
<bbrezillon> alyssa: so BOs are not released as they should be
<bbrezillon> meaning that some fences are never signalled
<bbrezillon> or never tested
<bbrezillon> wait, what's the data of the hashtab?
<bbrezillon> don't we have a circular dep here (BO entry holding a reference on the fence which holds a reference on the BO)?
<alyssa> uhm
<alyssa> bbrezillon: So we do. :|
<bbrezillon> yeah, it's not that simple I fear
<alyssa> how did this work before :p
<bbrezillon> because there was no circular dep :P
<alyssa> thanks
<alyssa> :p
<alyssa> oh. right. fine.
<alyssa> :p
<alyssa> Okay, what if I keep the structure as is, but have a set of gem handles on the fence (no referencing)?
<alyssa> dereference at the usual time
<alyssa> but keep an in_flight refcnt on the BO
<alyssa> is that still circular
<bbrezillon> I was about to propose having weak refs on the BOs inserted in the ->accessed_bos hashtab
<alyssa> (I got rid of ->accessed_bos a few patches ago, whoops?)
<alyssa> but yes, I think that would also work
<bbrezillon> yep, but you did replace it by something else
<alyssa> a929ad7adacbd83f402ef890e0cf92043389a1e4
<bbrezillon> which holds a ref on the BOs inserted there
<bbrezillon> yes, so you need to be very careful here, since I'd expect the BO users to have a ref on the BOs they use
<alyssa> right
<alyssa> this is why i write compilers :p
<bbrezillon> and at the same time, the readers/writer arrays hold refs to those users
<alyssa> we can probably get rid of those refs..?
<bbrezillon> maybe :)
<alyssa> or not argh
<bbrezillon> can't we move to those smart-arrays without changing the struct relationships?
<alyssa> smart-arrays?
<bbrezillon> well, the replacement for hashtables
<alyssa> Oh, right
<alyssa> Is this about getting rid of accessed_bo or fussing with the BO array?
<bbrezillon> I feel like addressing all problems at once is complicating things quite a bit
<alyssa> I do have that problem quite a bit yes
<bbrezillon> it's mostly about isolating changes so we can easily debug each problem independently
<bbrezillon> and yes, having direct relationship between accesses and BOs is likely to cause many circular dep issues
* alyssa nods
<alyssa> Trying a much smaller subproblem then, let's see.
<bbrezillon> we probably want to address both ultimately
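[A rough illustration of the cycle-breaking scheme converged on above, with illustrative struct layouts rather than the actual branch: the fence holds bare GEM handles (weak references), while each BO carries a separate in-flight count, so no strong pointer ever runs from fence back to BO.]

    #include <stdint.h>

    struct sketch_fence {
            uint32_t *bo_handles;   /* weak: raw handles, no BO refs taken */
            unsigned bo_handle_count;
            /* ... syncobj handle, own refcount, etc. ... */
    };

    struct sketch_bo {
            uint32_t gem_handle;
            int refcnt;             /* userspace references */
            int in_flight;          /* submitted jobs not yet signalled */
    };

    /* On fence signal, drop the in-flight counts; a BO with refcnt == 0
     * and in_flight == 0 is then safe to release to the BO cache. The BO
     * may reference the fence, but never vice versa, so no cycle forms.
     */
    static void
    sketch_fence_signalled(struct sketch_fence *f,
                           struct sketch_bo *(*lookup_bo)(uint32_t handle))
    {
            for (unsigned i = 0; i < f->bo_handle_count; i++) {
                    struct sketch_bo *bo = lookup_bo(f->bo_handles[i]);
                    if (bo)
                            bo->in_flight--;
            }
    }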
nlhowell has joined #panfrost
<alyssa> okay. replicated the experiment with a much simpler change set. still leaking, so there's some other issue somewhere.
<bbrezillon> alyssa: same branch?
<alyssa> bbrezillon: just pushed to `bob`
<alyssa> I am ahem creative with branch names ;P
<alyssa> Indeed.. there are BOs hit in free_batch's bo check that are not hit the right number of times in is_signaled's fence_bo check
<alyssa> so I guess some fences aren't being signalled
<alyssa> (or checked)
<bbrezillon> I'd have to look at it more closely, but I fear attaching the readers/writer directly to the BO has an impact on refcounting
<alyssa> Possibly.
<alyssa> rebasing w/o my other changes
<alyssa> nope
<bbrezillon> do you have fence leaks without your changes?
<alyssa> let's find out.
<alyssa> Yes.
<bbrezillon> duh
<alyssa> Er, no
<alyssa> Er, yes, just somewhat slower.
<alyssa> duh?
<alyssa> answer is yes
<bbrezillon> leak as in valgrind reporting a leak, or as in the number of fences keeps increasing
<alyssa> In -bideas that bounces between 0 and 1, which is expected
<alyssa> In -bterrain (which makes heavy use of u_blitter), that increases unbounded.
<alyssa> (the leaks I'm chasing are in terrain)
<alyssa> also leaks very quickly in -bdesktop, which does not mipmap afaik, but uses u_blitter for wallpapering, as does terrain
<alyssa> so I bet it's wallpapering
<alyssa> (but then I'd expect to see the leak in weston, and I don't)
<alyssa> also seen in -brefract which doesn't use u_blitter, so that's a red herring
<alyssa> it's just FBO stuff then
<alyssa> (weston doesn't need FBOs)
<alyssa> --off-screen doesn't reproduce, though. So it's texturing from an FBO that's a problem
<alyssa> read-after-write
<bbrezillon> can you run valgrind on it?
<alyssa> any flag to get valgrind to be useful here?
<bbrezillon> dunno
<bbrezillon> I'm not super familiar with valgrind
<bbrezillon> other than the basics
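[For the record, a commonly useful invocation for CPU-side leaks like these fences, using standard valgrind options and the -bterrain reproducer from above; note that refcount leaks often show up as "still reachable" rather than "definitely lost", which is why --show-leak-kinds=all matters:]

    valgrind --leak-check=full --show-leak-kinds=all glmark2-es2 -bterrain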
<bbrezillon> you can also check if the leaked refs are those where a fence is explicitly requested
<bbrezillon> (*fence not NULL in ctx->flush())
<bbrezillon> and is it really read-after-write that's at fault? it could also be bad refcounting when we deal with inter-batch deps
<alyssa> sure
<alyssa> fence == NULL in flush() for this reproducer
<alyssa> Got it.
<bbrezillon> I'm curious :)
<alyssa> bob-v2
<alyssa> spoiler alert: access->writer
<bbrezillon> ouch
<bbrezillon> hm, wait
<bbrezillon> no, that's correct
<alyssa> The original or the fix?
<bbrezillon> the fix
<alyssa> cool
<alyssa> Better question - why does an unrelated change in the BO cache cause fences to leak again
<bbrezillon> I guess I'll find the answer in bob-v3 :)
<alyssa> :)
<alyssa> maybe the issue in terrain now isn't a leak, just a legitimate OOM :|
<HdkR> Time to swap your GPU buffers to disk :D
NeuroScr has quit [Remote host closed the connection]
NeuroScr has joined #panfrost
cphealy has quit [Remote host closed the connection]
<alyssa> HdkR: Oof.
<alyssa> Or time to switch gears to Bifrost :P
warpme_ has quit [Quit: Connection closed for inactivity]
macc24 has quit [Quit: WeeChat 2.8]
nerdboy has quit [Ping timeout: 265 seconds]
<alyssa> chewitt: Mind giving alyssa/mesa:bi-format a whirl on Kodi?
<alyssa> I don't have a convenient way to test kodi rn but iirc we saw this issue with midgard way back when and a similar patchset fixed it
cwabbott has quit [Ping timeout: 272 seconds]
<alyssa> although I'm still seeing corruption in weston in a few places so maybe tomorrow night instead ;)