#videocore on 2021-03-10 — irc logs at freenode.irclog.whitequark.org

2021-02-26 10:57 nroberts changed the topic of #videocore to: Raspberry Pi Mesa drivers discussion - Logs http://freenode.irclog.whitequark.org/videocore

02:47 jcea has quit [Ping timeout: 260 seconds]

03:53 hijacker has quit [Ping timeout: 276 seconds]

06:32 itoral has joined #videocore

08:10 <itoral> apinheiro[m]1: I was testing the driver with the latest changes (including my latest ldvary series) with a few of our usual apps and it does seem to lead to some fps gains, particularly comparing with previous records I had for vkQuake3, I was getting significantly better fps. I also see some smaller gains with Sponza demo.

08:11 <itoral> I am surprised at the vkQuake3 results, I was not expecting to see significant changes there, since I think my last records come from the initial ldvary work I did

08:11 <itoral> but I'll take it :)

10:09 <apinheiro[m]1> <itoral "apinheiro: I was testing the dri"> oh nice

10:10 <apinheiro[m]1> hmm, it is a pity that I didn't save the fps of the traces we have around per-week

10:10 <apinheiro[m]1> so we could track the evolution

10:10 <apinheiro[m]1> hmm, although I have some numbers from last week

10:10 <itoral> well, we can always go back in time with git reset :)

10:11 <itoral> I'd like to write a post discussing some of the perf work we have been doing on the compiler lately, and I'll compare shader-db stats from some time in January more or less and current

10:11 <apinheiro[m]1> itoral: yes, but that means that then you need to do all the measures, today and last-week point

10:12 <apinheiro[m]1> <itoral "I'd like to write a post discuss"> if you want an after and before

10:12 <itoral> yes, what I mean, is that once we do them, we can keep them in the records

10:12 <apinheiro[m]1> a good point is with that patch that got the ue4 demo working, for example

10:12 <apinheiro[m]1> as that was approximately the starting point of the performance work

10:12 <itoral> what patch?

10:13 <itoral> the ue4 demo has always been working

10:13 <apinheiro[m]1> itoral: no, you needed to add a fix

10:13 <itoral> what fix?

10:13 <apinheiro[m]1> or in some situations it crashed

10:13 <apinheiro[m]1> let me find the patch

10:14 <itoral> when Ken talked to us about it, it was already working

10:14 <itoral> there was something we did that then broke

10:14 <itoral> the demo, and then we fixed it

10:14 * apinheiro[m]1 sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/gHcjkAZTRtprcvAMKhRrFSFk/message.txt >

10:14 <itoral> it was related to the TFU work

10:15 <itoral> but what is that relevant to this discussion?

10:15 <itoral> s/what/why

10:15 <apinheiro[m]1> itoral: because trying to measure performance using the ue4 before that patch is not practical

10:15 <itoral> that was just a temporary regression that our CI did not catch

10:15 <itoral> yes it is

10:15 <itoral> because the demo was already working before it

10:15 <itoral> before we did the TFU work I mean

10:16 <apinheiro[m]1> but then it regressed as you said

10:16 <itoral> it was only for a very short time that it regressed

10:16 <apinheiro[m]1> and in any case, most of the performance work you did was after that

10:16 <itoral> yeah, I was not going to go that far in any case

10:16 <apinheiro[m]1> so imho, it is a better reference point of "before"

10:16 <apinheiro[m]1> than going before the regression

10:16 hijacker has joined #videocore

10:16 <apinheiro[m]1> <itoral "yeah, I was not going to go that"> ok

10:17 <itoral> and when it regressed it just didn't work, so it is easy to see that and just go a bit further back or forward

10:17 <itoral> I still don't think that is an issue to collect historical data

10:17 <apinheiro[m]1> it is far easier using that one as reference, because I already know about that, so I don't need to keep searching ;)

10:17 <apinheiro[m]1> in any case

10:17 <apinheiro[m]1> if in the end you are interested in some "before numbers"

10:18 <apinheiro[m]1> but with a different reference commit

10:18 <apinheiro[m]1> just ping me and I will track numbers with the traces we have

10:19 <itoral> sure, I'll keep that in mind, however, gfx-reconstruct traces didn't work until very recently, so dunno...

10:23 <apinheiro[m]1> itoral: hmm, but afair, most of the issues that we had with gfx-reconstruct were fixed on gfx-reconstruct itself, like the COHERENT vs CACHED memory thing

10:23 <apinheiro[m]1> or the get event status thing

10:23 <apinheiro[m]1> I don't remember doing anything on the driver, but perhaps my memory fails

10:24 <apinheiro[m]1> also, again with my "reference patch"

10:24 <apinheiro[m]1> I remember using the ue4 traces we have with that commit, and working

10:35 <itoral> wasn't there a problem with memory usage that I fixed?

10:35 <itoral> I remember doing something about that

10:36 <itoral> apinheiro[m]1 ^

10:36 chema[m] has quit [Quit: Bridge terminating on SIGTERM]

10:36 abergmeier has quit [Quit: Bridge terminating on SIGTERM]

10:36 <itoral> that we would allocate too many BOs in some cases

10:36 apinheiro[m]1 has quit [Quit: Bridge terminating on SIGTERM]

10:36 jasuarez has quit [Quit: Bridge terminating on SIGTERM]

10:38 apinheiro[m]1 has joined #videocore

10:38 <apinheiro[m]1> itoral: I think that was for renderdoc

10:38 <apinheiro[m]1> renderdoc needed two or three fixes on the driver

10:38 <itoral> ah, okay, maybe it was that then

10:42 txenoo has joined #videocore

10:46 chema[m] has joined #videocore

10:46 <chema[m]> I've been checking, and we need to update shaderdb support in VC4. With master, it doesn't record the stats.

10:46 jasuarez has joined #videocore

10:53 <itoral> apinheiro[m]1: those regressions you observed with the stride patch for buffer copies, did you do the test runs on the device or the sim?

10:54 <itoral> if you did them on a device, did you check dmesg?

10:55 <apinheiro[m]1> <itoral "apinheiro: those regressions you"> the device, I stopped doing full runs on the sim a long time ago (as it is now slower, we needed to skip sync tests, and it is "less real")

10:57 <apinheiro[m]1> <itoral "if you did them on a device, did"> checked, but nothing relevant

11:06 abergmeier has joined #videocore

11:16 <itoral> apinheiro[m]1: I think maybe it would be a good idea to do a sim run with that patch to see if the sim complains about something, since the regressed tests don't do any buffer copies it looks like some other test is what is causing them to fail

11:17 <apinheiro[m]1> itoral: ok, I will set a full run on the sim

11:17 <apinheiro[m]1> although I think that the sanitized vk-default.txt is somewhat abandoned these deays

11:17 <apinheiro[m]1> *days

12:52 <apinheiro[m]1> itoral: btw, with all this cache work, I have also been thinking about that idea of reusing the same bo for the assembly of all the shaders

12:52 <apinheiro[m]1> with this work, that will be easier

12:52 <apinheiro[m]1> the tricky part would be when to free the shared bo

12:52 <itoral> I suppose we would need to have the pipelines refcount the BOs

12:53 <apinheiro[m]1> because we can't just free it with the pipeline

12:53 <itoral> and we would free on the last unref

12:53 <apinheiro[m]1> well, yes, that is what I was thinking

12:54 <apinheiro[m]1> but as this is the only case that would need that, it seems somewhat overkill to add ref_count directly on v3dv_bo

12:54 <apinheiro[m]1> so perhaps a wrapper?

12:54 <itoral> maybe...

12:54 <apinheiro[m]1> the other solution is that right now on the pipeline_cache we have a "cache_entry"

12:55 <apinheiro[m]1> that is basically a struct that contains all the data we are caching from the pipeline

12:55 <apinheiro[m]1> in order to understand: don't know if you remember on the review, that my first approach saved the descriptor maps on the individual variants

12:55 <apinheiro[m]1> now it is saved on that "cache_entry", so only once

12:55 <itoral> aha

12:56 <apinheiro[m]1> but right now that cache_entry is an implementation detail of pipeline_cache

12:56 <apinheiro[m]1> and it already has a ref_count

12:56 <apinheiro[m]1> so an alternative, would be to make it public, perhaps rename it to something like "pipeline_cacheable_data"

12:56 <apinheiro[m]1> and include the common bo there

12:57 <itoral> what if we have disabled the pipeline cache?

12:57 <apinheiro[m]1> that is the main reason to rename it from "cache_entry" to something else

12:57 <apinheiro[m]1> the pipeline would need to maintain a reference to it

12:58 <apinheiro[m]1> the only difference when we disable the pipeline cache

12:58 <itoral> ok, I get it

12:58 <apinheiro[m]1> is that this info will not be upload to the cache

12:58 <apinheiro[m]1> *uploaded

12:58 <apinheiro[m]1> so with cache: pipeline has a reference, cache another

12:58 <apinheiro[m]1> without cache: pipeline has a reference

12:58 <itoral> right

12:59 <apinheiro[m]1> well in fact

12:59 <apinheiro[m]1> so with cache: pipeline has a reference, any cache (could be more than one) another

12:59 <apinheiro[m]1> the advantage of this is that we avoid adding ref_count support to a new struct

13:00 <itoral> yes, I understand, I think that could work

13:00 <apinheiro[m]1> the disadvantage is that now the pipeline needs to handle this struct

13:00 <apinheiro[m]1> right now it is just a cache thing

13:00 <itoral> but all the info in that entry is still in the pipeline anyway no?

13:01 <itoral> my understanding is that by having this struct in the pipeline

13:01 <itoral> we may be able to drop some of the pipeline fields

13:01 <itoral> that would be a duplicate of this info

13:01 <itoral> no?

13:02 <apinheiro[m]1> well, not drop, but move

13:02 <itoral> not sure I follow, if you have entry.x and pipeline.x

13:02 <apinheiro[m]1> well, yes, but now you have pipeline.x

13:02 <itoral> then if you have pipeline.entry you no longer need pipeline.c

13:02 <itoral> then if you have pipeline.entry you no longer need pipeline.x

13:02 <apinheiro[m]1> and then you will have pipeline.entry.x

13:03 <itoral> exactly

13:03 <itoral> so you can drop the field from pipeline

13:03 <apinheiro[m]1> so move in the sense that instead of direct fields, will be

13:03 <itoral> which was my point

13:03 <apinheiro[m]1> hmm

13:03 <apinheiro[m]1> subfields?

13:03 <itoral> in any case, if we can do that, I think that solution would work fine

13:03 <apinheiro[m]1> ok, thanks
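
[editor's note] The scheme discussed above (the pipeline holds one reference to the shared data, each cache holds another, and the shared bo is freed on the last unref) could be sketched roughly as follows. This is only an illustration under stated assumptions: `struct fake_bo`, the `cacheable_data_*` helpers and the `live_bos` counter are hypothetical names invented here, `pipeline_cacheable_data` is the rename floated in the discussion, and none of it is the actual v3dv code. The counter is a plain integer, so the sketch is single-threaded; a real driver would most likely need atomic operations.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object; NOT the real v3dv_bo. */
struct fake_bo {
   uint32_t size;
};

/* Hypothetical: data shared between a pipeline and any pipeline caches,
 * including the single bo that would hold the assembly of all shaders. */
struct pipeline_cacheable_data {
   uint32_t ref_count;
   struct fake_bo *shared_assembly_bo;
};

/* Number of shared bos currently allocated (for illustration only). */
static int live_bos = 0;

/* Creates the shared data with one reference, owned by the caller
 * (the pipeline in the discussion). Returns NULL on allocation failure. */
static struct pipeline_cacheable_data *
cacheable_data_create(uint32_t assembly_size)
{
   struct pipeline_cacheable_data *data = calloc(1, sizeof(*data));
   if (!data)
      return NULL;

   data->shared_assembly_bo = calloc(1, sizeof(*data->shared_assembly_bo));
   if (!data->shared_assembly_bo) {
      free(data);
      return NULL;
   }

   data->shared_assembly_bo->size = assembly_size;
   data->ref_count = 1;
   live_bos++;
   return data;
}

/* Takes an extra reference, e.g. when the data is uploaded to a cache. */
static void
cacheable_data_ref(struct pipeline_cacheable_data *data)
{
   assert(data->ref_count > 0);
   data->ref_count++;
}

/* Drops a reference; the shared bo is freed only on the last unref. */
static void
cacheable_data_unref(struct pipeline_cacheable_data *data)
{
   assert(data->ref_count > 0);
   if (--data->ref_count == 0) {
      free(data->shared_assembly_bo);
      free(data);
      live_bos--;
   }
}
```

With the pipeline cache enabled, the pipeline would call `cacheable_data_create()` and each cache that stores the entry would call `cacheable_data_ref()`; with the cache disabled, only the pipeline's reference exists, so destroying the pipeline frees the bo immediately.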

13:04 itoral has quit [Quit: Leaving]

13:06 jcea has joined #videocore

20:12 foka_ has quit [Ping timeout: 272 seconds]

21:18 foka has joined #videocore

23:17 jcea has quit [Ping timeout: 260 seconds]

23:26 jcea has joined #videocore