<itoral>
apinheiro[m]1: I was testing the driver with the latest changes (including my latest ldvary series) with a few of our usual apps and it does seem to lead to some fps gains, particularly comparing with previous records I had for vkQuake3, I was getting significantly better fps. I also see some smaller gains with Sponza demo.
<itoral>
I am surprised at the vkQuake3 results, I was not expecting to see significant changes there, since I think my last records come from the inital ldvary work I did
<itoral>
but I'll take it :)
<apinheiro[m]1>
<itoral "apinheiro: I was testing the dri"> oh nice
<apinheiro[m]1>
hmm, it is a pity that I didn't save the fps of the traces we have around per-week
<apinheiro[m]1>
so we could track the evolution
<apinheiro[m]1>
hmm, although I have some numbers from last week
<itoral>
well, we can always go back in time with git reset :)
<itoral>
I'd like to write a post discussing some of the perf work we have doing on the compiler lately, and I'll compare shader-db stats from some time in January more or less and current
<apinheiro[m]1>
itoral: yes, but that means that then you need to do all the measures, today and last-week point
<apinheiro[m]1>
<itoral "I'd like to write a post discuss"> if you want an after and before
<itoral>
yes, what I mean, is that once we do them, we can keep them in the records
<apinheiro[m]1>
a good point is with that patch that got the ue4 demo working, for example
<apinheiro[m]1>
as that was aproximately the starting point of the performance work
<itoral>
what patch?
<itoral>
the ue4 demo has always been working
<apinheiro[m]1>
itoral: no, you needed to add a fix
<itoral>
what fix?
<apinheiro[m]1>
or in some situations it crashed
<apinheiro[m]1>
let me find the patch
<itoral>
when Ken talked to us about it, it was already working
<itoral>
there was something we did that then broke
<itoral>
but what is that relevant to this discussion?
<itoral>
s/what/why
<apinheiro[m]1>
itoral: because trying to measure performance using the ue4 before that patch is not practical
<itoral>
that was just a temporary regression that our CI did not catch
<itoral>
yes it is
<itoral>
becuase the demo was already working before it
<itoral>
before we did the TFU work I mean
<apinheiro[m]1>
but then it regressed as you said
<itoral>
it was only for a very short time that it regressed
<apinheiro[m]1>
and in any case, most of the performance work you did was after that
<itoral>
yeah, I was not going to go that far in any case
<apinheiro[m]1>
so imho, it is a better reference point of "before"
<apinheiro[m]1>
that going before the regression
hijacker has joined #videocore
<apinheiro[m]1>
<itoral "yeah, I was not going to go that"> ok
<itoral>
and when it regressed it just didnt work, so it is easy to see that and just go a bit further back or forward
<itoral>
I still don't think that is an issue to collect historical data
<apinheiro[m]1>
it is far easier using that one as reference, becuase I already know about that, so I don't need to keep searching ;)
<apinheiro[m]1>
in any case
<apinheiro[m]1>
if in the end you are interested on some "before numbers"
<apinheiro[m]1>
but with a different reference commit
<apinheiro[m]1>
just ping me and I will track numbers with the traces we have
<itoral>
sure, I'll keep that in mind, however, gfx-reconstruct traces didn't work until very recently, so dunno...
<apinheiro[m]1>
itoral: hmm, but afair, most of the issues that we had with gfx-reconstruct were fixed on gfx-reconstruct itself, like the COHERENT vs CACHED memory thing
<apinheiro[m]1>
or the get event status thing
<apinheiro[m]1>
I don't remember doing anything on the driver, but perhaps my memory fails
<apinheiro[m]1>
also, again with my "reference patch"
<apinheiro[m]1>
I remember using the ue4 traces we have with that commit, and working
<itoral>
wasn't there a problem with memory usage that I fixed?
<itoral>
I remember doing something about thta
<itoral>
apinheiro[m]1 ^
chema[m] has quit [Quit: Bridge terminating on SIGTERM]
abergmeier has quit [Quit: Bridge terminating on SIGTERM]
<itoral>
that we would allocate too many BOs in some cases
apinheiro[m]1 has quit [Quit: Bridge terminating on SIGTERM]
jasuarez has quit [Quit: Bridge terminating on SIGTERM]
apinheiro[m]1 has joined #videocore
<apinheiro[m]1>
itoral: I think that was for renderdoc
<apinheiro[m]1>
renderdoc needed two or three fixes on the driver
<itoral>
ah, okay, maybe it was that then
txenoo has joined #videocore
chema[m] has joined #videocore
<chema[m]>
I've been checking, and we need to update shaderdb support in VC4. With master, it doesn't record the stats.
jasuarez has joined #videocore
<itoral>
apinheiro[m]1: those regressions you observed with the stride patch for buffer copies, did you do the runs in the test runs in the device or the sim?
<itoral>
if you did them on a device, did you check dmesg?
<apinheiro[m]1>
<itoral "apinheiro: those regressions you"> the device, I stopped to do full runs on the sim a long time ago (as now is slower, we needed to skip sync tests, and it is "less real")
<apinheiro[m]1>
<itoral "if you did them on a device, did"> checked, but nothing relevant
abergmeier has joined #videocore
<itoral>
apinheiro[m]1: I think maybe it would be a good idea to do a sim run with that patch to see if the sim complains about something, since the regressed tests don't do any buffer copies it looks like some other test is what is causing them to fail
<apinheiro[m]1>
itoral: ok, I will set a full run on the sim
<apinheiro[m]1>
although I think that the sanitized vk-default.txt is somewhat abandoned these deays
<apinheiro[m]1>
*days
<apinheiro[m]1>
itoral: btw, with all this cache work, I have been also thinking on that idea of reusing the same bo for the assembly of all the shaders
<apinheiro[m]1>
with this work, that will be easier
<apinheiro[m]1>
the tricky part would be when to freed the shared bo
<itoral>
I suppose we would need to have the pipelines refcount the BOs
<apinheiro[m]1>
because we can't just free it with the pipeline
<itoral>
and we would free on the last unref
<apinheiro[m]1>
well, yes that it was I was thinking
<apinheiro[m]1>
but as this is the only case that would need that, seems somewhat an overkill to add ref_count directly on v3dv_bo
<apinheiro[m]1>
so perhaps a wrapper?
<itoral>
maybe...
<apinheiro[m]1>
the other solution is that right now on the pipeline_cache we have a "cache_entry"
<apinheiro[m]1>
that is basically a struct that contains all the data we are caching from the pipeline
<apinheiro[m]1>
in order to understand: don't know if you remember on the review, that my first approach saved the descriptor maps on the individual variants
<apinheiro[m]1>
now it is saved on that "cache_entry", so only once
<itoral>
aha
<apinheiro[m]1>
but right now that cache_entry is a implementation detail of pipeline_cache
<apinheiro[m]1>
and it already have ref_count
<apinheiro[m]1>
so an alternative, would be to make it public, perhaps rename it to something like "pipeline_cacheable_data"
<apinheiro[m]1>
and include the common bo there
<itoral>
what if we have disabled the pipeline cache?
<apinheiro[m]1>
that is the main reason to rename it from "cache_entry" to something else
<apinheiro[m]1>
the pipeline would need to maintain a reference to it
<apinheiro[m]1>
the only difference when we disable the pipeline cache
<itoral>
ok, I get it
<apinheiro[m]1>
is that this info will not be upload to the cache
<apinheiro[m]1>
*uploaded
<apinheiro[m]1>
so with cache: pipeline has a reference, cache another
<apinheiro[m]1>
without cache: pipeline has a reference
<itoral>
right
<apinheiro[m]1>
well in fact
<apinheiro[m]1>
so with cache: pipeline has a reference, any cache (could be more than one) another
<apinheiro[m]1>
the advantage of this is that we avoid adding ref_count support to a new struct
<itoral>
yes, I understand, I think that could work
<apinheiro[m]1>
the disadvantage is that now the pipeline needs to handle this struct
<apinheiro[m]1>
right now it is just a cache thing
<itoral>
but all the info in that entry is still in the pipeline anyway no?
<itoral>
my understanding is that by having this struct in the pipeline
<itoral>
we may be able to drop some of the pipeline fields
<itoral>
that would be a duplicate of this info
<itoral>
no?
<apinheiro[m]1>
well, not drop, but move
<itoral>
not sure I follow, if you have entry.x and pipeline.x
<apinheiro[m]1>
well, yes, but now you have pipeline.x
<itoral>
then if you have pipeline.entry you no longer need pipeline.c
<itoral>
then if you have pipeline.entry you no longer need pipeline.x
<apinheiro[m]1>
and then you will have pipeline.entry.x
<itoral>
exactly
<itoral>
so you can drop the field from pipeline
<apinheiro[m]1>
so move in the sense that instead of direct fields, will be
<itoral>
which was my point
<apinheiro[m]1>
hmm
<apinheiro[m]1>
subfields?
<itoral>
in any case, if we can do that, I think that solution would work fine