stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
stikonas has quit [Remote host closed the connection]
rokquarry has quit [Quit: Leaving]
buzzmarshall has joined #panfrost
NeuroScr has quit [Quit: NeuroScr]
nerdboy has quit [Ping timeout: 240 seconds]
macc24 has joined #panfrost
davidlt has joined #panfrost
icecream95 has joined #panfrost
buzzmarshall has quit [Remote host closed the connection]
davidlt has quit [Ping timeout: 272 seconds]
davidlt has joined #panfrost
horsewater has joined #panfrost
macc24 has quit [Ping timeout: 246 seconds]
<icecream95>
bbrezillon: Starting any GL application with Mesa 794c239a990 or later and using a lot of RAM (e.g. dd from /dev/zero to a tmpfs) will trigger it
<icecream95>
(this is not related to the change in Mesa, I've had it about once or twice a month since at least 5.4)
macc24 has joined #panfrost
<bbrezillon>
icecream95: that's not the same problem
<bbrezillon>
looks like this one is happening when you try to re-import an already imported dmabuf, or some missing unmap calls in the gem_close/fd_close path
rcf has quit [Quit: WeeChat 2.7]
rcf has joined #panfrost
icecream95 has quit [Quit: leaving]
nlhowell has joined #panfrost
stikonas has joined #panfrost
<bbrezillon>
robmur01, robher: are you sure we should call shmem_truncate_range(0, -1) when purging a BO?
horsewater is now known as mixfix41
<bbrezillon>
shouldn't we pass the range attached to GEM object being purged instead?
<bbrezillon>
there's a file per obj, so we really want to truncate the whole file
nlhowell has quit [Ping timeout: 260 seconds]
raster has quit [Quit: Gettin' stinky!]
<alyssa>
icecream95: bbrezillon: I can revert the mesa change (it was a CPU overhead optimization), but I don't see how that could trigger kernel issues.
<bbrezillon>
alyssa: no, it's really a kernel bug
<bbrezillon>
I'm chasing it right now
<alyssa>
OK
<bbrezillon>
I think it has to do with BOs flagged growable
<alyssa>
Does it make sense to predicate that commit on new kernel versions once it's fixed, though?
<alyssa>
Growable shouldn't be in the cache afaik.
<alyssa>
actually did we change that?
<bbrezillon>
hm, I think they are
<alyssa>
Better point is growable should never be CPU mapped
<alyssa>
(IIRC the DDK only CPU maps them for debug)
<alyssa>
bbrezillon: I'm trying to think more long-term what to do, since AFAIK users tend to get new mesa earlier than new kernel (or at least I do).
<alyssa>
So even if the bug is fixed in 5.8, I mean, how many users do we have that are still on 5.2?
<bbrezillon>
well, fixes are supposed to be backported
<alyssa>
True
<bbrezillon>
and distros are expected to update their kernels
<alyssa>
but do users?
<bbrezillon>
so I wouldn't worry about that
<alyssa>
Personally I'm stuck on an older version since the new kernel in debian has major devfreq regressions as discussed..
<bbrezillon>
dunno, but then it's their problem, no?
<bbrezillon>
have you mentioned that to mmind00 ?
<bbrezillon>
it's on kevin, right?
<alyssa>
Well, you just pinged em :)
buzzmarshall has joined #panfrost
<bbrezillon>
alyssa: ok, so forbidding MADV on heaps fixes the BUG reported by icecream95, but now I hit a NULL pointer dereference :'-(
<bbrezillon>
my bad, it's caused by my own traces :)
<alyssa>
bbrezillon: We should probably do that anyway tbh.
<mmind00>
"new kernel in debian has major devfreq regressions as discussed" ... what does that mean? Aka is that device-specific or general?
<bbrezillon>
alyssa: nope, still have unevictable pages after that, so it's something else
<bbrezillon>
and drm_gem_get_pages() calls mapping_set_unevictable() too
<bbrezillon>
mmind00: I think it's kevin specific
<bbrezillon>
(or the rk3399 variant used on this chromebook)
<alyssa>
^^ yeah
<alyssa>
bbrezillon: does madv just rely on things being munmapped?
<urjaman>
afaik it could affect any panfrost using thingy, but actually only happens when the default voltage is too low for max speed
<urjaman>
but kevin and also my veyron speedy
<urjaman>
(and i assume other veyrons too... tho also it's like overclocking (or undervolting) so device specific as to whether bad stuff will happen)
<mmind00>
bbrezillon: is there a problem description somewhere ... because at least the op1 operating points didn't change since 2017 - so I'm not really sure what should've broken
<urjaman>
iirc it just doesn't change the voltage
<urjaman>
ask robmur01 ? i think he knows more...
<bbrezillon>
alyssa: if it is, we should refuse the MADV(DONT_NEED) instead of adding it to the purgeable pool
<alyssa>
bbrezillon: If it's madv(..) and the kernel wants to claim it back, the kernel should go ahead and munmap it I think?
nlhowell has joined #panfrost
<bbrezillon>
alyssa: that's what I'm trying to figure out
<bbrezillon>
it does unmap()
<bbrezillon>
but apparently that's not enough
<alyssa>
alright :)
<bbrezillon>
*unmap it
<bbrezillon>
I mean, drm_vma_node_unmap() is called, but page->_mapcount is not decremented as a result
nlhowell has quit [Quit: WeeChat 2.8]
<alyssa>
Uh oh.
nlhowell has joined #panfrost
macc24 has quit [Quit: WeeChat 2.8]
Stary- is now known as Stary
<alyssa>
Gotta love me some Midgard.
<alyssa>
glBlendFunc(GL_DST_COLOR, GL_SRC_COLOR)
<alyssa>
(with FUNC_ADD)
<alyssa>
This computes the final colour as:
<alyssa>
(dst)(src) + (src)(dst)
<alyssa>
Multiplication is commutative so that's equal to just 2(src)(dst)
<alyssa>
(src)(dst) we can easily compute with fixed-function, but the split form (dst)(src) + (src)(dst) we can't since there's no dominant factor, and the 2(src)(dst) form we can't since there's a 2
<alyssa>
so all because of that 2, we need to trigger a blend shader.
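[annotation] The arithmetic alyssa describes can be checked per channel with a minimal C sketch. It only models the OpenGL blend equation on the CPU, assuming normalized colours clamped to [0, 1]; the helper names (clamp01, blend_add, blend_dst_src, blend_2x) are made up for illustration and are not Panfrost or Mesa code.

```c
/* Clamp to the [0, 1] range of a normalized colour buffer. */
static float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Generic FUNC_ADD blend for one channel: src * sfactor + dst * dfactor. */
static float blend_add(float src, float sfactor, float dst, float dfactor)
{
    return clamp01(src * sfactor + dst * dfactor);
}

/* glBlendFunc(GL_DST_COLOR, GL_SRC_COLOR):
 * the source factor is dst, the destination factor is src. */
static float blend_dst_src(float src, float dst)
{
    return blend_add(src, dst, dst, src);
}

/* The simplified form from the discussion: 2 * src * dst. */
static float blend_2x(float src, float dst)
{
    return clamp01(2.0f * src * dst);
}
```

For example, blend_dst_src(0.25f, 0.5f) and blend_2x(0.25f, 0.5f) both give 0.25, and any result above 1.0 is clamped in both forms.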
<alyssa>
Anyway, for RGB565 the blend shader we emit on T860 is 14 cycles. We can do a lot better.
<alyssa>
{ This comes up in t-rex, for those following along. }
<alyssa>
Using a native unpack gets us down to 12 cycles
NeuroScr has joined #panfrost
<alyssa>
Fusing in more things into the convert gets us to 11
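[annotation] To give a sense of the work such a blend shader has to do for RGB565, here is a CPU-side C sketch of the unpack, 2(src)(dst) blend, and repack steps. It is an illustration under assumptions, not the shader Panfrost emits: the function names are hypothetical, the rounding mode (round to nearest) is an assumption, and it says nothing about Midgard cycle counts.

```c
#include <stdint.h>

/* Unpack a 5:6:5 pixel into three normalized floats (r, g, b). */
static void unpack565(uint16_t p, float rgb[3])
{
    rgb[0] = (float)((p >> 11) & 0x1f) / 31.0f;
    rgb[1] = (float)((p >> 5) & 0x3f) / 63.0f;
    rgb[2] = (float)(p & 0x1f) / 31.0f;
}

/* Pack three normalized floats back to 5:6:5, rounding to nearest.
 * Inputs are assumed to already be clamped to [0, 1]. */
static uint16_t pack565(const float rgb[3])
{
    uint16_t r = (uint16_t)(rgb[0] * 31.0f + 0.5f);
    uint16_t g = (uint16_t)(rgb[1] * 63.0f + 0.5f);
    uint16_t b = (uint16_t)(rgb[2] * 31.0f + 0.5f);
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* Blend src over dst with the 2 * src * dst equation, per channel. */
static uint16_t blend565_2x(uint16_t src, uint16_t dst)
{
    float s[3], d[3], out[3];
    unpack565(src, s);
    unpack565(dst, d);
    for (int i = 0; i < 3; i++) {
        float v = 2.0f * s[i] * d[i];
        out[i] = v > 1.0f ? 1.0f : v; /* inputs are non-negative */
    }
    return pack565(out);
}
```

With these assumptions, blending a black source (0x0000) over any destination yields 0x0000, and a white source over a white destination saturates to 0xFFFF.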