<anarsoul>
looks like gpu cache isn't flushed somewhere?
<anarsoul>
or CPU cache isn't invalidated?
<enunes>
anarsoul: yeah I expect some missing flush or missing synchronization somewhere
<enunes>
there is a flush right before the reads but apparently it is only made on writes
<enunes>
or something like that
<anarsoul>
it makes sense to look into vendor driver
<anarsoul>
it should be somewhere here
<enunes>
just, it's already taken weeks of debugging to reach the point where something I do changes the behavior
<enunes>
to rule out things like uninitialized variables or shader bugs
ninolein has joined #lima
kaspter has quit [Read error: Connection reset by peer]
<enunes>
anarsoul: would a missing flush make sense considering that just waiting more seems to solve it?
<anarsoul>
well, if the picture produced by shader runner is correct then it's fine
<enunes>
I wonder how the flush would happen in that additional time
<enunes>
that's why my initial guess was about synchronization, maybe there is something else we have to wait for in the backend of lima_bo_wait
<anarsoul>
enunes: then it's probably missing invalidate on CPU side
<anarsoul>
basically CPU has old value in its cache
<anarsoul>
with some time it's evicted so it actually does memory read
<enunes>
ok, do you have a hint on how I would mark that invalid so it fetches from memory again?
<anarsoul>
it should be done somewhere in kernel driver
<anarsoul>
let me take a look...
<anarsoul>
hm, it doesn't call it directly so it must be done indirectly
<enunes>
I'm trying with running lima_flush unconditionally before the lima_bo_wait
<enunes>
hmm not sure if that makes sense, oh well
jrmuizel has joined #lima
chewitt has quit [Quit: Adios!]
<anarsoul>
enunes: you probably can't fix it in mesa
<anarsoul>
I believe it should be done somewhere in kernel driver
<enunes>
makes sense
adjtm_ has quit [Quit: Leaving]
<enunes>
I wonder if the timeout value that we provide is valid
<enunes>
we provide an unsigned infinite, but that gets translated through signed and unsigned, and ends up being "timeout jiffies: 0", which in drm_gem_reservation_object_wait means "timeout: timeout value in jiffies or zero to return immediately"
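A minimal sketch of how an unsigned "infinite" timeout could plausibly collapse to zero ticks on its way through a signed field. This is a toy model, not the lima or DRM code: `model_abs_ns_to_ticks`, `model_wait_ticks` and `MODEL_NSEC_PER_TICK` are hypothetical names, and the tick length is an arbitrary assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical tick length (4 ms, as if HZ were 250); illustration only. */
#define MODEL_NSEC_PER_TICK 4000000LL

/*
 * Toy model: turn an absolute deadline in nanoseconds into a relative wait
 * in ticks. A deadline that is not in the future yields 0, which a wait
 * function documented as "zero to return immediately" treats as "don't wait".
 * Assumes now_ns is non-negative.
 */
static uint64_t model_abs_ns_to_ticks(int64_t deadline_ns, int64_t now_ns)
{
    if (deadline_ns <= now_ns)
        return 0;
    /* Round up so that a short but non-zero wait never becomes 0 ticks. */
    return (uint64_t)((deadline_ns - now_ns) / MODEL_NSEC_PER_TICK) + 1;
}

/* Toy model of userspace's unsigned timeout crossing a signed field. */
static uint64_t model_wait_ticks(uint64_t user_timeout_ns, int64_t now_ns)
{
    /*
     * Out-of-range unsigned-to-signed conversion is implementation-defined
     * in C; GCC and Clang wrap, so UINT64_MAX becomes -1 here.
     */
    int64_t as_signed = (int64_t)user_timeout_ns;

    return model_abs_ns_to_ticks(as_signed, now_ns);
}
```

In this model, passing `UINT64_MAX` as the timeout gives a deadline of -1, which is already in the past, so the wait is 0 ticks and returns immediately. Passing `INT64_MAX` instead stays positive and yields a very long wait.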
<anarsoul>
hm
<anarsoul>
enunes: trace it? maybe we're not waiting after all...
<enunes>
anarsoul: yeah I had to drop but back now, I think we are not waiting at all, I'll try setting some real timeout there
<anarsoul>
btw I haven't seen flip-flops in piglit as of now
<anarsoul>
my guess is that added DVFS support somehow contributes to that
<anarsoul>
(I have thermal driver in my tree and thus I've enabled DVFS on pine64)
<anarsoul>
cpu freq now goes up to 1.152GHz, so probably it helps to thrash caches faster :)
kaspter has joined #lima
jrmuizel has quit [Remote host closed the connection]
<enunes>
anarsoul: 7 runs so far and no unstable results, if this simple change fixes it I'll be very happy
<anarsoul>
:)
<anarsoul>
how many runs did you need to reproduce it without the change?
<rellla>
enunes: great debug work btw
<rellla>
anarsoul: i finally got my H5 set up :p
<anarsoul>
rellla: cool
<enunes>
with my current run, sometimes just 2, in the example I pasted in the gitlab issue it seems to be 6/20 runs fail, so ~1/3?
<anarsoul>
rellla: note that it's mali450, not 400 :)
<rellla>
sure
<anarsoul>
but it's crippled mali450
<anarsoul>
IIRC it has half of PLB block size of regular mali450
<anarsoul>
still should be faster than mali400 though
<rellla>
but 450 should deliver the same results as 400, right?
<anarsoul>
450 is faster
<anarsoul>
and IIRC easier to program
<enunes>
rellla: good that someone uses 450, I for one don't use it at all :)
<rellla>
now i'm setting up an H3 as a second 400 device... maybe i will use my A10 for binary blob...
<rellla>
enunes: with "set up" i mean "booting" for now :)
<anarsoul>
rellla: you should get yourself a pine64 :)
<enunes>
I have an odroid c2 but it runs some stuff, if I manage to replace it with something else I can include it in my automated testing
<anarsoul>
oh, I actually have a device with mali450
<anarsoul>
it's rock64
<enunes>
anarsoul: my pinebook is relatively unusable because 1) the display seems to come on sometimes but sometimes not, seems to be a backlight issue only though; and 2) the sdcard driver seems buggy and often corrupts my data, I have to use an sdcard through a usb adapter and that works...
<enunes>
do you also see those issues?
<anarsoul>
enunes: I'm using eMMC, haven't tried SD in a while
<anarsoul>
enunes: try to reseat connector of expansion board?
<anarsoul>
it's quite long
<enunes>
I have your patches from that tree but I apply them to a Fedora kernel... which should be pretty much mainline
<anarsoul>
enunes: well, display is pretty stable for me
<anarsoul>
is it 11" pinebook?
<enunes>
it's a 14" with the normal res (not hd?) display
<anarsoul>
make sure that you're using latest patches for anx6345 driver
<anarsoul>
however if u-boot fails to turn on display that's not it
<rellla>
anarsoul: indeed i should go for a pine64lts as i have no A64 device yet
<enunes>
anarsoul: uboot also fails to turn the display on sometimes, this seems to have improved with 2019.07
<rellla>
i wonder if i should give panfrost a try on my H6 ... though i fired it with libreelec yesterday...
<enunes>
are there patches for anx6345 which are not in your tree?
<anarsoul>
enunes: no, but I fixed up some of them ~3-6 months ago
<anarsoul>
so some patches may differ
<rellla>
anarsoul: do i always need ATF to build arm64-uboot or did i miss sth?
<anarsoul>
yes, you always need atf
<rellla>
ok, then everything is good.
jrmuizel has quit [Remote host closed the connection]
jrmuizel has joined #lima
kaspter has quit [Read error: Connection reset by peer]
jrmuizel has quit [Remote host closed the connection]
kaspter has joined #lima
<megi>
rellla: panfrost doesn't work very well on H6, yet
<megi>
it can run kmscube ;)
<megi>
it can run X11/i3wm desktop
<megi>
but it produces strange artifacts
<enunes>
anarsoul: rellla: I likely won't be around tomorrow even if the timeout MR gets reviews, also I think feedback from yuq would be helpful, so I'll just leave an endless loop of full piglit runs with that...
<megi>
and sometimes it locks up, and runs out of memory, especially with something more complicated like a web browser
<anarsoul>
enunes: sounds good
jrmuizel has joined #lima
jrmuizel has quit [Read error: Connection reset by peer]
jrmuizel has joined #lima
jrmuizel has quit [Read error: Connection reset by peer]
jrmuizel has joined #lima
<rellla>
anarsoul: btw i did some wiki work. those changes are what i meant yesterday :)