alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - https://gitlab.freedesktop.org/panfrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
<alyssa_> tomeu: kmscube doesn't work w/ DRM kernel
<alyssa_> Wondering why, since I know others have it working
<alyssa_> Getting "drmModeGetResources failed: Operation not supported"
<alyssa_> Oh, I had to force a --device in
<alyssa_> Maybe
<alyssa_> Yeah, just had to force a --device
BenG83 has quit [Quit: Leaving]
<alyssa_> ...How do point sprites work anyway
<alyssa_> OIC. That's terrible :D
<HdkR> Point sprites are great
<alyssa_> I guess I need to implement gl_PointCoord first
_whitelogger has joined #panfrost
shenghaoyang has joined #panfrost
shenghaoyang has quit [Remote host closed the connection]
<alyssa_> gl_PointCoord is really, really wacky
<alyssa_> First thing is that it automatically implies gl_PointSize, which is fine, but adds more complexity to the analysis since PointSize by itself is wacky
<HdkR> alyssa_: Is gl_PointSize not calculated similar to gl_Position?
<HdkR> er
<alyssa_> HdkR: Hence the wacky :p
<alyssa_> FragCoord?
<HdkR> FragCoord on the frag side I guess :p
<HdkR> and...PointCoord instead of PointSize
<HdkR> FragCoord == Screen space, PointCoord == uv space across the face of the point
<alyssa_> If only
<HdkR> Does it lower to a varying or something?
<alyssa_> Sorta
<alyssa_> Pseudovarying
<alyssa_> *blinks*
<alyssa_> So it does some arithmetic between a driver-supplied uniform and a hardware-supplied varying
<alyssa_> It's hard to disentangle those two..
<alyssa_> ...except the uniform is being set to identity, so gl_PointCoord is just equal to the varying in practice
<alyssa_> What am I missing here
<alyssa_> Identity... I can use some linear algebra to attack this, I think
<alyssa_> Indeed! It's not a vec4 uniform, it's a mat2!
<HdkR> woo
<alyssa_> Gosh, who'da thunk I'd actually use something from my math classes in real life? (^:
<alyssa_> ....Still not obvious how to separate the two
<alyssa_> Oh, we do know that U cannot depend on PointSize, since that's a vertex shader output and we're computing the matrix on the driver-side
<alyssa_> Obviously it can't depend on xf either
<alyssa_> (Since varying)
<alyssa_> Oh, and it can't even depend on xu so...
<alyssa_> Interesting
<alyssa_> The hardware has to handle the whole shebang
<alyssa_> Logically, the fixed-function hardware probably is doing the entire PointCoord calculation per the spec
<alyssa_> So this isn't a coefficient matrix so much as a linear transformation
<alyssa_> Hey, do other APIs invert the coordinate space?
* alyssa_ grabs Vulkan spec
<HdkR> lol
<alyssa_> It *does*!!
<alyssa_> ...not?
<alyssa_> The formulas look different from OpenGL but the description is the same
<alyssa_> Oh wait, but isn't the framebuffer in OpenGL flipped or something?
<HdkR> I think GL and Vulkan should match?
<HdkR> It's D3D that ends up being different
<alyssa_> D3D it is then
<alyssa_> They advertise support for D3D so
<alyssa_> Case closed.
<alyssa_> Findings: the "vec4 magic uniform" is actually a 2x2 matrix representing a linear transformation on the output of the fixed-function hardware pipeline. The hardware is natively Khronos, so for OpenGL, the identity matrix is used. But because of the offline shader compiler support (or maybe other reasons), they can't know the API ahead-of-time, so they *always* do the transformation such that they can set the uniform later to flip it if necessary.
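As a rough illustration of the findings above, here is a minimal sketch of a 2x2 matrix acting on a hardware-supplied point coordinate. The helper name `apply_mat2`, the `FLIP_Y` matrix, and the sample coordinate are assumptions made for this example; the log only establishes that the identity matrix is used for OpenGL.

```python
# Sketch only: a driver-set 2x2 matrix applied to a hardware-supplied 2D varying.
# FLIP_Y and hw_coord are hypothetical; only the identity case comes from the log.

def apply_mat2(m, v):
    """Multiply a 2x2 matrix (row-major nested tuples) by a 2-vector."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

IDENTITY = ((1.0, 0.0), (0.0, 1.0))   # the OpenGL case described in the log
FLIP_Y = ((1.0, 0.0), (0.0, -1.0))    # hypothetical matrix for a flipped convention

hw_coord = (0.25, -0.5)               # hypothetical hardware-supplied varying

print(apply_mat2(IDENTITY, hw_coord))  # coordinate passes through unchanged
print(apply_mat2(FLIP_Y, hw_coord))    # second component is negated
```

With the identity matrix the result equals the varying, which matches the observation that gl_PointCoord is "just equal to the varying in practice".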
<alyssa_> For us, that's strictly irrelevant since we're not writing a D3D driver :P
<alyssa_> (And if you used the nine st tracker or whatever, that would flip it for us)
<alyssa_> [ Well, we have to do our own transformation for Galliumish reasons but I digress ]
<alyssa_> Suddenly gl_PointCoord is feeling a lot less magical
<HdkR> There is a GL extension to flip it as well
<alyssa_> HdkR: There you go, then
<HdkR> Wonder if there will be a vulkan extension to flip it
shenghaoyang has joined #panfrost
davidlt has quit [Ping timeout: 245 seconds]
<alyssa_> Long story short, flip the lowest bit of unknown1 in the shader descriptor, add a RG16F fragment varying with the magic source (0x60 | LINEAR) and other fields zero, and load that varying in the fragment shader for GLES-style point coords
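The recipe in the previous message could be sketched roughly as follows. This is not the driver's code: the dictionary layout, the function name `enable_point_coord`, and above all the numeric value of `LINEAR` are placeholders invented for the example, since the log gives only `0x60 | LINEAR` and the `unknown1` field name.

```python
# Sketch only: field names other than "unknown1" and the value of LINEAR are hypothetical.
LINEAR = 0x100                      # placeholder value, NOT taken from the log
POINT_COORD_SOURCE = 0x60 | LINEAR  # "magic source" as described in the log

def enable_point_coord(shader_desc, varyings):
    """Return copies of the descriptor and varying list set up for point coords."""
    desc = dict(shader_desc)
    desc["unknown1"] ^= 0x1         # flip the lowest bit of unknown1
    new_varyings = list(varyings)
    new_varyings.append({
        "format": "RG16F",          # two half-float components
        "source": POINT_COORD_SOURCE,
        "other": 0,                 # all remaining fields left at zero
    })
    return desc, new_varyings

desc, varyings = enable_point_coord({"unknown1": 0x4}, [])
print(hex(desc["unknown1"]), varyings)
```

The fragment shader would then load the appended varying to obtain GLES-style point coordinates.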
<HdkR> neat
chewitt has joined #panfrost
shenghaoyang has quit [Remote host closed the connection]
davidlt has joined #panfrost
<tomeu> we have a healthy backlog this morning
davidlt has quit [Ping timeout: 246 seconds]
<tomeu> what's this list people are talking about?
<tomeu> I may be missing some backlog, first I found this morning is "frequency scaling? (I'm not sure if it's done in software on midgard/bifrost)"
chewitt has quit [Quit: Zzz..]
chewitt has joined #panfrost
<chewitt> @alyssa_ current mesa master + kbase appears to have regressed in the last week with changes
<chewitt> seeing some crashes when navigating around Kodi GUI
<chewitt> seeing lots of triangles
<chewitt> on the +ve memory management seems to be more stable
<chewitt> (crashes aside)
chrisf has quit [Quit: ZNC - 1.6.0 - http://znc.in]
griffinp has joined #panfrost
padovan has quit [Remote host closed the connection]
<robher> tomeu: todo file for panfrost (driver): https://www.irccloud.com/pastebin/fxGC36Eb/
<chewitt> robher: tomeu: do either of you have T820 hardware for testing?
<chewitt> i.e. Amlogic S912 board/box
<tomeu> don't know, tbh
<tomeu> was that on an exynos?
<chewitt> ISTR there was a T830 on an Exynos board
<chewitt> but I forget which
<chewitt> T820 is used in lots of Android TV box devices that I care about
<tomeu> I think the only other board with mali is an odroid-u2
<tomeu> that's probably too old to have a t820 I guess
<chewitt> ^ T830
<tomeu> yeah, at that point I didn't care about exynos any more :)
<tomeu> chewitt: why do you ask?
<chewitt> DRM driver is not currently working .. kbase works
<chewitt> @narmstrong started to investigate
<tomeu> ah cool, it's in good hands then :)
<chewitt> I also have spare S912 boards that need a good home .. I'd be happy to post some out
<chewitt> vendor had a "lost in translation" moment and sent me 10x of everything instead of 1x
<chewitt> i'm keen to promote that something other than Rockchip hardware exists :)
<tomeu> chewitt: are those dev boards with serial, etc?
<chewitt> yup
<chewitt> Khadas VIM2 basic
<tomeu> could be a good idea to add them to some LAVA lab that is part of kernelci
<tomeu> so I guess baylibre's or collabora's
<tomeu> will be handy once we start doing CI
<chewitt> I think the baylibre lab has them already
<tomeu> yeah, but when we run deqp, the workload will be much much higher
<tomeu> so we'll need to have several instances of each to distribute the load
<tomeu> I guess atm they do little more than booting kernels :)
<chewitt> I'm happy to post stuff, at the moment they are just sat in a box in a corner
<tomeu> I'm a bit unsure of why it wouldn't work on the T820, seems quite similar to the T860 from kbase's POV
<tomeu> wonder if it isn't due to the reset quirk in that amlogic SoC
<tomeu> but narmstrong knows about it
<tomeu> robher: regarding the TODO, guess it would make sense for me to take the first three
<tomeu> I think bbrezillon will take care of the last one (perf counters)
<tomeu> regarding testing on midgard variants, I need to get this working on the RK3288 (so T760)
<tomeu> out of the box is raising TRANSLATION_FAULT_LEVEL1 right away
<bbrezillon> tomeu: yep
<bbrezillon> already started working on this task
<tomeu> ok, then it's left bitfrost, compute and mmu stuff
<robher> chewitt: no, only have a rock960.
<tomeu> robher: guess if we get the igt submit test working (with a trivial clear), that could be enough to validate the bitfrost parts
<tomeu> no need to submit shaders or more complex cmdstream as the kernel doesn't care about those
<robher> And perhaps bi frost too. ;)
<tomeu> damn
<tomeu> I don't think I will fix that ever
<robher> the struggle is real
<robher> we can't all be avengers geeks
<tomeu> robher: what parts of compute do you think affect the kernel?
<tomeu> for gles compute I can only see passing a REQ so the kernel knows in which slot to put a job in
<robher> tomeu: we need a compute only req flag at a minimum
<robher> jinx
<tomeu> but wonder if the memory features of opencl and vulkan won't require uapi changes
<robher> I think 'compute only' may be just for CL contexts
<tomeu> robclark: do you know what we need to get right at this point so that once we do more advanced compute, we don't have to change uapi?
<robher> tomeu: speaking of uapi, did you see airlied's comments on lima review?
<robher> tomeu: basically, we should do userspace GPU VA management.
<robclark> tomeu, the one small thing I needed to add to uapi for clover was a way to get a bo's iova (but nothing was needed for gl compute shaders)
<tomeu> robclark: why was that needed?
<robclark> because of how pointers to global memory are passed to kernels..
<tomeu> robher: so mesa gets a GEM BO as big as the whole AS that it creates BOs from, and then the kernel allocates memory in page faults?
<robclark> clover builds up the input buffer (ie. function parameters to kernel function) and encodes pointer addresses in that
<tomeu> robclark: ah, cool, thanks
<robclark> it was a pretty trivial uabi bump.. just new value to GEM_INFO ioctl
<robclark> (at any rate, I'd say leaving room to extend uabi later is more important than getting everything from the beginning)
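To make robclark's point about clover concrete, here is a minimal sketch of packing a kernel's argument buffer with a global-memory pointer encoded as the BO's GPU address. The argument layout (one 64-bit pointer followed by one 32-bit count) and the name `pack_kernel_args` are assumptions for illustration, not clover's actual format.

```python
import struct

def pack_kernel_args(global_ptr_iova, count):
    """Pack hypothetical kernel arguments: a 64-bit GPU pointer, then a 32-bit count.

    "<QI" means little-endian, no padding: unsigned 64-bit, unsigned 32-bit.
    """
    return struct.pack("<QI", global_ptr_iova, count)

# The iova would come from the kernel driver (e.g. via an info query on the BO);
# the value here is made up.
buf = pack_kernel_args(0x1000_0000, 256)
print(len(buf), buf.hex())
```

Because the address is baked into the buffer, userspace has to know where the BO lives in the GPU address space, which is why the query was needed.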
<robher> tomeu: no, userspace maintains the address space map and tells the kernel what address a BO is at.
<tomeu> we have that one already
<tomeu> good point, though I guess it takes experience to know what to prepare for
<robher> tomeu: go read the lima driver thread...
<tomeu> that was my understanding from reading it, but I'm not sure how accurate it is
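A toy sketch of the userspace GPU VA management robher describes: userspace keeps the address space map and decides where each BO goes, then tells the kernel. The class name, the bump-allocation strategy, and the page size are assumptions for this example; a real driver would also handle freeing and reuse.

```python
PAGE_SIZE = 4096  # assumed page granularity for the example

class VaAllocator:
    """Toy bump allocator over a GPU virtual address range (never frees)."""

    def __init__(self, start, end):
        self.next = start
        self.end = end
        self.map = {}  # BO handle -> (gpu_va, size)

    def assign(self, bo_handle, size):
        # Round the size up to a whole number of pages.
        size = (size + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE
        if self.next + size > self.end:
            raise MemoryError("GPU address space exhausted")
        va = self.next
        self.next += size
        self.map[bo_handle] = (va, size)
        return va  # userspace would hand this address to the kernel

alloc = VaAllocator(0x1000, 0x10000)
print(hex(alloc.assign(1, 100)), hex(alloc.assign(2, 5000)))
```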
<narmstrong> robher: I think your TODO list should be pushed either to the panfrost/linux wiki or to a gitlab issue...
<tomeu> narmstrong: for mesa, you can use mainline
<robher> narmstrong: my plan was a TODO file in drivers/gpu/drm/panfrost/ like other drivers.
<narmstrong> tomeu: you mean mesa master ?
<tomeu> narmstrong: yes
<narmstrong> robher: it's also good
<narmstrong> tomeu: ok then, I'll retry with it
<tomeu> narmstrong: but, don't you need similar changes as for kbase? for this amlogic soc
<narmstrong> tomeu: I need some changes, but I need to test to check which changes
<narmstrong> tomeu: seems the SOFT_RESET is broken, and we need to use the in-soc reset lines, and seems we need the PWR_KEY and PWR_OVERRIDE1 magic values
<narmstrong> no idea why, and we may never know why
<tomeu> cool, so I think there's a good chance things will just work afterwards
<narmstrong> seems the automatic core power management is broken in HW
<narmstrong> hopefully...
<narmstrong> I was stuck with JOB_CONFIG_FAULT
shenghaoyang has joined #panfrost
<alyssa_> chewitt: What do you mean by "+ve"?
<chewitt> previously if you left the system-info screen open in Kodi the free mem counts down until crash
<chewitt> this no longer appears to happen
<chewitt> that's not necessarily due to changes in the last week, but in the last month
<alyssa_> I'll take a look I guess
<chewitt> since last weekend (last time I caught-up on patches) the GUI started showing noticeable artefacts
<chewitt> "zaggy triangles" in the words of my daughter
<alyssa_> I can reproduce this.
<chewitt> two other things to ensure a broader use-case in Kodi
<alyssa_> Guess I'll have to be the one to bisect..
<chewitt> under settings > skin settings > skin > configure skin .. there's an option to enable/disable "slide animations"
<chewitt> with them turned off .. I see more glitches
* alyssa_ tries something
<chewitt> also, do you have a fake database loaded so the library views have thumbnails and artwork?
<alyssa_> Mmhmm
<alyssa_> chewitt: Bug fixed
<alyssa_> Lemme submit the patch to the list so I don't forget about it
<chewitt> that's a bit quick :)
<tomeu> robher: do you have any ideas on why we would get TRANSLATION_FAULT_LEVEL1 on memory accesses on T760?
<alyssa_> chewitt: Patch on the list
* alyssa_ class
<chewitt> @alyssa_ https://www.dropbox.com/s/llciqryeqqk4t70/Test%20video%20library.zip?dl=0 has fake "files" that can be used to populate the library views
<robher> tomeu: cpu address != bus address ?
<robher> cpu phys addr that is.
<robher> Or mali is behind another iommu?
<robher> but this platform works with kbase?
<tomeu> robher: my understanding is that alyssa_ has been testing mesa on kbase on a similar machine
<tomeu> (this is a chromebook veyron)
<tomeu> the mmu seems to have been programmed correctly
<tomeu> oh wait, we are lacking a bit of error handling there
<tomeu> maybe we just don't know about it
<chewitt> @alyssa_ that fixes the glitches
<chewitt> seeing lots of crashing though .. here's systemd journal http://ix.io/1DAI
<tomeu> hmm, no, something should have been logged if there was an error programming the mmu
<tomeu> chewitt: aren't all crashes due to OOM?
<chewitt> historically i'd be able to use the GUI for some time before an OOM caused a restart
<chewitt> now it's being triggered by something specific .. i.e. can be 5 mins after starting, can be 10 secs after starting
<robher> tomeu: I don't see any differences that would matter...
<tomeu> ok, will run with kbase and log the register writes
<tomeu> some other day, though :)
<narmstrong> tomeu: what is the simplest way to log kbase's register writes ?
<tomeu> narmstrong: don't know yet how to do it, but maybe there's a debugfs entry for that?
<narmstrong> tomeu: no idea, I thought you knew
chrisf has joined #panfrost
<narmstrong> I remember someone talking about this... if you find out, tell me !
<tomeu> yeah, won't be today though :/
<tomeu> but if you find out, tell me! :)
<narmstrong> sure
shenghaoyang has quit [Remote host closed the connection]
belgin has joined #panfrost
<narmstrong> seems you need to echo 1 > regs_history_enabled in debugfs
Elpaulo1 has joined #panfrost
Elpaulo has quit [Ping timeout: 245 seconds]
Elpaulo1 is now known as Elpaulo
shenghaoyang has joined #panfrost
shenghaoyang has quit [Ping timeout: 252 seconds]
shenghaoyang has joined #panfrost
<Lyude> chewitt: I can provide hw for testing
<Lyude> if yall need
<Lyude> (can put serial/ssh access on my server)
<Lyude> (also @ tomeu )
belgin has quit [Quit: Leaving]
<narmstrong> just realized while tracing, mali_kbase calls pm_soft_reset between each freaking job submit, thus resetting the whole freaking GPU every time
<narmstrong> pm_soft_reset is for meson
<narmstrong> kbase_pm_do_reset calls it
LinguinePenguiny has joined #panfrost
QwertyChouskie has joined #panfrost
<chewitt> that sounds, err, less than ideal
<chewitt> nice find
QwertyChouskie has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
<alyssa_> robher: tomeu: The RK3288 (my t760 test machine) is a 32-bit system... which corresponds to some divergent code paths in the kernel
<alyssa_> The blob has some fancy tricks to conserve address space.. right now for kbase on Veyron, we're just always setting the SAME_VA flag for instance (or otherwise we get errors similar to the above since we're not handling divergent addresses in userspace, since on aarch64+kbase, they never bother with that since 64-bit is plenty)
<alyssa_> chewitt: Looks like all the crashes are OOM as tomeu pointed out..
<alyssa_> narmstrong: What in the world :p
<alyssa_> Sounds like a seriously borked errata workaround...?
<alyssa_> robher: But yeah, re "cpu address != bus address ?" for 32-bit that is indeed the case if SAME_VA isn't set, I think...?
<alyssa_> There are way too many address spaces to keep track of tbh
<chewitt> ^ this sounds like it's not worth me build/testing 32-bit images again yet ?
<alyssa_> chewitt: Not yet, no
<alyssa_> robher: kbase has the concept of memory zones, including a dedicated SAME_VA zone... I'm trying to understand what that corresponds to on the wire
<alyssa_> It may be necessary to do a reg dump of kbase on RK3288 or something..
stikonas has joined #panfrost
stikonas has quit [Remote host closed the connection]
shenghaoyang has quit [Remote host closed the connection]
<robher> alyssa_: I'm talking about something else. A cpu's view of memory (aka physical address) can be different than mali's (or any DMA master) physical addresses. Usually, it would just be some fixed offset.
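robher's fixed-offset case can be sketched as a simple window translation. The function name and all addresses below are hypothetical; on a real platform the offset would come from the platform description rather than being hard-coded.

```python
def cpu_phys_to_bus(cpu_phys, cpu_base, bus_base, size):
    """Translate a CPU physical address to the address a DMA master would use.

    Assumes a single window of `size` bytes that starts at `cpu_base` on the
    CPU side and at `bus_base` on the device side (a fixed offset).
    """
    if not cpu_base <= cpu_phys < cpu_base + size:
        raise ValueError("address outside the translated window")
    return cpu_phys - cpu_base + bus_base

# Hypothetical platform: RAM at 0x8000_0000 for the CPU, at 0x0 for the GPU.
print(hex(cpu_phys_to_bus(0x8000_1000, 0x8000_0000, 0x0, 0x4000_0000)))
```

If page tables were filled with CPU physical addresses on such a platform, the device would fault on addresses it cannot resolve.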
<hanetzer> alyssa_: btw, how are you booting on kevin? (I assume you're still using that)
QwertyChouskie has joined #panfrost
QwertyChouskie has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
LinguinePenguiny has quit [Ping timeout: 245 seconds]
<alyssa_> robher: Oh, I see.... good luck ;)
<alyssa_> hanetzer: depthcharge
QwertyChouskie has joined #panfrost
stikonas has joined #panfrost
* alyssa_ grumbles realising how much refactor is needed of this bad code
Kwiboo has quit [Quit: .]
Kwiboo has joined #panfrost
<hanetzer> alyssa_: nice. did you do the gbb_flags thing to make the dev-mode wait shorter and make it not beep?
cwabbott has quit [Quit: cwabbott]
cwabbott has joined #panfrost
cwabbott has quit [Client Quit]
cwabbott has joined #panfrost
<alyssa_> hanetzer: No?