alyssa changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://freenode.irclog.whitequark.org/panfrost - <daniels> avoiding X is a huge feature
icecream95 has joined #panfrost
nerdboy has quit [Excess Flood]
* macc24 got gnome running on duet
<macc24> aside from the fact that it's super slow, it's fine
<HdkR> woo
alyssa has quit [Remote host closed the connection]
<macc24> it turns out that gdm dislikes /etc/default/locales being empty
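A minimal sketch of the fix, assuming a Debian-style image where the file is /etc/default/locale (the exact path and name vary per distro):
    # run as root; pick whatever locale you actually want
    echo 'LANG=C.UTF-8' > /etc/default/locale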
icecream95 has quit [Ping timeout: 260 seconds]
<HdkR> Who needs a locale other than "C"? :P
<macc24> gdm needs it!
<macc24> for whatever reason
* macc24 thinks that gnome is cursed permanently
* HdkR looks at gnome-shell
nerdboy has joined #panfrost
<HdkR> yep
<macc24> on the other hand, phosh runs pretty well
<macc24> and is near the edge of usability, aside from the fact that there are still some issues
tomboy64 has quit [Remote host closed the connection]
tomboy64 has joined #panfrost
justin3 has quit [Ping timeout: 265 seconds]
icecream95 has joined #panfrost
stikonas has quit [Remote host closed the connection]
popolon has quit [Quit: WeeChat 3.0]
<kinkinkijkin> writing a hacky attempt at fixing this
<kinkinkijkin> just changing a magic constant to see if its fixed by changing that
<kinkinkijkin> and some other changes im testing
<kinkinkijkin> but the most important test is that there magic number
robmur01 has quit [Read error: Connection reset by peer]
robmur01 has joined #panfrost
<chewitt> kinkinkijkin share the patch once you've hacked it, I have an XU4 image for my distro that deadlocks trying to start kodi .. i'm always keen on t6xx experiments
karolherbst has quit [Remote host closed the connection]
karolherbst has joined #panfrost
kaspter has joined #panfrost
vstehle has quit [Ping timeout: 256 seconds]
bbrezillon has quit [Ping timeout: 272 seconds]
megi has joined #panfrost
kaspter has quit [Ping timeout: 256 seconds]
kaspter has joined #panfrost
archetech has quit [Quit: Konversation terminated!]
kaspter has quit [Ping timeout: 265 seconds]
kaspter has joined #panfrost
<kinkinkijkin> of course i am getting an issue with permissions
nerdboy has quit [Ping timeout: 256 seconds]
nerdboy has joined #panfrost
archetech has joined #panfrost
megi has quit [Ping timeout: 260 seconds]
megi has joined #panfrost
<kinkinkijkin> okay, fix confirmed bootable but i havent gotten panfrost to work at all on a working kernel yet so im installing 5.10
<kinkinkijkin> (it's only on the mesa side, didn't cause any issues loading on a crashing kernel version)
kaspter has quit [Ping timeout: 265 seconds]
kaspter has joined #panfrost
<kinkinkijkin> okay, i have it booting armbian with the new kernel
<kinkinkijkin> okay, i need to reconfigure that kernel
<archetech> N2 is what needs attn. G52
<kinkinkijkin> okay, i can successfully hang the system very reproducibly
<kinkinkijkin> downside: reading logs is one of the things that hangs the system
<HdkR> archetech: Random morale boost?
<archetech> you need to back off
<kinkinkijkin> now what makes you say that, archetech?
<archetech> that was at this HdkR guy who likes to harass me
<kinkinkijkin> don't make demands in dev chats then
<archetech> demand?
<archetech> how was that a demand
<kinkinkijkin> and he wasnt harassing, he was checking if it was a demand or not
<kinkinkijkin> you just entered and told us what apparently needs attention
<archetech> its a comment
<archetech> and an opinion
<kinkinkijkin> yes, and a very useless, brash comment
<HdkR> harassment claims are something that need to be taken seriously. You made an unrelated comment without any context.
<HdkR> Is there something specific about G52 that needs attention? Bifrost work is already underway, with G3x and G5x being the targets of choice.
<HdkR> I recommend filing an issue at https://gitlab.freedesktop.org/mesa/mesa/-/issues for specific features or reproducible problems for developers to triage successfully.
<archetech> so this irc chan isnt for users of panfrost and errors they get?
<archetech> just devs ?
<kinkinkijkin> give the error
<kinkinkijkin> not "fix this device"
<HdkR> This channel can be used for discussion yes. It just needs context rather than a comment drop without context
<archetech> putting words in my mouth now eh
<archetech> context is obvious I said G52
<HdkR> G52 is in active development. There of course will be issues in the driver. What problem are you encountering this day?
<archetech> G52 also hangs/freezes, which is also in the context of what kinkinkijkin was seeing
<kinkinkijkin> im testing fixes for a specific device.
<archetech> thats my problem
<archetech> lots of n2 owners problem
<kinkinkijkin> so you've just come to tell us to divert all resources to a device that's already being worked on very heavily
<kinkinkijkin> driver work takes a long time, and not everyone working on this driver has all that much time, or an n2 for that matter
<archetech> wow you're defensive
<kinkinkijkin> and keep in mind a good handful of the developers involved with this driver are working for free or on individual sponsorship, though a lot of the work is done by folk employed to work on it
<archetech> yes Im aware of what dev work is like
<kinkinkijkin> like I'm working on fixes right now with individual sponsorship, do not own an n2, and have 3 other projects im split between, two of which generated my individual sponsorship
<archetech> please continue on
<archetech> no need to educate me. for a simple g52 comment
<HdkR> archetech: dmesg or kernel logs should give information about what the problem is. Make sure to open a bug report with this information so it's recorded for developers who aren't currently online to see when they come back
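A sketch of the kind of information worth attaching, assuming util-linux dmesg and mesa-utils are installed:
    dmesg | grep -iE 'panfrost|mali' > panfrost-dmesg.txt   # kernel-side faults and timeouts
    uname -a > versions.txt                                 # kernel version
    glxinfo -B >> versions.txt                              # mesa version and renderer string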
<archetech> right. will do
<kinkinkijkin> thanks, a bug report will do better yeah
<archetech> I do debugging for qt5 kde frameworks and plasma
<archetech> so far its not fun to run it on my N2+
<HdkR> Make sure to have reproduction steps in the issue of course, otherwise they won't know how to repro
<archetech> yup. I know ya need the distro pkg versions etc
<kinkinkijkin> i recognized your name, nice to know you guys over there are putting in some work on getting plasma running on panfrost
<archetech> what im doing when it freezes. etc
<archetech> im not a dev there, I build from source for the Linux From Scratch project
<kinkinkijkin> i ran plasma on libmali on an xu4 once, so far i'm the only recorded person ive found to even attempt it, wasnt aware other people would ever try running plasma on a device without advertised desktop gl
<kinkinkijkin> btw it wasn't usable, but it booted, in wayland
<archetech> exactly. that's what I'm troubleshooting for last 2-3 months.
<archetech> diff kernels mesa versions etc
<kinkinkijkin> when i did it with libmali, it required a version of libgbm with a set of patches i made that i lost :'D
<archetech> my tests are for helping armbian and odroid ubu 20.10
<archetech> not just myself or lfs
<archetech> libmali is fine for running gnome on wayland. it works well
<archetech> but panfrost is faster and is close to running plasma
<archetech> best libmali can do is run plasma on softpipe
<kinkinkijkin> suggestion: wait for more complete desktop gl support, use latest master of mesa and latest kernel rc (if it works on n2), and exercise patience / report crashes as verbosely as possible
<archetech> thats what ive been doing all along
<kinkinkijkin> not much you can do but help
<archetech> I can give reports on progress that I do on other ircs and forums
<kinkinkijkin> the most helpful thing right now would be providing an extra hand in the code
<archetech> I wish. no coding experience just configs and compiles
<kinkinkijkin> always time to learn
<archetech> I took an assembly class once, wasnt my cup of tea but was interesting
<archetech> I've read the Collabora blog too. good articles
<archetech> so im not the enemy. carry on :)
kaspter has quit [Ping timeout: 240 seconds]
kaspter has joined #panfrost
<kinkinkijkin> okay, so the hanging issue looked really similar to the one i had trying to compile a preemptible kernel with hmp awareness a few years back, and hmp awareness is still marked as experimental, so i turned that off, as well as a couple options that would screw with a non-hmp-aware scheduler on an hmp system
<kinkinkijkin> probably not in ways that actually matter, but it's always better to err on the safe side rather than waste an extra two hours reconfiguring, recompiling, and reinstalling the kernel over and over again
<archetech> <alyssa> daniels: Hoping to have the Bifrost scheduler for Christmas. Code name: Santa Clause. Scheduler is the piece G52 needs. Ill wait for the sleigh
youcai has quit [Ping timeout: 264 seconds]
ezequielg has quit [Read error: Connection reset by peer]
camus has joined #panfrost
kaspter has quit [Ping timeout: 258 seconds]
camus is now known as kaspter
ezequielg has joined #panfrost
youcai has joined #panfrost
<kinkinkijkin> hanging hasnt stopped
<kinkinkijkin> hmm
camus has joined #panfrost
kaspter has quit [Ping timeout: 256 seconds]
camus is now known as kaspter
vstehle has joined #panfrost
archetech has quit [Quit: Textual IRC Client: www.textualapp.com]
davidlt has joined #panfrost
<kinkinkijkin> local developer finds forbidden knowledge to halve kernel compilation time, click here to find out this one easy trick that is guaranteed to make recompiling your kernel more enjoyable
<kinkinkijkin> click the link and it just leads to a page containing instructions for disabling nouveau
<HdkR> Is it opening newegg.com and buying a Ryzen 5950X? :P
<kinkinkijkin> silly that's more than a halving
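For the curious, a minimal sketch of the "one easy trick", assuming you build from a kernel tree where scripts/config is available:
    cd linux
    ./scripts/config --disable DRM_NOUVEAU   # drop nouveau from the build
    make olddefconfig                        # resolve any dependent options
    make -j"$(nproc)"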
<kinkinkijkin> i like doing all my kernel compilation on-device
<kinkinkijkin> it's not because i dont want to learn how to crosscompile who told you that
<kinkinkijkin> it's because it forms a tighter bond between me and my hardware
<HdkR> At least the Linux kernel is one of the easier projects to cross-compile :D
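A minimal sketch of doing that for a 32-bit board like the XU4 from an x86 machine; the arm-linux-gnueabihf- toolchain name and exynos_defconfig starting point are assumptions:
    make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- exynos_defconfig
    make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j"$(nproc)" zImage dtbs modules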
<kinkinkijkin> i kinda wish nouveau wasn't absolutely massive btw
<kinkinkijkin> nouveau compilation was taking 53% of my kernel compilation time before
<kinkinkijkin> it's bigger than a barely-stripped-down kernel
<kinkinkijkin> probably required though since nvidia doesn't like cooperating and i feel like it probably contains unique bespoke firmwares for every device it supports
<kinkinkijkin> hdkr, how about those new AMD gpus? heard one of them was a little ridiculous
<kinkinkijkin> but also abandonment of hbm boo
<kinkinkijkin> (dw i know why, mostly joking)
<HdkR> ridiculous in what way?
<kinkinkijkin> power compared to their recent offerings overall
<HdkR> The performance is pretty good compared to Nvidia for gaming workloads
<HdkR> Unless you want to play RT games anyway
<kinkinkijkin> i understand hbm was causing pricing issues for high-perf gaming cards potentially
<HdkR> HBM is more expensive than GDDR yea
<kinkinkijkin> since it has to be printed at the same time with the same process as the core
<HdkR> ehhhh, it's on the same silicon substrate, doesn't necessarily need to match the process
<kinkinkijkin> leading to massive wafers, more material loss for each bunk wafer, more possible defect points
<kinkinkijkin> well
<HdkR> They are still separate silicon dies
<kinkinkijkin> oh really?
<kinkinkijkin> hmm
<HdkR> See how there is a physical separation
<kinkinkijkin> i hope amd sticks to hbm for commercial and workspace offerings going forward though
<kinkinkijkin> where the pricing issue is a little less of an issue
kaspter has quit [Ping timeout: 258 seconds]
camus has joined #panfrost
<HdkR> Depends on the market, GDDR6X is still really high bandwidth if your bus is wide enough
<HdkR> and if they tied a larger memory bus to the Infinity Cache idea then it may be good enough for a lot of work loads
<kinkinkijkin> i know for ocl raytracing specifically hbm is a ridiculous advantage
<HdkR> Yea, Ray tracing is almost entirely bandwidth bounded
camus is now known as kaspter
<HdkR> both to vram AND to caches
<kinkinkijkin> ex. radeon vii being the fastest card available for luxcore
<kinkinkijkin> 16gb of ludicrously low-latency high-bandwidth memory is a huge boon
<HdkR> Except Nvidia's RTCore acceleration in Luxcore crushes it on perf there
<HdkR> So hard sell for professionals even if they want to go AMD
<kinkinkijkin> hmm, actually havent used luxcore recently enough to know they support that now
<kinkinkijkin> neat
<HdkR> Yea, they added OptiX support in....2.5?
<kinkinkijkin> was about to dip for a second since my kernel finished building, but ive got a non-booter
<HdkR> Better one since it has the 6900 XT
<kinkinkijkin> now that's a crush
<HdkR> Sadly anything that manages to go over the infinity cache size will fall off a cliff
<HdkR> Which is why some benchmarks of the 6900XT showed only marginal performance loss going from 1440p to 4k. 1440p had already fallen off
<kinkinkijkin> kernel building again, think ive figured out the non-booting
<kinkinkijkin> silly me forgot the scheduler doesnt like preemption on the xu4
<HdkR> :D
<kinkinkijkin> next to see if i figured out the hanging, the hanging resembles an unstable overclock hang
<kinkinkijkin> but the device isnt over or under clocked
<kinkinkijkin> could just be that my xu4 is getting a little older than optimal and came with a small voltage regulator issue
<kinkinkijkin> within tolerances for shipment and not at all likely to become an issue (vr in question is for the hdmi phy) but it's worried me forever
<HdkR> https://imgur.com/a/O0tGklE Fun bench result of a game falling out of AMD's cache :D
<kinkinkijkin> 19 minute kernel build, impressive reduction by removing nouveau
<kinkinkijkin> aaaaaand i have to do it again
camus has joined #panfrost
kaspter has quit [Ping timeout: 260 seconds]
camus is now known as kaspter
<kinkinkijkin> 20 minutes that time
bbrezillon has joined #panfrost
<kinkinkijkin> okay, got the random hangs stopped
<kinkinkijkin> now the only hang left is starting sway
<kinkinkijkin> alright, my fix doesnt cause corruption and seems to stop the flickering, i just cant get the device to actually keep from hanging, it's a coinflip every time i open a graphical application
<kinkinkijkin> has been since 5.9
<kinkinkijkin> successfully opened sway and terminator, then terminator loading bash hung the system
<bbrezillon> kinkinkijkin: do you have kernel traces?
<bbrezillon> page faults, job timeouts, ...
<kinkinkijkin> not that i know of, tell me how to collect these and ill get one immediately
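One way to collect them, assuming the board is reachable over ssh or a serial console so the log survives the hang:
    dmesg --follow | tee panfrost-trace.log    # or: journalctl -k -f
    # GPU page faults and job timeouts show up as kernel lines mentioning 'panfrost'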
<kinkinkijkin> fun note is that i never got as far as terminator without my fix, and i see no more flickering in sway when moving mouse with this fix
<kinkinkijkin> it was EXACTLY what i thought it would be, a buffer was being improperly sized due to host differences
<kinkinkijkin> i just Used A Magic Number to double the size of the buffer and the flickering went away and the system is slightly more stable
<HdkR> =o
icecream95 has quit [Ping timeout: 258 seconds]
<kinkinkijkin> oop the random hang came back
bbrezillon has quit [Read error: Connection reset by peer]
bbrezillon has joined #panfrost
<bbrezillon> kinkinkijkin: none of that should hang the system though
<bbrezillon> if you have GPU faults/timeouts they should appear in the kernel logs
<kinkinkijkin> every hang is punctuated with just uh
<kinkinkijkin> vdd_ldo12: disabling
<kinkinkijkin> and
<kinkinkijkin> vdd_g3d: disabling
<kinkinkijkin> absolutely no other info at crash time
<kinkinkijkin> both of these are xu4-specific names so ill have to go ask in #odroid
<bbrezillon> can you try to disable runtime PM?
<bbrezillon> echo -1 > /sys/devices/platform/ff9a0000.gpu/power/autosuspend_delay_ms
<bbrezillon> the path should be slightly different though
<kinkinkijkin> echo: write error: input/output error
<kinkinkijkin> from root
<bbrezillon> find /sys -name autosuspend_delay_ms|grep gpu
<kinkinkijkin> instant hang
<kinkinkijkin> going to check kern.log again now
<bbrezillon> what, the find made it hang?
<kinkinkijkin> no, setting it successfully
<kinkinkijkin> reproducible
<kinkinkijkin> is there a default setting for this in kernel config? ill set it and rebuild and see if that helps
nlhowell has joined #panfrost
camus has joined #panfrost
kaspter has quit [Ping timeout: 260 seconds]
camus is now known as kaspter
<bbrezillon> for setting what?
<kinkinkijkin> autosuspend_delay_ms
<bbrezillon> not that I know of, but you can hack the driver to disable runtime-PM
<bbrezillon> but you said it was not helping, right?
<kinkinkijkin> it was hanging immediately when changing that value
<kinkinkijkin> so it's something to do with pm somewhere, whether that's in panfrost or elsewhere
nlhowell has quit [Remote host closed the connection]
<kinkinkijkin> which file should i go to in order to hack this out?
<kinkinkijkin> mightve found it, panfrost_devfreq.c, at void panfrost_devfreq_suspend?
nlhowell has joined #panfrost
<kinkinkijkin> compiling with that function replaced with a dummy, since it does seem to be what is called for autosuspend, though it's a bit brutish
davidlt has quit [Ping timeout: 240 seconds]
<bbrezillon> kinkinkijkin: that should do the trick => https://gitlab.freedesktop.org/-/snippets/1352
<bbrezillon> but we should also investigate on why suspend/resume cause those hangs :)
<bbrezillon> tomeu: ^
<kinkinkijkin> that's no longer crashing now
<kinkinkijkin> all that's left is the random hangs
<kinkinkijkin> wait, i solved the random hangs somehow, turns out my fix is now causing a hang it seems
stikonas has joined #panfrost
<kinkinkijkin> weird, fix slaughtered any notion of stability but fixed the bug, with a single magic number
<kinkinkijkin> also resulted in a massive perf improvement
raster has joined #panfrost
<kinkinkijkin> okay, the bug is where i thought it was but it wasn't what i thought it was
<kinkinkijkin> hardware bug of some sort
<kinkinkijkin> or a quirk
<kinkinkijkin> it's fixed with something extremely similar to the fix for what i thought it was though
<kinkinkijkin> i dont quite know how to make diffs by hand (never had a reason to), this is going to be hard to get across
<urjaman> ... yeah that's why there's a program for that? o.O
<kinkinkijkin> no i mean
<kinkinkijkin> i havent run the program by hand
<kinkinkijkin> always had git do it for me
<urjaman> diff -u file1 file2
<urjaman> (but yeah git makes it easier, just have the stuff be in git and life gets a lot easier...)
<urjaman> I tend to stuff things into git (just git init, and git add ., commit that as initial state and off to hacking) even if it natively isnt, just to know what i changed
<daniels> also `git diff` can output a diff ...
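A sketch of that throwaway-git workflow, run from the top of the tree being hacked on:
    git init
    git add .
    git commit -m 'initial state'
    # ...hack away...
    git diff > my-hack.patch    # or, without git: diff -u file.orig file.new > my-hack.patch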
<kinkinkijkin> with my hack, there's no flickering but buffering too much data results in a hang
camus has joined #panfrost
kaspter has quit [Ping timeout: 260 seconds]
camus is now known as kaspter
<kinkinkijkin> going off
davidlt has joined #panfrost
chewitt has quit [Quit: Zzz..]
alpernebbi has joined #panfrost
chewitt has joined #panfrost
<robmur01> re XU4: anything involving "half of" anything instantly makes me suspicious of caching issues - T628 with two core groups is the weirdo where half the GPU isn't cache-coherent with the other half
<robmur01> there are potentially some flushes that we might need to do there that we'd never need to do on anything else
nlhowell has quit [Ping timeout: 260 seconds]
<stepri01> robmur01: Yeah - there's a TODO about that in panfrost_job_write_affinity(): "Eventually we may need to support [...] h/w with multiple (2) coherent core groups"
<robmur01> stepri01: true, I guess scheduling data-dependent jobs behind the same L2 is probably more desirable than brute-force flushing both L2s all the time :)
<robmur01> (I failed to consider that we also have control of that...)
<stepri01> ideally user space and kernel work together on it. where jobs don't have data dependencies they can be run on all cores (with occasional flushes as necessary), but sometimes there are data dependencies in which case it's best to restrict to a coherent set of cores
<stepri01> kbase gives the options to user space to work out how to handle it
<stepri01> I haven't looked into what Panfrost user space does. I suspect the main issue is the vertex shading to tiler coherency (the tiler being in the first core group)
alyssa has joined #panfrost
patrik has joined #panfrost
chewitt has quit [Ping timeout: 240 seconds]
kaspter has quit [Ping timeout: 258 seconds]
kaspter has joined #panfrost
<macc24> kinkinkijkin: yeah defconfig for arm64 is quite big
<macc24> this makes pretty small kernel, it's hopefully bare minimum + mtk drivers, https://github.com/Maccraft123/Cadmium/blob/master/kernel/config-duet
kaspter has quit [Ping timeout: 265 seconds]
raster has quit [Quit: Gettin' stinky!]
raster has joined #panfrost
robmur01_ has joined #panfrost
<macc24> robmur01_: wouldn't cache coherency decrease performance due to the effective cache size being lowered, if all cores have a local copy of the other cores' caches?
robmur01 has quit [Ping timeout: 258 seconds]
chewitt has joined #panfrost
robmur01_ is now known as robmur01
<robmur01> macc24: I don't think you have the right idea of how coherency works :/
<robmur01> if two cores are working on the same data, there's no "effective size" difference between both holding a line in their own cache without the other's knowledge, and both holding a line in their own cache in a shared state with the ability to snoop updates from each other
raster has quit [Quit: Gettin' stinky!]
raster has joined #panfrost
<macc24> okay
<robmur01> if only one core is using the data, it can still hold that line in a unique state by itself - it only gets shared (and thus copied) if somebody else actually needs it at the same time
<robmur01> (also note that anything I say about coherency is likely to be a mishmash of AMBA ACE terminology which almost certainly doesn't represent what any GPU is using internally...)
* robmur01 is still "interconnect guy" far more than "GPU guy"
<tomeu> bbrezillon: guess it should be easy to come up with a test case for igt that reproduces that
davidlt has quit [Ping timeout: 240 seconds]
bschiett has joined #panfrost
kaspter has joined #panfrost
<bschiett> hi all, trying kmscube with panfrost gives me this, any ideas?
<bschiett> /dev/dri/card0 does not look like a modeset device
<bschiett> drmModeGetResources failed: Operation not supported
<bschiett> failed to initialize legacy DRM
<bschiett> using stable kernel 5.9.12 with buildroot 2020.08.2
<macc24> try other debices using -d parameter
<bschiett> @macc24 I have card0 and renderD128 in /dev/dri, that's all
<macc24> what device do you have?
<bschiett> @macc24 rk3288
<bschiett> so mali T760 MP4
<bschiett> see https://pastebin.com/T98KyCSD and https://pastebin.com/kdfv14Gc for strace kmscube
<macc24> does /sys/devices/platform/ffa30000.gpu exist?
<bschiett> @macc24 yes
<macc24> how about /sys/module/panfrost?
<bschiett> exists also
<macc24> what's in /sys/devices/platform/ffa30000.gpu/of_node/status ?
<bschiett> okay
<macc24> do you have vgem enabled in kernel config?
<macc24> and what dts is your device using?
<bschiett> checking for vgem
<bschiett> DRM_VGEM is not enabled in kernel. should it be?
<macc24> i think yes
<macc24> what dts is your device using?
<bschiett> I have a custom dts for my board based on rk3288-firefly-reload-core.dtsi
<robmur01> AFAIK you should have two /dev/dri/card<n> entries, one for the display (which is the one kmscube is looking for) plus another for the GPU
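A quick way to check which driver sits behind each node, assuming sysfs is mounted in the usual place:
    for c in /dev/dri/card*; do
        echo "$c -> $(readlink /sys/class/drm/${c##*/}/device/driver)"
    done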
<bschiett> if you check https://pastebin.com/T98KyCSD you can see I still have an issue with binding my lvds driver, I'm not sure if this could be the reason it's not working?
<robmur01> I'm slightly puzzled how you could have a display at all without the DRM driver having registered properly :/
<macc24> robmur01: simple-framebuffer?
<bschiett> I am going to add DRM_VGEM and then report back, give me a few minutes
<macc24> bschiett: can you set ROCKCHIP_LVDS to n too?
<bschiett> @macc24 will do
<macc24> it's near rockchip drm options
<robmur01> macc24: aha, yes, that would make sense :)
<robmur01> OK, so it sounds like it's purely an issue with getting rockchip-drm to probe at all, and nothing to do with panfrost (which *has* worked just fine)
<macc24> robmur01: my guess is that rockchipdrm needs lvds to exist if it has support for lvds compiled into it
<robmur01> bschiett: do you have the DT graph entries describing the connection between VOP and LVDS? That's most likely what the "can't find port" thing is about
<robmur01> they might need some massaging around the VOPB/VOPL shenanigans
<bschiett> ok here is the dmesg output - https://pastebin.com/JxWxymU6
<bschiett> DRM_VGEM enabled and LVDS disabled
<bschiett> @robmur01 I previously had my lvds stuff working but I had hacked the timings into one of the simple panel modules, and I now upgraded to 5.9.12 and still need to figure out how to properly add my LVDS timings for my display without hacking into any of the drivers and do it properly all in the DTS.
<bbrezillon> tomeu: sure, anyone volunteering? :)
<robmur01> OK, that's probably good, but I guess it's still possible that something's changed WRT the endpoint parsing. That -EINVAL still seems most likely to stem from DT stuff to me
<bschiett> @robmur01 are you talking about this line in the dmesg output? [ 0.964715] rockchip-drm display-subsystem: master bind failed: -22
<bschiett> @robmur01 here are the lvds related nodes - https://pastebin.com/FHX0sguw
nerdboy has quit [Ping timeout: 256 seconds]
<bschiett> @robmur01 the connection between VOP and LVDS seems to be in rk3288.dtsi, and VOPB/VOPL are enabled in rk3288-firefly-reload-core.dtsi
<bschiett> @robmur01 in my DTS I enable the LVDS node
<alyssa> dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_mag is evil
<kinkinkijkin> if it's a cache issue then it makes sense with my hack
nlhowell has joined #panfrost
<robmur01> bschiett: according to the bindings you need a further graph edge between the LVDS controller and the panel as well - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/display/rockchip/rockchip-lvds.txt
<robmur01> that appears to be what rockchip_lvds is looking for and complaining about
<bschiett> @robmur01 yeah, I can see that ports { ... } is missing so I just added that and I'm recompiling.
<robmur01> I guess the overall initialisation failure is just because the VOP finds no valid outputs to bind to
<kinkinkijkin> my hack just doubles the length of elements and buffers in memory; in this case what my hack would be doing is providing one core group with 0s and the other with data, rather than splitting the data down the middle between core groups and then handing out unintended memory
<kinkinkijkin> er, correction, doubles the size of allocations
<kinkinkijkin> except, not every allocation, which would explain why buffering too much data leads to a hang
<kinkinkijkin> the hang might just be a panic from the panfrost driver segfaulting
<bschiett> @robmur01 I now have this - [ 0.964437] rockchip-lvds ff96c000.lvds: [drm:rockchip_lvds_bind] *ERROR* failed to find panel and bridge node
<kinkinkijkin> okay, i need to figure out everything i need to double up with 0s and i may have a suggestion for an actual patch in a few hours
<macc24> kinkinkijkin: \o/
<kinkinkijkin> well, not an actual patch, a temp hack that can actually be patched in
<kinkinkijkin> but still
<robmur01> bschiett: at this point it's probably one for #dri-devel and/or #linux-rockchip - I don't have any actual experience with the LVDS driver or how it's supposed to work in general
<robmur01> my only guess would be to try making sure the panel driver has probed first
<robmur01> it appears it *should* probe-defer and wait for one, but I'm just reading code at face value here
<bschiett> @robmur01 ok thanks, i'll do some more hunting
davidlt has joined #panfrost
BorgCuba has joined #panfrost
patrik has quit [Quit: Leaving]
Green has quit [Quit: Ping timeout (120 seconds)]
Green has joined #panfrost
<tomeu> bbrezillon: I wouldn't mind working on it :)
<kinkinkijkin> kernel 5.4, panfrost userspace side refuses to load, kernel 5.10 kernel panic trying to use more than a tiny amount of data
<kinkinkijkin> doubling the size of various buffers has a different effect depending on the buffer, cant quite remember all the ones i tested since most just caused instant hanging
<kinkinkijkin> my best guess is that the separate core groups only need two caches for *a couple* things.
<kinkinkijkin> still hanging when terminator tries to load zsh regardless of whether the hack is in place or not, but doubling the size of (MALI_ATTRIBUTE_LENGTH * vs->attribute_count) in creation of struct panfrost_ptr T in function mali_ptr panfrost_emit_vertex_data removes most of the flickering and allows terminator to load before crashing
<kinkinkijkin> inside of pan_cmdstream.c
<alyssa> ---Hmm
<kinkinkijkin> OH
<kinkinkijkin> got it to load zsh successfully
<alyssa> :)
<alyssa> L1253 pan_cmdstream.c has a hack for bifrost
<alyssa> what happens if you do that on your board too? (it should be harmless, but might help? idk)
<kinkinkijkin> flickering persists but it's not affecting half of the renderables now with the current setting
<kinkinkijkin> i will try that
<kinkinkijkin> L1253, which function is that in and what is the first word on that line? i am using gnu nano because installing vim is hard when your wifi drivers don't work
<alyssa> emit_vertex_data
<kinkinkijkin> alright
<alyssa> pan_pack(&bufs[k]...)
<alyssa> "We need an empty.."
<kinkinkijkin> trying that out
<kinkinkijkin> no change in behaviour, lemme try that removing my fix
<kinkinkijkin> great, my rtc has stopped working
<alyssa> You know, this would be easier if I didn't care about it working.. :p
<kinkinkijkin> rtc stopped working independent of this
<kinkinkijkin> no change in behaviour from using that line
<kinkinkijkin> with or without my fix
<alyssa> Ack
archetech has joined #panfrost
<kinkinkijkin> i got an error on-screen trying to run x out of curiosity
<kinkinkijkin> gpu sched timeout
<kinkinkijkin> gpu soft reset timeout
<kinkinkijkin> and a bunch of trace info in binary form
<kinkinkijkin> x actually served a useful purpose
<kinkinkijkin> the deadlock is from the scheduler failing to stop cpus during kernel panic
<kinkinkijkin> also getting a fair few messages in this tracelog about
<kinkinkijkin> cpu idle driver crashing
<kinkinkijkin> alyssa: bbrezillon tomeu idk if this is completely the right pings, got a tracelog on screen just now, see above
<kinkinkijkin> i can reproduce this and take video with my chromebook
kaspter has quit [Quit: kaspter]
stikonas has quit [Ping timeout: 272 seconds]
stikonas has joined #panfrost
alpernebbi has quit [Quit: alpernebbi]
davidlt has quit [Ping timeout: 240 seconds]
<kinkinkijkin> might as well get video of sway too
cphealy_ has quit [Remote host closed the connection]
raster has quit [Quit: Gettin' stinky!]
stikonas has quit [Ping timeout: 272 seconds]
stikonas has joined #panfrost
archetech has quit [Quit: Konversation terminated!]
<kinkinkijkin> looking through historical errors with the same set of errors on kernel panic, looks related to devfreq
rando25892 has quit [Ping timeout: 260 seconds]
archetech has joined #panfrost
rando25892 has joined #panfrost
archetech has quit [Read error: Connection reset by peer]
archetech has joined #panfrost
<bschiett> @robmur01 @macc24 got it working by using panel-lvds in my dts file :-) now back to panfrost :-)
<bschiett> still getting this though ... not sure what i'm missing in my mesa config in buildroot?
<bschiett> [root@rockchip:/tmp]# kmscube
<bschiett> MESA-LOADER: failed to open rockchip (search paths /usr/lib/dri)
<bschiett> failed to load driver: rockchip
<bschiett> MESA-LOADER: failed to open kms_swrast (search paths /usr/lib/dri)
<bschiett> failed to load driver: kms_swrast
<bschiett> MESA-LOADER: failed to open swrast (search paths /usr/lib/dri)
<bschiett> failed to load swrast driver
<bschiett> Segmentation fault
<bschiett> [root@rockchip:/tmp]# ls /usr/lib/dri
<bschiett> panfrost_dri.so
<bschiett> [root@rockchip:/tmp]#
<urjaman> i think kmsro needs to be enabled too? (that makes the rockchip etc stubby drivers to glue the panfrost onto whatever display controllers... sorta kinda i think lol)
<bschiett> @urjaman checking
<bschiett> @urjaman BR2_PACKAGE_MESA3D_GALLIUM_DRIVER_PANFROST=y will cause BR2_PACKAGE_MESA3D_GALLIUM_KMSRO=y to be set
<bschiett> @urjaman BUT ... there is also BR2_PACKAGE_MESA3D_GALLIUM_DRIVER_KMSRO and that one is NOT set
<bschiett> @urjaman so i'm wondering if this is actually what needs to be set? if that is the case then the rule in buildroot is not correct.
<urjaman> idk about how buildroot does it, but kmsro is specified in the same driver list as panfrost ... so in that way the _DRIVER_ one seems the one you need (eg you configure for the gallium drivers panfrost,kmsro)
<urjaman> and the non-driver one is a typo/thinko or something else idk?
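For comparison, a sketch of the equivalent hand-rolled mesa configure line outside buildroot; -Dgallium-drivers is the real meson option, the value list here is an assumption for an rk3288 + panfrost target:
    meson setup build -Dgallium-drivers=panfrost,kmsro
    ninja -C build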
<bschiett> @urjaman i'll try enabling it
<bschiett> @urjaman that didn't fix it, still the same thing
icecream95 has joined #panfrost
<icecream95> bschiett: What happens if you run 'ln -s panfrost_dri.so /usr/lib/dri/rockchip_dri.so' ?
<kinkinkijkin> crashes ive been getting seem to be related to incorrect voltage settings
<kinkinkijkin> raised the min value of vdd_g3d by 300mV and the crashes got immediately much more predictable, and entirely tied to high gpu load
<bschiett> @icecream95 [root@rockchip:/usr/lib/dri]# kmscube
<bschiett> failed to bind extensions
<bschiett> failed to load driver: rockchip
<bschiett> MESA-LOADER: failed to open kms_swrast (search paths /usr/lib/dri)
<bschiett> failed to load driver: kms_swrast
<bschiett> MESA-LOADER: failed to open swrast (search paths /usr/lib/dri)
<bschiett> failed to load swrast driver
<bschiett> Segmentation fault
<bschiett> (after ln -s panfrost_dri.so rockchip_dri.so)
<icecream95> bschiett: That seems to indicate that kmsro wasn't built at all
<bschiett> @icecream95 I found this in package/mesa3d/Config.in:
<bschiett> # Quote from mesa3d meson.build: "kmsro driver requires one or more
<bschiett> # renderonly drivers (vc4, etnaviv, freedreno)".
<bschiett> maybe this is the reason kmsro wasn't built (?)
<alyssa> kinkinkijkin: that's almost definitely a kernel issue then, not mesa
<kinkinkijkin> yep
<alyssa> aka not my problem™️ :p
<kinkinkijkin> im going through dts and changing values, seeing results
<kinkinkijkin> i am Not Enthused
<kinkinkijkin> the flickering is still somewhat happening though
<macc24> bschiett: got it working?
<macc24> alyssa: ™ is better than ™️ :P
stikonas has quit [Remote host closed the connection]
stikonas has joined #panfrost
<kinkinkijkin> typos in voltage values suck
<bschiett> @macc24 I got lvds working but panfrost not yet (see above), @icecream95 thinks it has to do with kmsro not being built by buildroot
<macc24> kmsro <thinking> seems likely
<bschiett> @macc24 the weird thing is that I see kmsro header files in my build dir. but it seems it is not being built, unless I enable freedreno stuff etc. makes no sense
rando25892 has quit [Ping timeout: 264 seconds]
<macc24> do you have kmsro in -Dgallium-drivers option in meson in mesa compiling script thing?
<bschiett> @macc24 buildroot is building, will check logs in a minute
BorgCuba has quit [Quit: Leaving]
<bschiett> @macc24 i found an override option in mesa3d.mk which does not set -Dgallium-drivers=... correctly, going to change this now and try again