narmstrong changed the topic of #linux-amlogic to: Amlogic mainline kernel development discussion - our wiki http://linux-meson.com/ - ml linux-amlogic@lists.infradead.org - Publicly Logged on https://irclog.whitequark.org/linux-amlogic
kaspter has quit [Ping timeout: 240 seconds]
kaspter has joined #linux-amlogic
vagrantc has quit [Quit: leaving]
camus has joined #linux-amlogic
kaspter has quit [Ping timeout: 268 seconds]
camus is now known as kaspter
buzzmarshall has quit [Remote host closed the connection]
Barada has joined #linux-amlogic
Barada has quit [Client Quit]
Anessen97_0 has quit [Quit: Ping timeout (120 seconds)]
Anessen97_0 has joined #linux-amlogic
asdf28 has joined #linux-amlogic
kaspter has quit [Ping timeout: 252 seconds]
kaspter has joined #linux-amlogic
chewitt has quit [Quit: Adios!]
Barada has joined #linux-amlogic
plntyk has quit [Read error: Connection reset by peer]
Barada has quit [Quit: Barada]
acw4 has quit [Ping timeout: 252 seconds]
acw4 has joined #linux-amlogic
kaspter has quit [Ping timeout: 252 seconds]
kaspter has joined #linux-amlogic
camus has joined #linux-amlogic
kaspter has quit [Ping timeout: 268 seconds]
camus is now known as kaspter
sputnik_ has quit [Ping timeout: 240 seconds]
Barada has joined #linux-amlogic
Barada has quit [Quit: Barada]
chewitt has joined #linux-amlogic
<chewitt> sagner the OF messages appear to be harmless, I've not observed issues on a wide variety of devices that could be attributed to them
<chewitt> the reboot issue is odd; I see the exact opposite
<chewitt> with the meson_drv change I now have working reboot on 20+ different Amlogic devices
<chewitt> without the change they all hang in reboot
<chewitt> devices have (or had, when I was testing that change) a mix of mainline u-boot releases between 2010.04 and latest, and including devices with vendor u-boot
<sagner> chewitt: hm, interesting. It's reproducible here on multiple ODROID-N2+. Did you test on an ODROID-N2(+)?
<chewitt> yup, n2 and n2+ both reboot fine here
<sagner> What I also noticed is rmmod'ing meson_dw_hdmi crashes the kernel hard
<sagner> memory corruption of some sort
<sagner> chewitt: hm, maybe kernel config related?
<chewitt> I'm using an emmc module on the n2+ but random sandisk card on the n2
<chewitt> currently running on u-boot 2021.04 + patches https://github.com/chewitt/u-boot/commits/amlogic-2021.04
<sagner> On my side using the 128GB eMMC module sold by Hardkernel on N2+
<chewitt> I'm running whatever they shipped with samples .. think it's an 'orange' 16GB module
<chewitt> so my kernel is 5.12 .. and IIRC I was already on 5.11.y around the time Art sent the meson_drv change in
<sagner> chewitt: hm, maybe a dependent patch missing in 5.10?
<chewitt> I have different CONFIG_POWER options to you
<chewitt> probably because I have no great clue what they all do, but might be worth a poke
<chewitt> I also run a rather minimal config .. for LibreELEC so anything not needed/supported for Kodi is out
<sagner> Hm, I see, CONFIG_MESON_GX_VPU_POWER_DOMAIN sounds like something which might make sense on that platform
<chewitt> that's in u-boot though?
<chewitt> G12 is a different one
<chewitt> CONFIG_MESON_EE_POWER_DOMAIN ?
<chewitt> @narmstrong ^
<chewitt> ?
buzzmarshall has joined #linux-amlogic
<sagner> chewitt: oh yeah, sorry, looked at the wrong .config
<chewitt> :)
<chewitt> CONFIG_MESON_EE_PM_DOMAINS (plural) in the kernel
<sagner> Yes, that one is enabled in Linux as well.
plntyk has joined #linux-amlogic
<sagner> chewitt: it seems that the issue disappears when I build CONFIG_DRM_MESON_DW_HDMI=y (as opposed to CONFIG_DRM_MESON_DW_HDMI=m)
<chewitt> ahh, I build most meson stuff =y due to misc. historic issues with =m
<sagner> are kernel modules getting unloaded before shutdown/reboot?
<sagner> My theory is that when using =m, meson_drv_unbind gets called after meson_drv_shutdown, and in that combination meson_drv_shutdown causes problems.
<chewitt> sounds plausible
zkrx has quit [Ping timeout: 276 seconds]
zkrx has joined #linux-amlogic
cmeerw has joined #linux-amlogic
<narmstrong> yes it's a side effect, the shutdown of the module is called in a different sequence than when it's built-in
camus has joined #linux-amlogic
kaspter has quit [Ping timeout: 252 seconds]
camus is now known as kaspter
kaspter has quit [Ping timeout: 265 seconds]
kaspter has joined #linux-amlogic
<sagner> narmstrong: just added some printk statements to the meson_drv_shutdown and meson_drv_unbind functions. It seems that unbind is not called, but it's definitely shutdown which makes the system freeze in the module case.
<xdarklight> sagner: what about the .shutdown function in drivers/soc/amlogic/meson-ee-pwrc.c - is that one called?
<sagner> xdarklight: yes, it gets called, before meson_drv_shutdown
<sagner> Btw, it's CONFIG_DRM_MESON which must be =m to reproduce the issue.
<sagner> xdarklight: the crash happens somewhere in meson_drv_shutdown -> drm_atomic_helper_shutdown
<xdarklight> sagner: is it a crash or the board just hanging?
<xdarklight> sagner: the latter would indicate that for example the clocks for the VPU IP are turned off while the VPU is still doing some work
<sagner> The board just hangs
<xdarklight> just as an experiment: can you make meson_ee_pwrc_shutdown a no-op
<xdarklight> (even if that would "work" I am not sure what a proper fix would look like, but at least it would give us some more info)
<sagner> sure, I'll give that a try
<sagner> xdarklight: bingo, that fixes it
<sagner> So it's probably an ordering issue? shutdowns getting called in a different order?
<xdarklight> sagner: I am not sure about the ordering part. the question that's in my head about meson_ee_pwrc_shutdown is: "why do we even need it?". there's a connection between the power domain and the VPU which we describe in the .dts(i). once the VPU driver is unloading this should also turn off the power domain (for which we implement the correct callback in the genpd subsystem). so in theory the extra shutdown function is not needed
<xdarklight> but also there's the comment in meson_ee_pwrc_init_domain about special handling of the VPU power domain
<xdarklight> maybe narmstrong or Kevin (currently not on IRC?) have an idea on how this can be improved
<sagner> xdarklight: just checked, CONFIG_DRM_MESON=m -> CONFIG_DRM_MESON=y changes order of shutdown functions: built-in calls meson_drv_shutdown before meson_ee_pwrc_shutdown
<sagner> In both cases I tested with CONFIG_MESON_EE_PM_DOMAINS=y
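The two orderings reported above can be modelled in a few lines of user-space C. This is only a toy sketch, not kernel code: every name in it (`model_soc`, `model_pwrc_shutdown`, `model_drv_shutdown`, `model_reboot_hangs`) is hypothetical, and it assumes, following xdarklight's guess, that the hang comes from the DRM shutdown path touching the VPU after its power domain has already been switched off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant SoC state; not a kernel structure. */
struct model_soc {
    bool vpu_domain_on; /* is the VPU power domain (and its clocks) still up? */
    bool hung;          /* set when the VPU is touched while powered down */
};

/* Models meson_ee_pwrc_shutdown(): switches the VPU domain off. */
static void model_pwrc_shutdown(struct model_soc *soc)
{
    assert(soc != NULL);
    soc->vpu_domain_on = false;
}

/* Models meson_drv_shutdown(): needs the VPU to be powered (assumption). */
static void model_drv_shutdown(struct model_soc *soc)
{
    assert(soc != NULL);
    if (!soc->vpu_domain_on)
        soc->hung = true;
}

/* Replays the shutdown order observed in the log for =m and =y. */
static bool model_reboot_hangs(bool drm_is_module)
{
    struct model_soc soc = { .vpu_domain_on = true, .hung = false };

    if (drm_is_module) {
        /* CONFIG_DRM_MESON=m: power controller shuts down first. */
        model_pwrc_shutdown(&soc);
        model_drv_shutdown(&soc);
    } else {
        /* CONFIG_DRM_MESON=y: DRM driver shuts down first. */
        model_drv_shutdown(&soc);
        model_pwrc_shutdown(&soc);
    }
    return soc.hung;
}
```

Under these assumptions, `model_reboot_hangs(true)` reports a hang and `model_reboot_hangs(false)` does not, which matches the =m versus =y behaviour sagner describes.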
<xdarklight> sagner: you could give this a try (I only compile-tested this and as I mentioned before I am not sure about the big TOFIX comment in the original code): https://pastebin.com/igWgQ2Fq
<xdarklight> sagner: the idea of that patch: maybe it's enough to keep the clocks by just bypassing clk_bulk_disable_unprepare (when turning off the power domain) if we didn't previously call clk_bulk_prepare_enable (done while turning on the power domain)
<xdarklight> sagner: the clocks already have CLK_IGNORE_UNUSED set so we're keeping the state from the bootloader even if the VPU driver is not loaded (so my hope is that this may work, but I am anything but sure on this one ;))
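The idea xdarklight describes (skip the clock disable on power-off unless a matching enable happened on power-on) can be sketched as a small user-space model. The pastebin patch itself is not reproduced here; the struct, its fields and the `model_*` functions below are all hypothetical stand-ins, and the counter only imitates what `clk_bulk_prepare_enable` and `clk_bulk_disable_unprepare` would do to the clock framework's state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one power domain; not the kernel's struct. */
struct model_pwrc_domain {
    bool powered;              /* domain state as seen by the driver */
    bool clocks_enabled_by_us; /* true only after model_power_on() */
    int clk_enable_count;      /* imitates the clock framework's refcount */
};

/* Models the power-on callback: enable clocks and remember that we did. */
static int model_power_on(struct model_pwrc_domain *pd)
{
    assert(pd != NULL);
    pd->clk_enable_count++;          /* stands in for clk_bulk_prepare_enable() */
    pd->clocks_enabled_by_us = true;
    pd->powered = true;
    return 0;
}

/* Models the power-off callback: only undo an enable we actually made. */
static int model_power_off(struct model_pwrc_domain *pd)
{
    assert(pd != NULL);
    pd->powered = false;
    if (pd->clocks_enabled_by_us) {
        pd->clk_enable_count--;      /* stands in for clk_bulk_disable_unprepare() */
        pd->clocks_enabled_by_us = false;
    }
    /* Otherwise the clocks keep whatever state the bootloader left them in. */
    return 0;
}
```

In this model, a domain that the bootloader left running can be powered off without the enable count dropping below zero, while an on/off pair made by the driver stays balanced.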
<sagner> Hm, I see instead of doing the disable in shutdown
<sagner> I'll give it a try
kaspter has quit [Ping timeout: 240 seconds]
kaspter has joined #linux-amlogic
sputnik_ has joined #linux-amlogic
ballerburg9005 has joined #linux-amlogic
vdehors has quit [Ping timeout: 252 seconds]
<sagner> xdarklight: seems to work here with CONFIG_DRM_MESON=m
<xdarklight> sagner: can you also test with =y and =n? (the latter case is what's described in the TOFIX comment if I understand it correctly)
<sagner> xdarklight: CONFIG_DRM_MESON=n looks good, bootup and reboot seems to work
<sagner> xdarklight: with CONFIG_DRM_MESON=y I do get a stacktrace when I set CONFIG_DRM_MESON_DW_HDMI=m and don't load that module
<sagner> But I think that was already the case before
<sagner> And also, weird combination :)
<xdarklight> sagner: that sounds good to me, thanks for testing! I'll send this as an RFC patch later and ask Neil and Kevin to give their review comments
<xdarklight> sagner: are you subscribed to the mailing list? if not, it would be great if I could get your email address so I can Cc you on the patch
<sagner> linux-amlogic? I am subscribed, but a CC would still be nice, then it's in my inbox :)
chewitt has quit [Quit: Zzz..]
<xdarklight> I'll take the email from https://github.com/torvalds/linux/commit/9e454e37dc7c0ee9e108d70b983e7a71332aedff then (assuming it's still valid because it's a fairly recent change)
<sagner> xdarklight: that works
<sagner> xdarklight: This is the trace I get on reboot when not loading meson_dw_hdmi while meson-drm is built-in: https://pastebin.com/874qKmp7
vdehors has joined #linux-amlogic
<xdarklight> sagner: hmm, that part I am not sure about. it was added with https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fa0c16caf3d73 recently
<xdarklight> the root cause of the fix in that commit is not clear to me
<sagner> xdarklight: you mean why that patch fixes it in some cases?
<sagner> It's basically the opposite: DRM/DW HDMI driver still active while the power domain gets shut down...?
<xdarklight> sagner: yes, it's not clear to me why fa0c16caf3d73 also fixes some of the shutdown issues
<sagner> Sounds somewhat logical to me: with no shutdown function in the Meson DRM driver etc., the driver will sooner or later access one of the registers (vsync irq or something? would explain why it's racy: if the system reboots before this happens, no harm done.. but if not, the driver accesses unpowered registers...)
zkrx has quit [Ping timeout: 268 seconds]
cmeerw has quit [Ping timeout: 250 seconds]
<xdarklight> then at least I am not sure why it's causing that crash and I am running out of time for today
zkrx has joined #linux-amlogic
asdf28 has quit [Ping timeout: 265 seconds]