fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
hpt has joined #systemtap
sapatel_ has joined #systemtap
sapatel has quit [Ping timeout: 245 seconds]
irker133 has quit [Quit: transmission timeout]
agentzh has quit [Remote host closed the connection]
hpt has quit [Ping timeout: 244 seconds]
orivej has quit [Ping timeout: 245 seconds]
hpt has joined #systemtap
mjw has joined #systemtap
orivej has joined #systemtap
hpt has quit [Ping timeout: 246 seconds]
khaled has joined #systemtap
sscox has quit [Ping timeout: 276 seconds]
orivej_ has joined #systemtap
orivej has quit [Ping timeout: 244 seconds]
orivej has joined #systemtap
orivej__ has joined #systemtap
orivej_ has quit [Ping timeout: 246 seconds]
orivej has quit [Ping timeout: 244 seconds]
orivej__ has quit [Ping timeout: 258 seconds]
orivej has joined #systemtap
khaled has quit [Ping timeout: 264 seconds]
wcohen has quit [Ping timeout: 245 seconds]
hpt has joined #systemtap
khaled has joined #systemtap
orivej has quit [Ping timeout: 245 seconds]
orivej has joined #systemtap
hpt has quit [Quit: Lost terminal]
hpt has joined #systemtap
wcohen has joined #systemtap
sscox has joined #systemtap
hpt has quit [Ping timeout: 264 seconds]
khaled_ has joined #systemtap
khaled has quit [Ping timeout: 264 seconds]
gromero has quit [Quit: Leaving]
sapatel__ has joined #systemtap
sapatel_ has quit [Ping timeout: 246 seconds]
RP has joined #systemtap
<RP> I'm trying to get the 5.2 kernel working with systemtap in the yocto project. We're seeing errors on 32 bit mips and arm.
<RP> The line runtime/linux/access_process_vm.h:#if defined(STAPCONF_LINUX_SCHED_HEADERS) looks odd, as I can get past that issue if I remove the ifdef.
<RP> It suggests something about the definitions isn't right. Sadly I know little about systemtap other than that the tests that used to work no longer do
<fche> RP, yeah these kinds of things happen as kernels / architectures change
<fche> so we have a mechanism in the translator (using buildrun.cxx and runtime/linux/autoconf-* files) to adapt to slight changes in APIs
<fche> maybe the autoconf* test tied to that STAPCONF* macro could be tweaked
<fche> or add a new test that is finely tuned to this particular case
<RP> fche: any tips on how I'd debug that?
<fche> it's trial-and-error really -- pick a particular declaration-error you'd try to fix
<fche> see what minimal changes to the runtime or such code would be required to make it work again
<RP> fche: I guess I'm struggling to get something simpler to debug than "stap --disable-cache -DSTP_NO_VERREL_CHECK /tmp/hello.stp" (where hello.stp is a really simple hello world test)
<fche> add a declaration
<fche> stap -p4 /tmp/hello.stp goes through to the build stage
<fche> if a simpler command line is all you're after
<RP> fche: I'm debugging under emulation for mips so anything to speed this up helps a lot
<RP> fche: thanks!
<fche> yeah unfortunately the build autoconf process is pretty intensive (the first time; results are cached)
<RP> fche: is there a way to make it a bit more verbose?
<fche> ah yes
<fche> the usual way :)
<fche> stap -v or in this case specifically stap --vp 0004 ish to bump up the pass-4 verbosity only
<fche> --vp 0004 == -v -v -v -v for pass 4 only
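A minimal sketch of the command line fche describes, combining the two flags mentioned above: `-p4` stops after the build pass and `--vp 0004` raises verbosity for pass 4 only. The script path comes from the conversation; the guard around the actual run is an addition so the snippet is harmless on a machine without stap.

```shell
# Script path taken from the conversation; any small script would do.
script=/tmp/hello.stp

# -p4: stop after pass 4 (module build). --vp 0004: extra verbosity for pass 4 only.
stap_args="-p4 --vp 0004 $script"
echo "stap $stap_args"

# Only run when stap is installed and the script exists.
if command -v stap >/dev/null 2>&1 && [ -f "$script" ]; then
    # Word splitting of $stap_args is intentional here.
    stap $stap_args
fi
```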
<RP> fche: thanks, that last bit is the kind of thing it's hard to pick up easily hence the question! :)
<fche> aha. well, [man error::pass4] gives other relevant hints
<RP> fche: on a limited embedded target that isn't so easy as it sounds
<fche> the man page? you can run that anywhere else too
<RP> fche: fair enough
<fche> "fair enough" is my middle name
<RP> fche: it's a long story but this is basically blocking our kernel upgrades in Yocto which in turn blocks feature freeze for release and causes a load of other problems
<fche> ouch
<fche> stap in the critical path for something else is ... tricky
<RP> fche: just giving context for why I'm trying to avoid going too deeply into learning all about stap! :)
<fche> haha yea
<RP> (I would actually love to but there is a time/place)
<fche> 32-bit mips & arm ... those are pretty far from our mainstream platforms here unfortunately
<RP> fche: which could be why they're not working? I assume they're not in any kind of testing setup?
<fche> yeah, not that I know of
<RP> (I'm trying to figure out if this is something we did or a real "just doesn't work" scenario)
<fche> we'd be glad to take patches that fix this stuff
<RP> I'd be happy to send them if I can figure them out
<fche> e.g. if the code also breaks 32-bit i686, it becomes much easier to test / fix here
<fche> are you using git master systemtap btw?
<RP> fche: we test mips, arm, x86 and powerpc in 32 and 64. It's only the two I mention that break
<wcohen> fche, the 32-bit i686 is working. I have a guest vm running that and haven't seen problems with it.
<RP> fche: yes, that was my first step
<RP> fche: we have automated testing for all this, e.g. the arm failure is https://autobuilder.yoctoproject.org/typhoon/#/builders/53/builds/992/steps/8/logs/step1c
<fche> ah neat
<fche> consider configuring your builds with --enable-dejazilla so they can upload their test result runs to our public server
<fche> (that's for "make check" or "make installcheck" results)
<RP> fche: we don't actually run that as we're cross compiling so this test is an emulated target test of a simple stap command
<fche> hm, a cross stap ...
<RP> we have been adding support for make check where it has cross support, we've not looked at stap yet though
<fche> be sure to run it with proper -a FOO type parameters
<fche> stap cross testing is to some extent possible with stap --remote ... but we don't have a lot of suchly configured tests
<RP> It's on our nice-to-have list but probably a way off yet. Patches very welcome though if anyone is interested!
<fche> but yeah if you are running a cross-architecture/version/whatever stap, you'll need stap -a ARCH -r /path/to/kernel/cross/build kinds of flags
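As a rough illustration of the cross-build invocation fche outlines, assuming a mips target: `-a` names the target architecture and `-r` points at the kernel build tree. The architecture value is an assumption for this example, and the build path is the placeholder from the conversation, not a real location.

```shell
# Assumed target architecture (kernel ARCH name); adjust for the real target.
arch=mips
# Placeholder path from the conversation; point this at the cross-built kernel tree.
kbuild=/path/to/kernel/cross/build

stap_args="-a $arch -r $kbuild -p4 /tmp/hello.stp"
echo "stap $stap_args"

# Only attempt the build when stap and the kernel build tree are both present.
if command -v stap >/dev/null 2>&1 && [ -d "$kbuild" ]; then
    # Word splitting of $stap_args is intentional here.
    stap $stap_args
fi
```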
<RP> the failing case is simpler as it's on target under emulation
<RP> fche: got a handle on what is happening now, looks like a header search path problem
<RP> ./arch/mips/include/asm/addrspace.h:13:10: fatal error: spaces.h: No such file or directory
<RP> that file is in ./arch/mips/include/asm/mach-XXX
<fche> adding extra -I paths to the kbuild is another buildrun.cxx job
amerey has quit [Remote host closed the connection]
<fche> can your local toolchain compile routine out-of-tree modules for your target already?
<RP> fche: yes, we have tests for that
amerey has joined #systemtap
<RP> seems to be missing an -I./arch/mips/include/asm/mach-generic
<RP> if I add that, the autoconf-linux-sched_headers.c define works
khaled_ has quit [Remote host closed the connection]
<RP> fche: confirmed that even if I dump autoconf-linux-sched_headers.c into a module, it still builds so stap isn't picking up the right module compilation flags somehow
khaled has joined #systemtap
khaled has quit [Remote host closed the connection]
<fche> some of those flags should come from the arch-specific kernel-side makefiles
<fche> are you sure you're running stap in a cross-compiling mode, as above (-a / -r ) ?
<fche> similarly to how you'd cross-compile an out-of-tree kernel module ?
<RP> fche: I'm running it on target so that isn't an issue?
<fche> ok then.
<RP> fche: ah, it finds the right flags for building the stap module, just not for running the tests
khaled has joined #systemtap
<fche> the autoconf tests?
<RP> fche: yes. Somehow the flag in KBUILD_CFLAGS is getting lost
<RP> fche: I'm wondering what _KBUILD_CFLAGS := $(call flags,KBUILD_CFLAGS) does with regard to that
<fche> stap --vp 0004 -k .... should let you see exactly how the subsidiary gcc's are run, and -k should let you inspect the makefiles/etc.
<RP> fche: _KBUILD_CFLAGS is empty
<RP> KBUILD_CFLAGS has the flags we need in it
<fche> so I'd play with the Makefile that stap -k leaves behind
<fche> see how the KBUILD_CFLAGS can be propagated properly
<RP> fche: I think I need to understand what that "call flags" filter was trying to do
<RP> fche: I wonder if the kernel removed that filter?
<fche> perhaps
<fche> this indirection was brought in 2008 stap commit e5976ba0af9b828dcc76b3937b5a98fe9c0f6cb8
khaled has quit [Remote host closed the connection]
<fche> RP, been looking through kernel code history, and don't see where the 'flags' make macro was or isn't :)
<RP> fche: right, I'm also struggling to locate it
<RP> fche: I'm trying a build with raw KBUILD_CFLAGS, see what it does. It's slowly working its way through the autoconf tests...
<fche> if the _KBUILD_CFLAGS ... call flags, bit is suspect, take that out of the generated makefile - adjust
<fche> if that works, then let's fix buildrun.cxx to do same (patch welcome if you'd like your name in the super amazing AUTHORS list)
khaled has joined #systemtap
<RP> fche: I will see how this test goes. If that works I can run it on our wider infrastructure (will be an overnight job) and then if that works out I can send a patch :)
<RP> fche: I'm fairly confident this should at least fix the failures we were seeing though
<fche> yeah but it should've broken long ago and elsewhere
<RP> fche: right. I also feel we're missing something
<fche> ah got it
<fche> kernel commit cdd750bfb1f76fe9be8cfb53cbe77b2e811081ab
<fche> nuked the 'flags' filter
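The symptom RP saw (an empty `_KBUILD_CFLAGS` next to a populated `KBUILD_CFLAGS`) matches how GNU make treats `$(call ...)` on an undefined name: it expands to nothing, without any error. The sketch below reproduces that with a throwaway makefile; the flag values are illustrative, and the makefile is a stand-in, not the one stap generates.

```shell
# Write a throwaway makefile; the quoted heredoc keeps the make syntax literal.
tmpdir=$(mktemp -d)
cat > "$tmpdir/Makefile" <<'EOF'
# Illustrative flags, including the include path that went missing.
KBUILD_CFLAGS := -O2 -I./arch/mips/include/asm/mach-generic
# "flags" is deliberately left undefined, as on kernels without the filter.
_KBUILD_CFLAGS := $(call flags,KBUILD_CFLAGS)
$(info filtered=[$(_KBUILD_CFLAGS)])
$(info raw=[$(KBUILD_CFLAGS)])
# No-op target so make has something to build.
all: ; @:
EOF

# Capture what make prints while parsing the makefile.
result=$(make -s -f "$tmpdir/Makefile" all)
echo "$result"
rm -rf "$tmpdir"
```

With GNU make this prints `filtered=[]` followed by `raw=[-O2 -I./arch/mips/include/asm/mach-generic]`, so any compile step fed from the filtered variable silently loses the include path.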
<fche> RP, have a possible fix over here
<fche> would like to credit you as a Reported-By: what's your name / email (if you like )?
irker903 has joined #systemtap
<irker903> systemtap: fche systemtap.git:refs/heads/master * release-4.1-80-g7cfac6c / buildrun.cxx: buildrun: adapt to loss of "flags" filter in linux scripts/Kbuild.include http://tinyurl.com/y28eay7r
khaled has quit [Remote host closed the connection]
khaled has joined #systemtap
khaled has quit [Remote host closed the connection]
khaled has joined #systemtap
<RP> fche: Sounds good, Richard Purdie <richard.purdie@linuxfoundation.org>
<fche> ar too late,
<RP> fche: never mind :)
<fche> next time :)
<fche> anyway see if that fix helps
<fche> we may be able to do something different / sneakier if needed
<RP> fche: I will set a test of that away...
mjw has quit [Quit: Leaving]
<fche> righto
<RP> fche: thanks for the help! I've set a test running, will be around 30 mins to see if it worked
<fche> 30 mins at one 32 bit mips is 57,600,000,000 bits
sapatel_ has joined #systemtap
sapatel__ has quit [Ping timeout: 245 seconds]
orivej has quit [Ping timeout: 245 seconds]
<RP> fche: worked :)
<RP> thanks again
<fche> very good
<fche> thanks for the report, and for egging me on to find out the real problem :-)
<RP> fche: no problem, glad I could help figure it out :)
<fche> yup, thanks!
<fche> wouldn't be surprised if we end up having to revisit it, and maybe put in a fake version of the 'flags' filter into our own makefiles
<fche> but so far the new code is surviving fine on our platforms' CI bots too, so probably good enough for now.
<RP> fche: I wondered about that. The kernel implies it's no longer needed so time will tell I guess
<RP> fche: do you have any idea on the timescale of the next release?
<fche> just entre-nous, expecting early november
<RP> fche: ok, after us. We'll run with a git version then, thanks :)
<fche> sure!
<fche> when's your deadline?
<RP> fche: we release mid October but we're entering a freeze once we get things working
<fche> aha
sscox has quit [Ping timeout: 244 seconds]
<RP> fche: FWIW there looks to be one 32 bit arm issue left but the failure is much more concise and looks like a headers issue with 5.2
khaled has quit [Quit: Konversation terminated!]
wcohen has quit [Ping timeout: 246 seconds]
wcohen has joined #systemtap
orivej has joined #systemtap
irker903 has quit [Quit: transmission timeout]
sscox has joined #systemtap