fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
<sfix> i'm using a systemtap script to get a userspace backtrace on the return of a kernel function for a particular pid, but i'm getting a bunch of warnings about missing symbols for processes i don't want to trace, which eventually ends with stap exiting due to too many errors. is there a way i can prevent that?
<sfix> ah, seems i missed the -x option, that does the trick if i add a check for target() in my probe
efiop has quit [Ping timeout: 255 seconds]
gmg has quit [Remote host closed the connection]
scox has quit [Ping timeout: 260 seconds]
hpt has joined #systemtap
_whitelogger has joined #systemtap
gmg has joined #systemtap
gmg1 has joined #systemtap
gmg has quit [Ping timeout: 260 seconds]
groleo has joined #systemtap
bsingh has joined #systemtap
bsingh has quit [Client Quit]
gmg has joined #systemtap
gmg1 has quit [Remote host closed the connection]
gmg has quit [Client Quit]
_whitelogger has joined #systemtap
hpt has quit [Quit: leaving]
groleo has quit [Ping timeout: 268 seconds]
_whitelogger has joined #systemtap
wcohen has quit [Ping timeout: 260 seconds]
groleo has joined #systemtap
groleo has quit [Ping timeout: 252 seconds]
efiop has joined #systemtap
<fche> there you go!
<fche> sfix, and you could use other filters like execname()=="your_program" or uid()==$your_uid
<sfix> i was doing a check for pid() == $1 before and passing the pid in as an argument, but stap still wanted unwind info for unrelated processes, using -x did the job!
<fche> yeah, -x PID is a better way than passing a pid as a $1 type argument
<fche> (the runtime supports -x specially and allows module reuse, whereas $1 substitution requires a complete stap recompilation)
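[editor's note] The -x / target() approach discussed above can be sketched roughly as follows. This is only an illustration, not sfix's actual script (that lives at the termbin link and is not reproduced here): the probe point avc_has_perm is borrowed from the backtrace quoted later in the log, and the PID 1234 and the temp-file name are placeholders. The commented-out lines show the execname() and uid() filters fche mentions as alternatives.

```shell
# Write a small SystemTap script that only reports for the process given via -x.
# Assumption: avc_has_perm is used as the probe point purely for illustration.
script=$(mktemp /tmp/target_bt.XXXXXX)
cat > "$script" <<'EOF'
probe kernel.function("avc_has_perm").return {
  # target() is the PID passed with -x; skip every other process
  if (pid() != target()) next
  # alternative filters mentioned in the channel (placeholder values):
  #   if (execname() != "vim") next
  #   if (uid() != 1000) next
  printf("%s[%d]\n", execname(), pid())
  print_ubacktrace()
}
EOF

# Placeholder PID; override by exporting TARGET_PID.
pid=${TARGET_PID:-1234}

# -x sets target() at run time, so the PID does not have to be baked
# into the script as a $1 argument.
cmd="stap -d /usr/bin/vim --ldd -x $pid $script"
echo "$cmd"
```

The command is only echoed here; running it needs root (or stapusr/stapdev membership) and a live PID.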
<sfix> ah, good to know
<sfix> i have another question about kernel backtraces, currently when i call sprint_backtrace() i only get the current frame and the next is shown as 0x0 (Inexact), is there something i'm doing wrong for that to be happening?
<fche> how deep within the kernel are you?
<sfix> i was using kprobes before and calling save_stack_trace() within my probe, which worked quite well
<sfix> my probe is on an LSM hook
<fche> in a module? wonder if its debuginfo (well, unwind data) is not available without stap -d foo.ko or stap --all-modules
<sfix> ah no, the LSM is in-kernel (it's selinux)
<sfix> i do have debuginfo available for the kernel i'm running
<fche> unwinding relies on a small piece only ... does print_backtrace() give something more useful?
<sfix> good question, let me try that
<sfix> nope, unfortunately not
<fche> our backtracer prefers full unwinding vs. frame-pointer-based heuristics, but can back down to it. so if it's giving nothing, something's strange
<sfix> for a bit of context my stap script currently looks like this: http://termbin.com/o0bg and i'm invoking it with stap -d /usr/bin/vim --ldd -x $pid
<sfix> does providing -d for a program/shlib maybe prevent unwind info for kernel symbols being available?
<fche> no
<fche> the kernel.function() probe should cause -d kernel to be implicit
<fche> but you could try manually adding it just for giggles
<fche> what is the output?
<sfix> sure, just a sec
<sfix> hmm, nope, still just 2 lines in the backtrace:
<sfix> 0xffffffffb1383950 : avc_has_perm+0x0/0x1a0 [kernel]
<sfix> 0x0 (inexact)
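[editor's note] For reference, a minimal sketch of the kind of test being run at this point: a kernel.function probe that prints the kernel backtrace both ways, invoked with -d kernel spelled out explicitly. The script body is an assumption (the real one is at the termbin link), and PID 1234 is a placeholder.

```shell
# Kernel-side backtrace test script (illustrative, not the original).
script=$(mktemp /tmp/kernel_bt.XXXXXX)
cat > "$script" <<'EOF'
probe kernel.function("avc_has_perm") {
  if (pid() != target()) next
  print_backtrace()            # prints the kernel stack directly
  println(sprint_backtrace())  # same stack, returned as a string
}
EOF

# Placeholder PID; override by exporting TARGET_PID.
pid=${TARGET_PID:-1234}

# -d kernel should already be implied by the kernel.function() probe;
# it is added here explicitly only to rule that out as a cause.
cmd="stap -d kernel -d /usr/bin/vim --ldd -x $pid $script"
echo "$cmd"
```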
wcohen has joined #systemtap
<fche> hm weird, I wonder if maybe some assembly code is what's calling this function,
<fche> and the assembly doesn't include correct .cfi* codes to allow unwinding through it
<sfix> well that function is called by other SELinux functions, so if i make it a .return probe instead then calling print_backtrace() gets me the correct info for the caller, but still 0x0 as the 2nd item
<fche> hm could try stap -DDEBUG_UNWIND for more data, but I suspect something is weird about the caller function.
<sfix> ah unwind_frame:1178: Module /usr/lib/debug/usr/lib/modules/4.10.8-100.fc24.x86_64/vmlinux: no unwind frame data
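[editor's note] A rough sketch of how one might check a saved -DDEBUG_UNWIND run for this symptom. The helper name missing_unwind_data and the log file are hypothetical, and which stream the debug output lands on is an assumption, hence the 2>&1 in the commented capture command. The message quoted just above stands in for real output so the snippet can be tried without stap installed.

```shell
# Hypothetical helper: succeed if the log contains the unwinder's complaint.
missing_unwind_data() {
  grep -q 'no unwind frame data' "$1"
}

log=$(mktemp /tmp/stap_unwind.XXXXXX)

# To capture a real run (assumed invocation, script.stp is a placeholder):
#   stap -DDEBUG_UNWIND -d kernel -x "$pid" script.stp >"$log" 2>&1
# Here the message from the channel is used as sample input instead.
printf '%s\n' 'unwind_frame:1178: Module /usr/lib/debug/usr/lib/modules/4.10.8-100.fc24.x86_64/vmlinux: no unwind frame data' > "$log"

if missing_unwind_data "$log"; then
  result="unwind data missing"
else
  result="unwind data present"
fi
echo "$result"
```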
<fche> a fedora kernel should just work (tm) dammit ... wonder what's up
<fche> ok seeing something similar over here. something has definitely gotten pooched
<fche> methinks there's a kernel build change/problem, but we'll get to the bottom of it
<sfix> fche: thanks for the help! i'll keep an eye on the BZ and leave a comment if i get anything new
<fche> righto
wcohen has quit [Ping timeout: 260 seconds]
wcohen has joined #systemtap
wcohen has quit [Ping timeout: 258 seconds]
wcohen has joined #systemtap
wcohen has quit [Ping timeout: 258 seconds]
wcohen has joined #systemtap
wcohen has quit [Ping timeout: 260 seconds]
wcohen has joined #systemtap
nkambo has quit []
wcohen has quit [Ping timeout: 260 seconds]
sona has joined #systemtap
sona has quit [Ping timeout: 240 seconds]
sona has joined #systemtap
sona has quit [Ping timeout: 260 seconds]
wcohen has joined #systemtap