fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
khaled has quit [Quit: Konversation terminated!]
hpt has joined #systemtap
sscox has joined #systemtap
hpt has quit [Ping timeout: 246 seconds]
hpt has joined #systemtap
orivej has joined #systemtap
hpt has quit [Ping timeout: 248 seconds]
hpt has joined #systemtap
gromero has quit [Ping timeout: 248 seconds]
hpt has quit [Ping timeout: 245 seconds]
hpt has joined #systemtap
sapatel has quit [Ping timeout: 276 seconds]
yog_ has joined #systemtap
hpt has quit [Ping timeout: 276 seconds]
hpt has joined #systemtap
khaled has joined #systemtap
orivej has quit [Ping timeout: 276 seconds]
higgins` has joined #systemtap
khaled_ has joined #systemtap
hpt has quit [Ping timeout: 245 seconds]
khaled has quit [*.net *.split]
fche has quit [*.net *.split]
wcohen has quit [*.net *.split]
higgins has quit [*.net *.split]
fche has joined #systemtap
wcohen has joined #systemtap
yog_ has quit [Ping timeout: 245 seconds]
mjw has joined #systemtap
sscox has quit [Ping timeout: 248 seconds]
orivej has joined #systemtap
sscox has joined #systemtap
dmalcolm_ has quit [Quit: Leaving]
sapatel has joined #systemtap
orivej has quit [Ping timeout: 265 seconds]
tromey has joined #systemtap
orivej has joined #systemtap
amerey has quit [Quit: Leaving]
amerey has joined #systemtap
sapatel_ has joined #systemtap
sapatel_ has quit [Client Quit]
sapatel has quit [Ping timeout: 248 seconds]
sapatel has joined #systemtap
agentzh has joined #systemtap
<agentzh>
hi guys, i noticed a strange issue: sometimes a uprobe is never triggered and stap just hangs forever. how to debug such things?
tromey has quit [Quit: ERC (IRC client for Emacs 26.2)]
<agentzh>
-DDEBUG_UPROBES?
<agentzh>
seems like -DDEBUG_UPROBES_RIP might be useful too.
<agentzh>
hmm, not very useful. just got a single message "_stp_handle_start:178: cannot map pid 0 to host namespace pid" regardless of whether it hangs or not.
<fche>
hmm
<fche>
sysrq-t ?
<fche>
if a uprobe can hang something, the kernel's been a bad bad boy
<agentzh>
not hanging the target, just staprun itself.
<fche>
so an interrupt doesn't stop it?
<agentzh>
fche: will look into sysrq-t. thanks
<agentzh>
fche: ctrl-c does stop it. just no probes fired.
<fche>
sometimes the uprobe removal must wait until some userspace threads pass some particular section, IIRC
<fche>
ok
<fche>
so not a hang
<fche>
just no probes being fired
<agentzh>
the target process is calling usleep(1) in a tight loop.
<fche>
ok that's a totally different situation :)
<agentzh>
and the stap script just probes on usleep function entry.
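A minimal sketch of the kind of target described here, for anyone trying to reproduce this: a process that calls usleep(1) in a tight loop. The function name spin_usleep and the STANDALONE_TARGET macro are made up for illustration and are not from agentzh's actual test program.

```c
#define _DEFAULT_SOURCE   /* expose usleep() under strict -std= modes on glibc */
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Call usleep(1) `iterations` times; a negative count loops until the
 * process is killed. Returns the number of calls that returned 0.
 * (Hypothetical helper, named here only for this sketch.) */
long spin_usleep(long iterations)
{
    long ok = 0;

    while (iterations != 0) {
        if (usleep(1) == 0 && ok < LONG_MAX)
            ok++;
        if (iterations > 0)
            iterations--;
    }
    return ok;
}

#ifdef STANDALONE_TARGET
/* Build with -DSTANDALONE_TARGET to get a long-running probe target. */
int main(void)
{
    printf("pid %ld\n", (long)getpid());
    fflush(stdout);
    spin_usleep(-1);   /* loop until interrupted */
    return 0;
}
#endif
```

Built with -DSTANDALONE_TARGET, this prints its pid and then loops until interrupted, so a stap script probing usleep function entry should see a steady stream of hits whenever the uprobe is actually registered.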
<agentzh>
yeah, got "_stp_do_relocation:74: found kernel _stext load address: 0xffffffffa9e00000"
<agentzh>
oh, this is different
<agentzh>
umodule_relocate
<agentzh>
no such line it seems
<fche>
yup, wonder if it's a buildid problem
<agentzh>
but if it's a build id problem, shouldn't it never work, instead of just failing randomly?
<agentzh>
just wondering
<fche>
yes
<fche>
nothing's being upgraded under the covers I assume
orivej has quit [Ping timeout: 248 seconds]
<fche>
I'd probably add some dbug() type instrumentation to linux/uprobes-inode.c and/or sym.c into those paths to see what's going on
<agentzh>
I don't see any output lines matching "_stp_umodule_relocate" in a good run
<agentzh>
either
<fche>
yeah in general this part needs better diagnostics
<fche>
back before inode-uprobes (linux 3.5 era?), we had a pretty systematic probe registration/attempt/unregistration tracing with -DDEBUG_PROBES IIRC
<agentzh>
okay, i'll try peeking into the uprobes-inode.c and sym.c files.
<agentzh>
thanks for the suggestion.
<agentzh>
oh, the utrace era?
<fche>
yeah :)
<agentzh>
:)
<fche>
but yeah, patches to improve these diagnostics would be super welcome