fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
khaled has quit [Quit: Konversation terminated!]
mjw has quit [Quit: Leaving]
hpt has joined #systemtap
hpt has quit [Ping timeout: 250 seconds]
hpt has joined #systemtap
Amy1 has quit [Ping timeout: 256 seconds]
Amy1 has joined #systemtap
Amy1 has quit [Ping timeout: 240 seconds]
Amy1 has joined #systemtap
orivej_ has quit [Ping timeout: 240 seconds]
hpt has quit [Ping timeout: 265 seconds]
hpt has joined #systemtap
Amy1 has quit [Ping timeout: 265 seconds]
yogananth has joined #systemtap
Amy1 has joined #systemtap
Amy1 has quit [Ping timeout: 250 seconds]
Amy1 has joined #systemtap
Amy1 has quit [Ping timeout: 240 seconds]
khaled has joined #systemtap
Amy1 has joined #systemtap
hpt has quit [Ping timeout: 265 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
orivej has joined #systemtap
mjw has joined #systemtap
orivej has quit [Ping timeout: 265 seconds]
_whitelogger has joined #systemtap
orivej has joined #systemtap
sscox has quit [Ping timeout: 240 seconds]
sscox has joined #systemtap
sscox has quit [Ping timeout: 240 seconds]
sscox has joined #systemtap
sapatel has joined #systemtap
ali_as has joined #systemtap
yogananth has quit [Quit: Leaving]
sscox has quit [Ping timeout: 240 seconds]
orivej has quit [Ping timeout: 240 seconds]
sscox has joined #systemtap
amerey has joined #systemtap
<ali_as> Hi all, I'm trying to follow a post using para-callgraph.stp but the results are not the same. sudo stap para-callgraph.stp 'process.function("*")' -c /bin/ls is the line I'm using as a test, but the output isn't showing internal functions. Any advice?
<ali_as> I've also tried para-callgraph-verbose.stp
<fche> what do you mean internal functions?
<ali_as> In this case functions internal to ls.
<fche> so what are you seeing? and on what distro/version ?
<fche> depending on versions/distros, the basic symbol table in a distro /bin/ls binary may not include those functions at all
<fche> and if you fetch its debuginfo, systemtap can then see everything
<fche> try
<fche> stap -l 'process("/bin/ls").function("*")' << list
<ali_as> Lists 8 functions.
<ali_as> Nothing apart from init_ and fini_ shows up in the actual run though.
<ali_as> I get,
<ali_as> <list of items in directory>
<ali_as> 53 ls(29491):<-_init
<ali_as> 16 ls(29491):<-_fini
<ali_as> 0 ls(29491):->_init
<ali_as> 0 ls(29491):->_fini
<ali_as> And that's it.
sscox has quit [Ping timeout: 240 seconds]
<ali_as> When I pick another program, I can see a hundred+ functions, but only about 5 show up in the run, even though I know they are being executed.
<ali_as> Actually, that's not true, I don't know it's not giving a different segfault.
sscox has joined #systemtap
dwalkes has joined #systemtap
dwalkes has quit [Remote host closed the connection]
khaled has quit [Quit: Konversation terminated!]
khaled has joined #systemtap
khaled has quit [Client Quit]
khaled has joined #systemtap
<fche> ali_as, try fetching debuginfo for the program
<fche> but segfaults, you shouldn't be seeing
<ali_as> debuginfo for systemtap or the thing I'm trying to run that's segfaulting?
<sapatel> ali_as, probably for the program you're running, you need the coreutils debuginfo to be installed
* sapatel ran into a similar issue trying to probe the ls command earlier
<ali_as> I have the source for that and I've compiled with debug info, but it's a combination of c and fortran and I'm having real difficulty making it do anything other than crash.
<fche> for the target program
<fche> what kernel version are you using? some older ones had bugs that caused such symptoms
<fche> try stap -e 'probe process("/bin/ls').function("*") { println(pp()) }'
<ali_as> 4.15.0-88-generic ubuntu 16.04 I think.
<fche> that's about 50% of the callgraph logic, missing the RETURN probes that have been a problem in some older versions of the kernel
<ali_as> I changed a ' to a ", and added -c to run the program, but all I get is a directory listing and,
<ali_as> process("/bin/ls").function("_init")
<ali_as> process("/bin/ls").function("_fini")
<ali_as> And those two lines.
<fche> ok, so that's not a listing exactly, that's actually running (-e)
<fche> no segv this time?
<ali_as> No segf with ls, it was fine before, it just wasn't showing any details of its internals.
<ali_as> Let me try it with the program I'm debugging.
<fche> yeah
<ali_as> Oh, that produced a lot!
mjw has quit [Quit: Leaving]
<ali_as> That produces a lot of information, no parameters, but a lot of function calls and completes to the natural segfault of the program.
<fche> aha
<fche> so wait
<fche> the segfault was there all along, without systemtap?
<fche> ok in that case para-callgraph etc. should be fine to use - including the return probes
<ali_as> para-callgraph seems to be segfaulting before it reaches the normal program error.
<fche> interesting
<fche> ok
<fche> so if you'd like more context information (function parameters)
<fche> but keep the potentially buggy kernel uretprobes away
<ali_as> Ok.
<fche> then change the script to println(pp(), " ", $$parms$)
<fche> which pretty-prints all the function parameters
<ali_as> This is a big improvement, most of the parameters are ? or ERROR though.
<ali_as> There is weirdness in gdb too, I get <optimised out> for a number of variables even though I'm using -O0.
<ali_as> I think the math is still being done.
<ali_as> Would systemtap show ERROR for floats?
<fche> could be
<fche> yeah we don't support floats really; can't do fp in the kernel
orivej has joined #systemtap
mjw has joined #systemtap
<agentzh> is it possible to use stap to probe kvm guests from the host OS? like using uprobes to instrument the guest OS kernel?
<agentzh> assuming the guest OS uses exactly the same operating system env as the host to simplify the discussions here.
<fche> we have had some cross-vm operation set up a way back
<agentzh> oh wow, is it in the source tree already?
<fche> stap --remote qemu:... something I believe
<fche> an agent in the guest would run staprun commands
<fche> --remote=libvirt://MyVirtualMachine
<fche> check out NEWS
<agentzh> hmm, it's not what i meant. i mean we don't run any staprun or ko in the guest OS at all.
<agentzh> to make it completely transparent to the guest OS.
<fche> in that case you're stuck instrumenting the kvm process per se
<agentzh> for example, when the guest OS enters a lockup or in some nonresponsive state, we can still run stap from the host to analyze the guest OS kernel.
<fche> so you can instrument its visible operations upon the host
<fche> but not internals
<fche> I don't think qemu/kvm exports a facility that would let us do that
<agentzh> qemu/kvm does expose a remote gdb server to debug from the host gdb session.
<agentzh> but that's another story.
<fche> yeah but that's far less than what stap would need
<agentzh> indeed.
<agentzh> so we need special facility exposed by qemu/kvm for it?
<fche> well, ponder what it is that stap would want to probe
<fche> and how that would have to work
<fche> but
<fche> with some small amount of cooperation from the guest (stap-virtguest-something subrpm), the --remote=libvirt:// thing can go some way
<fche> it doesn't require compilers/debuginfo/goo in the guest
<fche> but does require staprun and such
<agentzh> the motivation is to use stap to debug stap in extreme cases.
<agentzh> sometimes the guest OS kernel just enters lockup
<agentzh> and something.
<fche> yeah
<fche> if it locks up, it can't help you run kprobes on itself I guess
<fche> but try instrumenting the qemu/kvm engine itself; it even has some sdt.h markers in it
<agentzh> yeah, static tracepoints may be helpful too.
higgins has quit [Quit: Leaving]
amerey has quit [Quit: Leaving]
mjw has quit [Quit: Leaving]
higgins has joined #systemtap
sapatel has quit [Ping timeout: 240 seconds]
ali_as has quit [Quit: Bye!]