#systemtap on 2019-11-21 — irc logs at freenode.irclog.whitequark.org

2015-11-12 23:18 fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged

01:20 sscox has quit [Ping timeout: 250 seconds]

01:25 orivej has quit [Ping timeout: 252 seconds]

01:38 sscox has joined #systemtap

01:55 hpt has joined #systemtap

03:51 orivej has joined #systemtap

07:05 hpt has quit [Ping timeout: 265 seconds]

07:34 tonyj has quit [Remote host closed the connection]

08:46 dmalcolm__ has joined #systemtap

08:48 dmalcolm_ has quit [Ping timeout: 245 seconds]

08:51 dmalcolm__ has quit [Ping timeout: 240 seconds]

08:59 mjw has joined #systemtap

09:57 orivej has quit [Ping timeout: 240 seconds]

10:00 orivej has joined #systemtap

13:04 gromero has joined #systemtap

13:26 wcohen has quit [Ping timeout: 240 seconds]

13:43 tromey has joined #systemtap

13:58 hpt has joined #systemtap

14:23 wcohen has joined #systemtap

14:44 hpt has quit [Ping timeout: 250 seconds]

15:42 orivej has quit [Ping timeout: 245 seconds]

17:02 khaled has joined #systemtap

17:54 tonyj has joined #systemtap

17:58 simon__ has joined #systemtap

17:59 simon__ has quit [Client Quit]

18:01 simon__ has joined #systemtap

18:26 sscox has quit [Ping timeout: 240 seconds]

18:41 sscox has joined #systemtap

19:09 orivej has joined #systemtap

19:36 <simon__> hello! I'm trying my very first systemtap on some own code... and it's almost working! can anybody kindly help me debug it?

19:36 <simon__> I'm following the instructions here: https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps

19:37 <simon__> here's my code:

19:40 <simon__> https://controlc.com/4af1b930

19:40 <simon__> seems to compile and run as normal, and I can even list the instrumentation points :-)

19:41 <simon__> however... when I try this command it fails: sudo stap foo_tap_all.stp -c ./foo

19:41 <simon__> cat foo_tap_all.stp

19:42 <simon__> probe foo* { printf("%s\n", probestr); }

19:42 <fche> hi

19:42 <simon__> it says: semantic error: while resolving probe point: identifier 'foo*' at foo_tap_all.stp:1:7

19:42 <simon__> and: semantic error: probe point mismatch: didn't find any wildcard matches (similar: nfs, nfsd, vfs, vm, _nfs): identifier 'foo*' at :1:7

19:42 <simon__> hi!

19:43 <simon__> what am I missing here? why doesn't it work?

19:43 <fche> hm, probe foo* - does that expand to something via another stap script here?

19:43 <fche> probe process("foo").mark("*") is how you'd refer to them

19:43 <fche> just how the -L operation lists them

19:45 <simon__> thanks... I was just following the syntax on the wiki page posted above... but your syntax makes more sense :-)

19:45 <fche> ah let's fix the wiki page :)

19:46 <simon__> :-)

19:46 <fche> ah I see what's going on there

19:46 <fche> see the 'adding a tapset' subsection

19:46 <simon__> If I change foo_tap_all.stp to: probe process("foo").mark("*") { printf("%s\n", probestr); }

19:46 <fche> you skipped that part

19:46 <simon__> ahhh...

19:46 <fche> it's not mandatory by any means, but if your script is going to refer to those symbols, you need it

19:48 <simon__> ahhh... so I need a tapset *and* the .stp file...

19:48 <fche> or have your .stp file use process().mark() directly.

19:49 <simon__> hmmm... the example uses "probestr" which I guess is in the tapset...

19:50 <fche> would have been

19:50 <fche> if you put it there

19:50 <fche> search the wiki page for "probestr = sprintf ("....')

19:50 <simon__> It says "The tapsets are typically placed in /usr/share/systemtap/tapsets" ... is there also a way to have the tapsets locally?

19:51 <fche> yes, you can put some in any directory you like, and name it with stap -I/path/../
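
A minimal sketch of the local-tapset route fche describes here, assuming a hypothetical ./tapset directory next to the script. The alias body is the one simon__ quotes later in this log; the stap invocation is left as a comment because it needs root and an installed stap.

```shell
# Keep the probe alias in a local tapset directory (hypothetical name: ./tapset).
mkdir -p tapset
cat > tapset/foo.stp <<'EOF'
probe foo_bar_enter = process("foo").mark("bar_enter") {
  a = $arg1
  b = $arg2
  probestr = sprintf("%s(a=%d, b=%d)", $$name, a, b)
}
EOF

# The script itself only refers to the alias names.
cat > foo_tap_all.stp <<'EOF'
probe foo* { printf("%s\n", probestr) }
EOF

# Then point stap at the directory with -I, for example:
#   sudo stap -I ./tapset foo_tap_all.stp -c ./foo
```

The quoted 'EOF' delimiter keeps the shell from expanding $arg1 and $$name while the files are written.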

19:51 <simon__> Yep, 'probestr' is in the tapset and referenced by the .stp file in the wiki page example...

19:51 <simon__> thanks! I'll try that...

19:52 <fche> if you just want one-off scripts, don't bother with tapset files

19:52 <fche> you could put those probe aliases right into your final .stp file, if it provides value

19:52 <simon__> hmmm.. okay

19:58 <fche> btw, what was the attraction of the dtrace style trace macros for your purposes? just checking why debuginfo-based instrumentation is not sufficient

19:58 <fche> for your case

19:58 <simon__> So I now have a 2 line foo_tap_all.stp file which it seems to like:

19:59 <simon__> line 1: probe foo_bar_enter = process("foo").mark("bar_enter") { a = $arg1; b = $arg2; probestr = sprintf("%s(a=%d, b=%d)", $$name, a, b); }

19:59 <simon__> line 2: probe foo* { printf("%s\n", probestr); }

19:59 <simon__> however, when running this command there are other errors: probe foo* { printf("%s\n", probestr); }

20:00 <fche> what are the errors and the command invocation?

20:00 <simon__> error line 1: /usr/share/systemtap/runtime/linux/access_process_vm.h: In function ‘__access_process_vm_’:

20:00 <simon__> error line 2: /usr/share/systemtap/runtime/linux/access_process_vm.h:35:29: error: passing argument 1 of ‘get_user_pages’ makes integer from pointer without a cast [-Werror=int-conversion]

20:01 <fche> ok, pass-4 errors are described in [man error::pass4]

20:01 <simon__> error line 3: ret = get_user_pages (tsk, mm, addr, 1, write, 1, &page, &vma);

20:01 <fche> usually means that your copy of systemtap is much older than your kernel

20:01 <fche> a recent stap version should work with any older kernel

20:02 <simon__> stap --version says: Systemtap translator/driver (version 2.9/0.165, Debian version 2.9-2ubuntu2 (xenial))

20:02 <simon__> so getting a newer version of stap should fix that error?

20:02 <fche> yes.

20:03 <fche> 2.9 is 2015-10-08, so four years old

20:03 <simon__> kk thanks! I'll see if I can do that...

20:03 <fche> if your kernel is much newer than that, you need

20:03 <simon__> yeah, I'm running Ubuntu 16.04 LTS ...

20:03 <fche> sapatel, hey btw, were you going to update https://sourceware.org/systemtap/wiki/ etc. with the new version numbers?

20:04 <sapatel> fche, sure but, when I ran the script to update the HTML docs, it was producing an error

20:05 <fche> yup, this is a separate thing

20:05 <fche> this is manual

20:05 <sapatel> ohh ok

20:05 <sapatel> fche, alright I'll update them

20:05 <fche> hm you'd need to create a wiki account first

20:06 <simon__> "what was the attraction of the dtrace style trace macros for your purposes?" I would like to instrument a larger C/C++ code base with about 1,000 source files to trace function calls at run-time...

20:06 <fche> simon__, you don't need the dtrace probe macros for that - you certainly wouldn't want thousands of them

20:06 <fche> sapatel, use wiki account name SagarPatel

20:07 <sapatel> gotcha

20:07 <fche> https://sourceware.org/systemtap/examples/#general/para-callgraph.stp <<< simon__

20:07 <simon__> I could auto instrument with or without dtrace style trace macros, but it seems to me that if I used dtrace then I could potentially include the instrumentation in the release build and then be able to pick and choose which functions to trace?

20:07 <fche> you can do that regardless

20:08 <fche> release build - if that build lacks all debuginfo, then yeah dtrace style markers are your remaining choice

20:13 <simon__> thanks! so I tried para-callgraph.stp on ./foo but got this error in addition to the version based errors mentioned above...

20:14 <fche> the version stuff has to be dealt with independently, no matter what

20:14 <simon__> command: sudo stap para-callgraph.stp 'process("./foo").function("*")' 'process("./foo").function("main")' -c ./foo

20:14 <fche> that old stap can't work with this new kernel

20:14 <simon__> error line 1: WARNING: function _start return probe is blacklisted: keyword at para-callgraph.stp:24:1

20:14 <simon__> error line 2: source: probe $1.return { trace(-1, $$return) }

20:15 <simon__> I'm looking into how to upgrade systemtap... do you think this latest error is also to do with the systemtap version?

20:16 <fche> I don't see an error in what you quoted, just a warning

20:16 <simon__> ahhhh... good point... :-)

20:17 <simon__> hmmm... systemtap seems so specialized that it's difficult to google for instructions on how to get a newer version for

20:17 <simon__> Ubuntu... :-(

20:17 <fche> you can build it yourself, or file a bug with the ubuntu maintainers to update their version

20:17 <fche> is 16.04 lts still in supported category?

20:18 <simon__> yep... LTS stands for Long Term Support: https://wiki.ubuntu.com/Releases

20:19 <simon__> until 2021 :-)

20:19 <fche> ok, so yeah, I'd file a bug with them about getting a newer version built

20:19 <fche> some of the ubuntu releases keep track of upstream stap quite well

20:19 <fche> dunno why that particular one would be 4 years out of date, but alas

20:19 <fche> anyway - building it for yourself is not that bad

20:20 <simon__> do you happen to have a link for instructions for that?

20:21 <fche> https://sourceware.org/systemtap/getinvolved.html try it out

20:23 <simon__> thanks! I'll have a go :-)

20:25 <simon__> Is this recent one okay to use? https://sourceware.org/systemtap/ftp/releases/systemtap-4.2.tar.gz

20:25 <fche> yes

20:26 <simon__> kk thanks!

21:08 CME_ has joined #systemtap

21:09 CME has quit [Remote host closed the connection]

21:10 ema_ has joined #systemtap

21:10 CME_ is now known as CME

21:12 mjw has quit [Ping timeout: 240 seconds]

21:12 ema has quit [Ping timeout: 240 seconds]

21:12 mjw has joined #systemtap

21:19 mjw has quit [Quit: Leaving]

21:20 mjw has joined #systemtap

21:34 mjw has quit [Quit: Leaving]

21:51 khaled has quit [Quit: Konversation terminated!]

21:53 <simon__> fche, I managed to ./configure ... :-)

21:53 <simon__> Now make all is progressing... :-) fingers crossed...

21:57 <simon__> hmmmm... error: "Making all in python: ImportError: No module named setuptools"

21:59 <simon__> seems to be fixed with the command: sudo apt-get install python-setuptools

22:00 <simon__> hmmm... now it's saying that "make all" has finished :-)

22:00 <simon__> is there any way to test it without installing it? :-)

22:02 <simon__> hmmm... after I built up some courage I tried: sudo make install

22:03 <simon__> but it failed :-(

22:04 <simon__> error seems to be: cp: cannot stat '/home/simon/systemtap-4.2/doc/SystemTap_Tapset_Reference/tapsets.pdf': No such file or directory

22:05 <simon__> indeed... that PDF file does not exist :-(

22:06 <simon__> this command finds 4 other PDFs... but not that one... any ideas? find . -type f | egrep -i pdf

22:16 <fche> hmm

22:16 <fche> well a 'make -k install' or even -i should be okay, just skip the docs

22:16 <fche> you don't have to use sudo make install btw - configure with a personal directory as --prefix= and then you just need sudo $prefix/bin/stap but nothing else
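
A rough sketch of the personal-prefix build fche suggests, wrapped in a hypothetical helper function so nothing runs until it is called from the unpacked systemtap source directory. The function name and the install location are assumptions for illustration only.

```shell
# Hypothetical helper: configure, build and install systemtap under a
# personal prefix, so "sudo make install" is not needed.
build_stap() {
    prefix="$HOME/systemtap-local"   # assumed install location
    ./configure --prefix="$prefix" &&
    make all &&
    make install &&
    "$prefix/bin/stap" --version
}
```

If the documentation step fails during the install, as it did above, 'make -k install' is the fallback fche mentions. After that, only the stap run itself needs sudo, e.g. sudo "$HOME/systemtap-local/bin/stap" with the script as argument.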

22:22 <simon__> kk thanks! trying... :-)

22:24 mjw has joined #systemtap

22:24 <simon__> I'm trying: $ ./configure --prefix=/home/simon/systemtap-4.2-19055 && make all

22:25 <simon__> But ./configure also said: configure: Running systemtap uninstalled, entirely out of the build tree, is not supported.

22:26 <simon__> so the prefix thing is a local install and not the build tree?

22:26 <fche> yeah, a --prefix makes it work easier (fewer environment variable complications)

22:26 <fche> build tree is the place you run configure/make from

22:26 <fche> the prefix is the directory under which 'make install' will copy the results, and from where you can run most easily

22:26 <simon__> kk

22:28 <simon__> $ ~/systemtap-4.2-19055/bin/stap --version

22:29 <simon__> Systemtap translator/driver (version 4.2/0.165, non-git sources)

22:29 <simon__> do I need to set up any env vars etc or can I just run it like that?

22:29 <fche> just run it like that

22:29 <simon__> also, is there anything like make test?

22:29 <fche> make check

22:29 <fche> sudo make installcheck

22:30 tromey has quit [Quit: ERC (IRC client for Emacs 26.1)]

22:30 wcohen has quit [Ping timeout: 250 seconds]

22:31 <simon__> Makefile:513: recipe for target 'installcheck' failed

22:33 <simon__> There's no obvious error message etc on the terminal :-(

22:33 <simon__> this is lower down: ./execrc: 1: eval: runtest: not found

22:38 <simon__> anyway regardless, now this command from earlier works except for the warning: sudo ~/systemtap-4.2-19055/bin/stap para-callgraph.stp 'process("./foo").function("*")' 'process("./foo").function("main")' -c ./foo

22:40 <fche> the testsuite relies on a package called dejagnu

22:40 <simon__> hmmm... except there is no final line for: TRACE(FOO_MAIN_LEAVE());

22:41 <fche> not sure why that'd be but note that this para-callgraph script doesn't use the dtrace-flavoured macros at all

22:42 mjw has quit [Quit: Leaving]

22:43 <simon__> sudo make installcheck appears to be running now :-)

22:43 <fche> it can take several hours to complete

22:43 <fche> so don't sit there waiting for it

22:43 <fche> -and- depending on machine (kernel etc.), several thousand PASSes and several hundred FAILs are about normal

22:45 <simon__> hehe now I know why it's not advertised in the build instructions :-)

22:47 <fche> ah it's there

22:47 <simon__> another thing I'm confused about: ./foo outputs its "- r=55" at the end... but the callgraph is output even after all that, i.e. later... why?

22:47 <fche> To run the full test suite from the build tree, install dejagnu,

22:47 <fche> then run with root privileges:

22:47 <fche> # make installcheck

22:47 <simon__> kk :-)

22:48 <simon__> how would I run the para-callgraph example on ./foo but ask it to only output for main() and not bar() ?

22:48 <fche> if you have only two functions, but want to trace only one of them, it's not a callgraph any more :)

22:48 <simon__> i.e. all functions but exclude bar()

22:49 <fche> couple of ways

22:49 <simon__> haha :-)

22:49 <fche> there is a {foo,bar} syntax supported in function name strings

22:49 <fche> or

22:50 <fche> trace them all with a broad wildcard, but skip ones you don't want via a runtime test ( if(ppfunc() =~ "main") next; ...
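
A small sketch of the second option (broad wildcard plus a runtime test), assuming the two-function ./foo from this conversation. The script name is hypothetical. Note that fche's =~ "main" example would skip main; this one uses an exact comparison to skip bar(), which is what simon__ asked for.

```shell
cat > trace_except_bar.stp <<'EOF'
# Probe every function entry in ./foo, but skip bar() at run time.
probe process("./foo").function("*").call {
  if (ppfunc() == "bar") next
  printf("-> %s\n", ppfunc())
}
EOF

# Run it as root against the program, for example:
#   sudo stap trace_except_bar.stp -c ./foo
```

The probe still fires for bar(); the 'next' statement just leaves the handler before anything is printed.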

22:53 <simon__> another thing I'm confused about: ./foo outputs its "- r=55" at the end... but the callgraph is output even after all that, i.e. later... why?

22:56 <fche> the object code can have more bits in it after the printf

22:58 <simon__> how do you mean?

22:58 <fche> after the printf runs within foo proper

22:58 <fche> stap can still continue running monitoring the rest of foo; plus stap's own i/o buffering can delay its reports from earlier in time

22:59 <simon__> ahhhh... makes sense... is there a way to make stap have line buffered output?

22:59 <fche> it's not a granularity issue but a timing issue really -- it's best not to intermingle the output streams

23:00 <fche> you wouldn't want dogs & cats living together etc.

23:00 <simon__> hehe kk

23:01 <simon__> when I run time ./foo on its own then it takes "real 0m0.002s" but with stap time says "real 0m0.935s" ... why the huge overhead and what is that time spent on?

23:01 <fche> stap's own processing time - especially the first time you run a script - it needs to do a bunch of work to analyze your program, translate the script to C and compile THAT etc etc

23:02 <fche> there's a lot going on behind the scenes

23:02 <simon__> is that before foo starts?

23:03 <fche> if you're running stap -e 'probe ...' -c foo then mostly yes

23:06 <simon__> I've noticed that the first time I run it with a new config -- i.e. .function("*") changed to .function("bar") -- then it can take even longer, e.g. 3.2 seconds... but running it again causes it to revert to the 1 second again... is something cached in the background or why even longer on the first try?

23:07 <fche> yes, cached

23:07 <fche> add a stap -v to ask it to report a little about that

23:11 <simon__> thanks... now I see 5 passes :-)

23:21 wcohen has joined #systemtap

23:22 <simon__> I also now got the foo_tap_all.stp file to work as expected :-)

23:22 <simon__> it seems similar results and execution time to the para-callgraph.stp example, except you need to insert all the macros...

23:24 <simon__> I have seen it take up to 10 seconds on a non-cached run the first time on this little foo.c example...

23:24 <simon__> If I had an enormous 1,000 object file executable, would it take 1,000 * 10 seconds on the first run?

23:42 simon__ has quit [Ping timeout: 245 seconds]