fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
simon__ has joined #systemtap
<simon__> fche I'm back! :-)
<simon__> I tried out the callgraph on the big executable and it said:
<simon__> Pass 2: analyzed script: 93757 probes, 79948 functions, 1 embed, 3 globals using 1307148virt/1277392res/7012shr/1272896data kb, in 21310usr/270sys/21594real ms.
<simon__> virtual memory exhausted: Cannot allocate memory
<simon__> Pass 4: compilation failed. [man error::pass4]
<simon__> real 36m34.129s
<simon__> Also, my laptop seemed to completely lock up too while those 36m were going on... :-(
<simon__> Does this mean that neither the macro method nor the callgraph method is going to help me instrument and trace this C/C++ project with ~ 80k functions? Because even if it did not run out of memory, who'd want to wait 30+ minutes to start the executable?
hpt has joined #systemtap
simon__ has quit [Ping timeout: 246 seconds]
simon__ has joined #systemtap
<simon__> hi fche!
<fche> yo
<simon__> hello again!
<simon__> can you see my messages and question from 16:33 ? Or about 1.5 hours ago?
<fche> hm, 90000 probes is a huge number, yea
<fche> just got back
<simon__> kk nice :-)
<fche> (9pm here)
<fche> shouldn't be 30min, unless perhaps the machine was really short of ram and was paging to death
<simon__> kk :-)
<fche> on a larger ram box, that should be manageable, I suspect, but it's larger than usual
<fche> thousands, ok, hundred thousand ... wow!
<simon__> laptop has 32GB of RAM...
<fche> this could be one of those times where agentzh's separate-compilation-of-stap-modules could be necessary
<fche> for a probe job of that magnitude
<fche> could you try smaller, like say one tenth the size?
<simon__> yep, that's what I was thinking too... so I'm just making a script which manufactures a source file with n functions inside it :-)
<simon__> does the linux kernel do things differently, because surely it also has as many probe points, or more?
<fche> %stap -l 'kernel.function("*").call' | wc -l
<fche> 46373
<fche> (stap can probe numerically many more places than function calls, in principle)
<fche> (in practice, memory etc. constraints kick in)
<fche> my poor gcc has been running for five minutes, chugging a stap script for 'probe kernel.function("*") {}' ... shouldn't be that long
<fche> but memory consumption is okay for now
<simon__> I created a script which generates n functions which all call each other and finally the last one returns
<fche> try using not para-callgraph.stp but a simpler one - that doesn't print $$vars type things (which are so inherently context-specific and result in a lot of extra code being generated, to pull out and pretty-print all those local variables)
<simon__> however, para-callgraph.stp appears to fail if the call nesting goes more than 63 functions deep... :-(
<simon__> this is where function bar100() gets called: 1403 zoo(24173): ->bar100 i=0x63
<simon__> but the very next line is: 1413 zoo(24173): <-bar63 return=0x64
<simon__> return lines for bar64() through bar100() are mysteriously missing...
<simon__> Do you know a way to increase the call depth? Or do I need to write a simpler auto-generated C program... :-)
<fche> hm, I can't think of a reason that shouldn't work
<simon__> I came up with a workaround script which generates n functions which are called one at a time instead...
<simon__> perl -e '$funcs=10; printf qq[#include <stdio.h>\n#include "trace.h"\n]; $i=1; while($i <= $funcs){ printf qq[int bar%d(int i){return(1 + i);}\n], $i; $i++; } printf qq[int main(void){ int i=0;\n]; foreach my $i(1..$funcs){ printf qq[ i = bar%d(i);\n], $i; } printf qq[ printf("- i=%%d\\n", i);\n}\n];' | tee zoo.c
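For readers who find the Perl one-liner above hard to scan, here is a rough Python equivalent. It is only a sketch: the function name generate_zoo_source is made up for illustration, and it assumes the same local trace.h header that the one-liner includes.

```python
def generate_zoo_source(funcs: int = 10) -> str:
    """Return C source with `funcs` trivial functions called one at a time from main().

    Hypothetical helper mirroring the Perl one-liner; "trace.h" is the
    author's local header and is assumed to exist next to zoo.c.
    """
    lines = ['#include <stdio.h>', '#include "trace.h"']
    # One tiny function per index: barN(i) returns i + 1.
    for n in range(1, funcs + 1):
        lines.append(f"int bar{n}(int i){{return(1 + i);}}")
    lines.append("int main(void){ int i=0;")
    # Call each function in sequence so the call depth never exceeds one.
    for n in range(1, funcs + 1):
        lines.append(f" i = bar{n}(i);")
    lines.append(' printf("- i=%d\\n", i);')
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(generate_zoo_source(10), end="")
```

Redirecting the output to zoo.c gives the same kind of file that the `| tee zoo.c` step produces; changing the argument scales the number of functions.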
<simon__> possibly a bug in stap? is there a test to test recursion depth bigger than 63?
<fche> stap doesn't experience recursion
<fche> it would experience a sequence of calls at deeper and deeper nesting levels, but stap itself doesn't know or care
<fche> again try a stap script that does not process the $$parms; it should drastically reduce resource requirements
<simon__> do you have such a script handy?
<simon__> so with 10k functions then this source file was generated: -rw-r--r-- 1 root root 69547889 Nov 21 18:32 stap_d59bb5f6406a3529e5d18985a6051fe1_10337070_src.c
<simon__> it's taking a long time to compile...
<simon__> looks like cc1 is using 2.5GB RAM so far...
<simon__> so 1,000 functions took real 0m18.938s, but 10,000 functions took real 5m22.487s
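Taking the two timings just quoted at face value, a quick back-of-the-envelope check suggests the growth is worse than linear. The snippet below is only a sketch: it treats both `real` values as total wall-clock time for the whole stap run and fits a single power law through two points, which is a crude assumption.

```python
import math

# Wall-clock `real` times quoted above, converted to seconds.
t_1k = 18.938              # 1,000 functions
t_10k = 5 * 60 + 22.487    # 10,000 functions

ratio = t_10k / t_1k
# If time grows like n**k, then k = log(ratio) / log(10) for a 10x increase in n.
exponent = math.log(ratio) / math.log(10)

print(f"slowdown: {ratio:.1f}x for 10x more functions")
print(f"implied exponent: {exponent:.2f}")
```

The ratio comes out at roughly 17x for ten times as many functions, so the cost per probe appears to rise as the script gets larger rather than staying constant.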
<simon__> so I'm wondering what's in that file because 69,547,889 / 10,000 is 6,954 bytes per function? Seems a bit on the steep side, or?
<fche> if it involves grabbing a bunch of different context variables and pretty-printing, that could be about right
<fche> if you grab the para-callgraph.stp script file, and replace $$parms with "" and $$return with "" then it won't mess with that
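One hedged way to make that edit without touching the original file is a small text substitution on a copy. The sketch below uses Python; the helper name strip_context_vars and the sample probe lines are illustrative stand-ins, not the verbatim contents of para-callgraph.stp.

```python
def strip_context_vars(script_text: str) -> str:
    """Replace $$parms and $$return with empty string literals.

    Hypothetical helper: a plain textual substitution, so it assumes the
    tokens appear only where a string expression is expected.
    """
    return script_text.replace("$$parms", '""').replace("$$return", '""')


if __name__ == "__main__":
    # Illustrative stand-in lines, not the verbatim script.
    sample = (
        "probe $1.call   { trace(1, $$parms) }\n"
        "probe $1.return { trace(-1, $$return) }\n"
    )
    print(strip_context_vars(sample), end="")
```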
<simon__> I found out where it caches the .c file... :-)
<fche> sssh, tell no one, it's a secret
<simon__> So I guess it's not / 10,000 but really / 20,000 because it's the enter and leave points...
<fche> or you could run stap -k ...
<fche> yup, things add up
<simon__> so that's approx. 3,500 bytes per probe thingy...
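The arithmetic in the last few messages can be checked directly. This is just a sketch using the file size and function count quoted above; the factor of two assumes exactly one entry probe and one return probe per function.

```python
src_bytes = 69_547_889   # size of the generated stap_..._src.c quoted above
functions = 10_000
probes = 2 * functions   # assumed: one entry probe and one return probe per function

bytes_per_function = src_bytes / functions
bytes_per_probe = src_bytes / probes

print(f"{bytes_per_function:,.0f} bytes per function")
print(f"{bytes_per_probe:,.0f} bytes per probe")
```

That gives a little under 7,000 bytes of generated C per function, or about 3,500 bytes per probe, matching the estimates in the conversation.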
<simon__> and there's lots AND LOTS of duplicated lines, e.g.: 20,000 * #define STAP_RETURN(v) do { STAP_RETVALUE = (int64_t) (v); goto out; } while(0)
<simon__> and that's just one of many examples...
<fche> those are harmless tho
<simon__> yy but the compile time starts to go through the roof...
<fche> not because of those
<fche> macros
<simon__> how can I get the actual compile command line for the monster C file?
<fche> stap --vp 02 ish .. stap -k will keep the tmp directory so you can run it for yourself later
<simon__> zoo.c with 10k functions takes 3.1 seconds for gcc to compile... but it looks like the much bigger .c file takes over 5 minutes... admittedly it is much bigger...
<fche> sorry stap --vp 0002 (pass 4 verbosity 2)
<simon__> thanks!
* fche must sign off shortly
<fche> a comfy pillow beckons
<fche> good luck dude and we can talk again tomorrow
<simon__> thanks for your help! and greetings from Vancouver, Canada! :-)
<fche> ah it's just late dinner time for you then
<fche> say hi to the whales in the fraser
<simon__> wow... you know your geography :-)
simon__ has quit [Ping timeout: 240 seconds]
khaled has joined #systemtap
khaled_ has joined #systemtap
khaled has quit [Ping timeout: 240 seconds]
ema_ is now known as ema
hpt has quit [Ping timeout: 265 seconds]
mjw has joined #systemtap
sscox has quit [Ping timeout: 245 seconds]
khaled_ has quit [Read error: Connection reset by peer]
khaled has joined #systemtap
khaled_ has joined #systemtap
khaled has quit [Ping timeout: 246 seconds]
wcohen has quit [Ping timeout: 276 seconds]
sscox has joined #systemtap
wcohen has joined #systemtap
simon__ has joined #systemtap
<simon__> hi again fche!
tromey has joined #systemtap
<fche> hey simon__ -morning
<simon__> hello! yes, morning 9:17 AM for me... but 12:17 PM for you?
<simon__> :-)
<fche> so what's new today
<simon__> fche, may I ask you some more questions about systemtap?
<simon__> yesterday I discovered that for the 10,000 function example, systemtap spends over 5 minutes creating and compiling its .c file which gets compiled to a .ko kernel object file... where can I find more info about the architecture of systemtap and how the .ko file fits into the big picture?
<fche> the docs include an introduction/architecture paper
<simon__> "Original architecture paper (July 2005)." is this the best one?
<fche> it's a good one to start
<fche> the concepts are the same
<simon__> thanks!
<simon__> another question: Yesterday we talked briefly about including and excluding functions in a run-time call-tree trace. What about if I managed to instrument a large executable with ~ 80k functions and wanted to do something more complicated like: Have more control over the verbosity? Give some functions a higher or lower verbosity so that they are included or excluded in the trace depending upon the verbosity level? And also, let's say
<simon__> I would like to run the executable with verbosity switched off, but switch it on when a particular function is first executed? What are ways that you would approach these types of challenges with systemtap?
<fche> 'verbosity' within a script is entirely under your control
<fche> it's a programming language, eh?
<fche> so you print when you want to - you can track nesting levels, function names, time of day, whatever you want
<fche> and you decide when something should be printed
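To make that concrete, here is a small Python model of the decision logic only. It is not SystemTap script and uses no stap API; the event tuples, the per-function levels table, and the trigger function name are all invented for illustration. In a real stap script the same state would presumably live in global variables updated from probe handlers.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, str]  # (kind, function name); kind is "call" or "return"


def filter_trace(events: Iterable[Event], levels: Dict[str, int],
                 verbosity: int, trigger: str) -> List[str]:
    """Return indented trace lines, printing only after `trigger` is first called.

    A function is shown when its level (default 1) is <= `verbosity`.
    All names here are hypothetical; this models the logic only.
    """
    out: List[str] = []
    depth = 0
    enabled = False
    for kind, name in events:
        if kind == "call":
            if name == trigger:
                enabled = True  # switch tracing on the first time trigger runs
            if enabled and levels.get(name, 1) <= verbosity:
                out.append(f"{' ' * depth}->{name}")
            depth += 1
        else:
            depth -= 1
            if enabled and levels.get(name, 1) <= verbosity:
                out.append(f"{' ' * depth}<-{name}")
    return out


if __name__ == "__main__":
    events = [("call", "main"), ("call", "init"), ("return", "init"),
              ("call", "work"), ("call", "helper"), ("return", "helper"),
              ("return", "work"), ("return", "main")]
    for line in filter_trace(events, {"helper": 2}, verbosity=1, trigger="work"):
        print(line)
```

With verbosity 1 the model stays silent through main and init, starts printing at work, and hides helper because its level is 2; raising verbosity to 2 brings helper back into the trace.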
simon__ has quit [Ping timeout: 240 seconds]
khaled_ has quit [Ping timeout: 265 seconds]
khaled_ has joined #systemtap
simon__ has joined #systemtap
gromero_ has joined #systemtap
gromero has quit [Ping timeout: 240 seconds]
irker893 has joined #systemtap
<irker893> systemtap: sapatel systemtap.git:refs/heads/sapatel/pr22315 * release-4.1-101-ga07b09e / bpf-internal.h bpf-translate.cxx stapbpf/bpfinterp.cxx stapbpf/bpfinterp.h stapbpf/stapbpf.cxx tapset/bpf/exit.stp tapset/logging.stp: initial http://tinyurl.com/tj88j8r
<irker893> systemtap: sapatel systemtap.git:refs/heads/sapatel/pr22315 * release-4.1-102-gf8e7cd4 / bpf-internal.h bpf-translate.cxx stapbpf/bpfinterp.cxx tapset/bpf/exit.stp tapset/logging.stp: polishing http://tinyurl.com/sxjz26d
orivej has quit [Ping timeout: 250 seconds]
orivej has joined #systemtap
wcohen has quit [Ping timeout: 252 seconds]
simon__ has quit [Quit: Leaving]
wcohen has joined #systemtap
khaled_ has quit [Remote host closed the connection]
khaled has joined #systemtap
khaled has quit [Client Quit]
khaled has joined #systemtap
tromey has quit [Quit: ERC (IRC client for Emacs 26.1)]
sscox has quit [Ping timeout: 240 seconds]
amerey has quit [Quit: Leaving]
sscox has joined #systemtap