fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
orivej_ has joined #systemtap
orivej has quit [Ping timeout: 272 seconds]
orivej_ has quit [Ping timeout: 248 seconds]
orivej has joined #systemtap
hpt has joined #systemtap
wcohen has quit [Ping timeout: 265 seconds]
wcohen has joined #systemtap
sanoj has joined #systemtap
sanoj has quit [Quit: Leaving]
sanoj has joined #systemtap
nkambo has joined #systemtap
slowfranklin has joined #systemtap
irker727 has quit [Quit: transmission timeout]
sanoj has quit [Ping timeout: 256 seconds]
sanoj has joined #systemtap
slowfranklin has quit [Quit: slowfranklin]
hpt has quit [Quit: Lost terminal]
slowfranklin has joined #systemtap
slowfranklin has quit [Client Quit]
wcohen has quit [Ping timeout: 265 seconds]
CME has quit [Ping timeout: 264 seconds]
CME has joined #systemtap
wcohen has joined #systemtap
slowfranklin has joined #systemtap
pwithnall has joined #systemtap
pwithnall has quit [Read error: Connection reset by peer]
pwithnall has joined #systemtap
pwithnall has quit [Client Quit]
mjw has joined #systemtap
nkambo has quit [Ping timeout: 240 seconds]
nkambo has joined #systemtap
gila has quit [Quit: My Mac Pro has gone to sleep. ZZZzzz…]
gila has joined #systemtap
fche has quit [Read error: error:1408F10B:SSL routines:ssl3_get_record:wrong version number]
fche has joined #systemtap
fche has quit [Changing host]
fche has joined #systemtap
fche has quit [Read error: error:1408F10B:SSL routines:ssl3_get_record:wrong version number]
fche has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
slowfranklin has quit [Quit: slowfranklin]
slowfranklin has joined #systemtap
scox has quit [Ping timeout: 248 seconds]
gila has quit [Quit: My Mac Pro has gone to sleep. ZZZzzz…]
gila has joined #systemtap
slowfranklin has quit [Quit: slowfranklin]
gromero has joined #systemtap
slowfranklin has joined #systemtap
slowfranklin has quit [Client Quit]
sanoj has quit [Ping timeout: 255 seconds]
pwithnall has joined #systemtap
orivej has joined #systemtap
pwithnall has quit [Read error: Connection reset by peer]
nkambo has quit [Ping timeout: 265 seconds]
pwithnall has joined #systemtap
pwithnall has quit [Read error: Connection reset by peer]
mbenitez has joined #systemtap
mbenitez has quit [Changing host]
mbenitez has joined #systemtap
wcohen has quit [Ping timeout: 265 seconds]
nkambo has joined #systemtap
scox has joined #systemtap
wcohen has joined #systemtap
brolley has joined #systemtap
drsmith_away has left #systemtap [#systemtap]
drsmith_away has joined #systemtap
slowfranklin has joined #systemtap
pwithnall has joined #systemtap
pwithnall has quit [Read error: Connection reset by peer]
tromey has joined #systemtap
gila has quit [Quit: My Mac Pro has gone to sleep. ZZZzzz…]
<toothe> I really like systemtap, but I keep getting problems
<toothe> now it's not even starting.
<fche> bummer, are you getting error messages?
<toothe> when I use -v, I can see that it gets to stage 5. But then never progresses.
<fche> are you sure it's not started?
slowfranklin has quit [Quit: slowfranklin]
<toothe> possibly not - but I was expecting to see a ton of messages. Let me show
<fche> maybe your script just doesn't probe things that are actually run
<fche> (try stap -t .... to get a hit-count for each probe)
<toothe> right. I could be mistaken. I apologize...
<fche> hey no problem, am curious what's up too
<toothe> okay - just a moment...
<toothe> and now it's working perfectly fine....very odd.
<toothe> i think i was wrong
<toothe> okay...well...it works now.
<toothe> no idea what changed from last evening.
<fche> this whole area is still sort of unsettled science - stap lets people express instrumentation paths that haven't all been trod upon by kernel folks
<fche> so there can be latent problems here and there
<fche> (once in a while, in stap, other times in the kernel)
<toothe> wonder if it works on Ubuntu yet.
slowfranklin has joined #systemtap
<toothe> so, I ran into one issue
<toothe> my instruments are happening all at once, so rather than following one kernel thread, I'm getting multiple threads triggering at once.
<toothe> is there a way to only follow a single kernel thread? :)
<fche> you can put an if (pid() == ....) or (cpu() == ...) or such conditional in
<toothe> yes...
<toothe> good call.
<toothe> well, I did that. I actually got a lot of repeat values.
<fche> that will happen if you have a loop, or a multithreaded program with same pid() perhaps
<toothe> what about tid?
<fche> tid() is available too.
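A minimal sketch of the filter fche describes here, not toothe's actual script. The module name rtl_pci is taken from the probe toothe quotes later in this log; `want_tid` and its value 1234 are placeholders invented for illustration.

```systemtap
# Hypothetical placeholder: the thread id to follow.
global want_tid = 1234

probe module("rtl_pci").function("*") {
  # 'next' leaves this probe handler early for every other thread.
  if (tid() != want_tid) next;
  printf("cpu=%d pid=%d tid=%d %s\n", cpu(), pid(), tid(), ppfunc());
}
```

The same shape works with pid() or cpu() in the condition; which one is useful depends on what actually distinguishes the events of interest.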
<toothe> odd - my code doesn't seem to end.
<fche> should it? is there an exit()?
<toothe> let me share my code.
<toothe> notice it starts at _rtl_pci_interrupt
<fche> yup
<fche> (btw you don't have to initialize globals - or any variables - to zero)
<toothe> I was expecting once the pid and tid change, the program would stop (not exit)
<fche> (the (mytid == tid()) test should subsume the other two.)
<toothe> ahh
<toothe> so...why am I getting code clearly out of order?
<fche> what is out of order?
<toothe> so...if the function funcA() is run, then funcB() should run
<toothe> i'll get funcA(), then funcQ(), then funcB()
<fche> maybe compiler inlining again?
<toothe> perhaps?
<fche> maybe more than one rtl_pci_interrupt hit? so mypid/mytid got updated ?
<toothe> but, I am getting cpu() that is not correct.
<toothe> right - but shouldn't I only have 1?
<toothe> shouldn't only a single kernel thread run?
<fche> at a time, on a cpu, yes ... but if it's an smp machine, or something gets preempted, no
<fche> mypid/mytid are -global- to the stap script
<toothe> hm...so how do I do this?
<fche> you just want to pick -one- 'winner' and block others?
<toothe> yes.
<fche> in the rtl_pci_interrupt probe, put a conditional if (start) next;
<fche> at the top
<fche> so there can be only one (tm)
<toothe> I don't follow?
<fche> probe ... ("..._rtl_pci_interrupt") { if (start) next; start=1; mypid=pid(); mytid=tid() }
<fche> that way once set, those variables will be unchanged
<fche> ('next' is like return from a probe)
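Spelled out as a full script, fche's one-liner might look like the sketch below. This is an illustration under assumptions: the module name rtl_pci comes from the probe toothe quotes later in this log, the second probe is a guess at how the latched value would be used, and only `mytid` is kept because fche notes above that the tid test subsumes the others.

```systemtap
global start, mytid

probe module("rtl_pci").function("_rtl_pci_interrupt") {
  if (start) next;   # a winner was already picked; leave this handler
  start = 1;         # latch: later hits take the 'next' branch above
  mytid = tid();     # remember the thread that hit the probe first
}

probe module("rtl_pci").function("*") {
  # Report only events from the latched thread.
  if (!start || tid() != mytid) next;
  printf("cpu=%d tid=%d %s\n", cpu(), tid(), ppfunc());
}
```

Because `start` is never reset, the latch stays set for the rest of the run, which matches fche's remark that the variables will be unchanged once set.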
* toothe gives that a shot
<toothe> i don't follow 'next' yet.
<toothe> and, I think I'm doing the other part.
<fche> which is the other part?
<toothe> oh wait...
<toothe> okay, I'm just being a bad programmer....
<fche> I WILL ALLOW IT
<toothe> what does 'next' do ?
<toothe> is that the equivalent of 'return' ?
<fche> 'like a return from a probe' :)
<fche> it's in [man stap]
<fche> will clarify in [man stapref]
<toothe> my code was just borked
<toothe> nvmd heh.
<fche> DONE
<toothe> i was expecting the captures to eventually stop.
<toothe> i know, i still initialize the variables, but that's just a personal thing.
<toothe> i was expecting this to stop capturing after a while. I don't see what I'm doing wrong.
<fche> np, it's harmless
<fche> when would you like it to stop?
<toothe> I suppose until the end of this particular CPU thread?
irker708 has joined #systemtap
<irker708> systemtap: fche systemtap.git:refs/heads/master * release-3.2-75-g91bb795 / man/stapref.1: stapref.1: explain 'next' and 'return' - in \fIitalics\fP http://tinyurl.com/y9yephfm
<toothe> I would at least be able to trace one particular thread cleanly.
<toothe> right now, multiple threads are mixed up.
<toothe> even if it captures multiple at once, that's fine. I just need a way to distinguish between threads/CPUs
<fche> you'd have to trap that thread's death, and maybe reset your 'start' state? dunno, or just exit()
<toothe> sure.
<fche> you can always -print- tid() etc.
<toothe> or at least distinguish between two threads?
<toothe> but look - here, I am clearly filtering by the tid(), no?
<fche> yes you are
<fche> so not sure how they are being mixed up
<fche> but might as well print something, ... just to help debug the script
<toothe> hm...
<toothe> fair.
gila has joined #systemtap
<fche> (btw --suppress-time-limits includes the overload-related facilities, if the man page is telling the truth)
<toothe> I don't recall why I used that...
<toothe> yeah, I'm still getting mixed up probes.
<toothe> in random orders.
<toothe> I'm stuck.
<toothe> In this driver, an IRQ interrupt happens at any random moment.
<toothe> I want to know what happens after one specific interrupt is triggered - and follow that kernel thread to the end.
<toothe> I keep getting clearly random threads' code execution.
<toothe> and/or stuff seems to be dropped.
<fche> maybe just trace all threads, separate the thread records in post-processing?
<toothe> how? :)
<fche> can sort by the tid() key column?
<toothe> why is my cpu and thread id almost always 0 ?
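One way to read that suggestion, as a sketch only: emit tid() as the first column and a timestamp as the second, then sort the saved output outside of stap. The column layout is an assumption made for this example, and the module name rtl_pci is taken from the probe toothe quotes later in this log.

```systemtap
# Trace every thread; filtering is deferred to post-processing.
# Columns: tid, timestamp in ns, cpu, pid, function name.
probe module("rtl_pci").function("*").call {
  printf("%d %d %d %d %s\n",
         tid(), gettimeofday_ns(), cpu(), pid(), ppfunc());
}

# Afterwards, on the saved output, something like
#   sort -n -k1,1 -k2,2
# groups the records by tid and keeps each group in time order.
```

This only separates the records if the events of interest really run under different thread ids.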
<toothe> even with the same thread id, same cpu id and same pid, a function gets called multiple times in a row
<toothe> how is that possible?
<toothe> and the thread id and cpuid are always 0
<fche> is the function called multiple times from some routine?
<toothe> no.
<fche> (are you sure?)
<toothe> lol, sure - I can show the code.
<toothe> It's calling line 1035.
<toothe> the function _rtl_pci_rx_interrupt is called multiple times in a row.
<fche> yeah, one per interrupt event ?
<toothe> yes.
<toothe> that will only happen once
<toothe> in fact, that is the interrupt handler code.
<toothe> so, unless there are multiple times that is triggering.
<toothe> But they're all still on the same CPU id, thread id, and what have you.
<fche> the function("*") probe will -also- fire for _rtl_pci_interrupt
<fche> and note you're not filtering that one
<toothe> i am...no? let me show code.
<toothe> the cpu() is 1, but the mypid and mytid are always 0.
<fche> that's not unusual - pid/tid 0 is the kernel idle thread IIRC
<fche> that's the guy that is often running when hardware interrupts happen to happen
<toothe> fair. Then...how do I filter this? :)
<fche> I'd have to understand what's wrong first :)
<fche> that idle thread won't die
<fche> and the hw interrupt could hit it repeatedly
<fche> (or some other thread, possibly, depending on how busy the machine is)
<toothe> right. But doesn't the start = 1 prevent that from mattering?
<toothe> oh wait...unless they share the same thread/cpu/pid.
<toothe> hm...then I'm confused.
<fche> yeah. so then all the filtering predicates in the other probes will be successful ever after
<toothe> i'm stuck then heh
<fche> well no, you just haven't defined the problem well :)
<toothe> fair enough lol
<fche> need a clear definition of 'begin' and 'end'
<toothe> sure.
<toothe> So, I have this wifi card that is constantly receiving data.
<toothe> I want to know what happens when an interrupt is triggered.
<fche> it could be that probe function("_rtl_pci_interrupt").return is the 'end'
<toothe> Specifically, I want each instance of the CPU trigger to be mapped out in a readable way.
<toothe> "mapped out" -- meaning, traced.
<toothe> possibly...
<fche> right so the question is how to identify the 'cpu trigger' beginning & end.
<toothe> so, I just added this: probe module("rtl_pci").function("_rtl_pci_interrupt").return { exit(); }
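A sketch of the 'begin'/'end' idea that keeps tracing across interrupts instead of exiting after the first one. Keying the state on cpu() is an assumption made for this example (it relies on the handler and the functions it calls staying on one CPU), and the `active` array is a hypothetical name; neither comes from toothe's script.

```systemtap
global active   # active[cpu] exists while that CPU is inside the handler

# 'begin': the interrupt handler is entered on this CPU.
probe module("rtl_pci").function("_rtl_pci_interrupt").call {
  active[cpu()] = 1;
}

# Trace module functions only between 'begin' and 'end' on the same CPU.
probe module("rtl_pci").function("*").call {
  if (!(cpu() in active)) next;
  printf("cpu=%d tid=%d %s\n", cpu(), tid(), ppfunc());
}

# 'end': the handler returns, so stop tracing on this CPU.
probe module("rtl_pci").function("_rtl_pci_interrupt").return {
  delete active[cpu()];
}
```

Work that the handler defers to run after it returns would fall outside this window, so the choice of 'end' still depends on what toothe wants to see.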
<toothe> btw, can this be done from the commandline instead of editing a textfile?
mbenitez has quit [Quit: Leaving]
mbenitez has joined #systemtap
mbenitez has joined #systemtap
orivej has quit [Ping timeout: 240 seconds]
<fche> to add another probe? yessir
<fche> stap -E 'another_script_fragment' foo.stp
slowfranklin has quit [Quit: slowfranklin]
mbenitez has quit [Quit: Leaving]
tromey has quit [Quit: ERC (IRC client for Emacs 26.0.90)]
<irker708> systemtap: dsmith systemtap.git:refs/heads/master * release-3.2-76-g64368c6 / httpd/backends.cxx: Perform some docker-related cleanup in the web service. http://tinyurl.com/ydgdxljl
orivej has joined #systemtap
scox has quit [Ping timeout: 248 seconds]
wcohen has quit [Remote host closed the connection]
gila has quit [Quit: My Mac Pro has gone to sleep. ZZZzzz…]
brolley has left #systemtap [#systemtap]
wcohen has joined #systemtap
mjw has quit [Quit: Leaving]