fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
scox has joined #systemtap
adgud has joined #systemtap
<adgud>
it's me again :) why does a timer.profile probe that counts userspace tasks with user_mode() report more kernel tasks than user tasks, while vmstat's CPU utilization shows more %us than %sy?
<adgud>
it's like completely inverse: in stap 10% us, 60% sy; in vmstat 40% us, 6% sy
<fche>
the kernel accounting logic accounts for time only at particular scheduling points; it -might- (not sure) credit some types of kernel activities as something other than sys% for example
<fche>
the stap script counts more directly, but is limited to sampling intermittently. if the user/sys transitions occur much more frequently than the stap timer.profile sampling rate, both sets of numbers can be "right" but still be different
<adgud>
but documentation says that "Profiling timers are available to provide probes that execute on all CPUs at each system tick"
<adgud>
that would make timer.profile more accurate than other tools, wouldn't it?
<adgud>
I thought that probing at every system tick is as accurate as it gets
<fche>
system tick ~= 100 Hz
<adgud>
well, then context switches may occur much more often
<fche>
so sampling the system state at that frequency is accurate -if- there are no sampling artifacts, e.g. if system behavior does not correlate with system ticks
<fche>
exactly
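The sampling-artifact point can be illustrated with a toy model. This is only a sketch, not how the kernel or stap account for time: it assumes a hypothetical workload that repeats every 10 ms, spending the first 4 ms of each period in user mode and the rest in kernel mode, and compares a sampler locked to that period against one whose interval is slightly detuned. The function name and all numbers are made up for illustration.

```python
def user_fraction_sampled(period_us, user_us, sample_interval_us,
                          n_samples, phase_us=0):
    """Return the fraction of samples that land in user mode.

    Hypothetical workload: each period begins with `user_us` microseconds
    of user-mode work; the remainder of the period is kernel-mode work.
    """
    hits = 0
    for i in range(n_samples):
        # Position of this sample inside the workload's repeating period.
        t = (phase_us + i * sample_interval_us) % period_us
        if t < user_us:
            hits += 1
    return hits / n_samples


# True split in this toy workload: 40% user, 60% kernel.
true_fraction = 4_000 / 10_000

# Sampler locked to the workload period (10 ms, i.e. a 100 Hz tick),
# always firing 5 ms into the period: every sample sees kernel mode.
aligned = user_fraction_sampled(10_000, 4_000, 10_000, 1_000, phase_us=5_000)

# Sampler with a slightly detuned interval: samples drift across the
# whole period and recover the true split.
detuned = user_fraction_sampled(10_000, 4_000, 10_007, 10_000)

print(true_fraction, aligned, detuned)
```

In this model the locked sampler reports 0% user time although the workload really spends 40% there, which is the kind of skew a tick-correlated workload could produce.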
<adgud>
would running the timer at a 1 nanosecond interval alleviate this and make the results more accurate?
<adgud>
or is it overkill, and would other problems arise?
<fche>
definitely other problems
<fche>
btw see also [man tapset::task_time]
<adgud>
yeah, timer.ms(1) gives me no user task time at all...
<fche>
that's more because the .ms() timers are invoked from kernel space software timers, so by definition show up as kernel-space events
<adgud>
oh well then my whole approach is completely flawed
<fche>
wouldn't go that far ... but it's not right to expect it to match the differently-measured numbers
<fche>
timer.profile is a good choice (or a finer-resolution perf.* type event)
<adgud>
but I can't collect CPU time distribution with it, and that was my goal
<adgud>
(I wanted to compare vmstat with stap and see if the results are different, that's why I've been asking those questions for the last couple of days)
<fche>
understood
<fche>
one thing to keep in mind is that vmstat etc. aren't gospel either - they also estimate / define those times with their own idiosyncrasies
<fche>
stap's profile/sample based one is also pretty well defined
<fche>
for processes that are mostly cpu bound, they should roughly match
<fche>
for processes that are highly kernel-interactive, they may not match. and actually those are interesting cases. the stap sampling one may even give more meaningful results
<fche>
assuming the sampling times are not correlated with the program behaviour
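The "not correlated" assumption can also be sketched with a toy model. Assuming samples arrive at times that are random with respect to the program (modelled here with a seeded pseudo-random generator, which is an assumption of this sketch and not how timer.profile fires), the sampled user fraction approaches the true one for a mostly CPU-bound workload and for one that switches between user and kernel mode far faster than a 100 Hz tick. The workload numbers and function name are hypothetical.

```python
import random


def sampled_user_fraction(period_us, user_us, n_samples, seed=0):
    """Estimate the user-mode fraction from uncorrelated samples.

    Hypothetical workload: it repeats every `period_us` microseconds and
    spends the first `user_us` of each period in user mode. Each sample
    lands at a uniformly random position within the period.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        if rng.uniform(0, period_us) < user_us:
            hits += 1
    return hits / n_samples


# Mostly CPU-bound: 9.5 ms of user work per 10 ms period (true: 95% user).
cpu_bound = sampled_user_fraction(10_000, 9_500, 20_000)

# Highly kernel-interactive: 60 us user, 90 us kernel, repeating every
# 150 us, far finer than a 10 ms tick (true: 40% user).
interactive = sampled_user_fraction(150, 60, 20_000)

print(f"cpu-bound   sampled user fraction: {cpu_bound:.3f} (true 0.950)")
print(f"interactive sampled user fraction: {interactive:.3f} (true 0.400)")
```

The estimate carries ordinary statistical noise that shrinks as the number of samples grows; it says nothing about how vmstat attributes the same time.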
<adgud>
oh I see, I expected differences but not that big; this is quite an explanation
<adgud>
that would totally explain (I think so, at least) why qemu+kvm make like 90% kernel ticks, even though qemu runs in userspace, am I correct?
<adgud>
qemu relays most of the stuff to kvm, which operates at kernel level
<fche>
kvm kernel side intercepts only certain privileged operations, like i/o and paging and some cpu debugging stuff
<fche>
most of it runs in userspace ideally
<fche>
but qemu/kvm may have events that correlate tightly with the host's normal profiling timer (such as having its own profiling timer)
<fche>
so sampling could lead to misleading results
<fche>
should probably try with different .hz type profiling/perf event sources
* fche
must head off now, good luck, see you tomorrow
<adgud>
see you, have a good day