fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
amerey has quit [Quit: Leaving]
<fche>
hey kerneltoast buddy, that patch makes this test case very happy, now running a larger suite
<kerneltoast>
ooh
<kerneltoast>
noice
<kerneltoast>
still need to check that it won't drop messages upon module unload
<kerneltoast>
via a partially filled subbuf
<fche>
well, if a module is being unloaded, nothing is listening for messages, so dropping them is fine
<kerneltoast>
the scenario i was thinking was filling the subbuf to the brim while the module is running, and then it's unloaded
<kerneltoast>
an easy test case would be a single small printf actually
<kerneltoast>
I'm not at my laptop right now but you can try that out
<kerneltoast>
the printf needs to print something smaller than the size of a subbuf
<kerneltoast>
basically, just see if a hello world stap module works
<fche>
it does
<kerneltoast>
noice
<kerneltoast>
to confirm, you're using my second paste, right?
<fche>
though this version appears to result in content being stuck in subbufs or whatnot, without waking up the userspace until later
<kerneltoast>
yeah that's hard to deal with
<fche>
so probe begin {log("hi")} ... doesn't print anything until a ^C or later
<fche>
we dealt with that before
<kerneltoast>
gotta deal with it differently now
<kerneltoast>
maybe with a timer
<fche>
isn't that __stp_relay_wakeup_timer ?
<kerneltoast>
(still not at laptop)
<kerneltoast>
a different approach would be to scan for empty subbufs
<kerneltoast>
instead of quitting after a single swap
<kerneltoast>
that would get rid of the pesky fudge aspect that'd come with having a timer enforce timely printing
<kerneltoast>
I'm not sure how that would affect cross-subbuf ordering though
<fche>
meaning cross-cpu absolute time ordering?
<kerneltoast>
no, i mean if you have a single print that gets fragmented across different subbufs
<fche>
those would still be sequential, surely
<kerneltoast>
i think relay keeps track of this by having the subbufs ordered
<fche>
yes
<kerneltoast>
so if we just printed half a message in one subbuf, then skipped a subbuf and printed the rest in yet another subbuf, something would go funky i suspect
<kerneltoast>
i think relay makes a subbuf unavailable after you swap it
<kerneltoast>
and frees it back up once userspace consumes it
<kerneltoast>
that must've exacerbated your test case
<kerneltoast>
because userspace has to catch up with the log spam
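The availability rule kerneltoast describes above (a swapped subbuf stays off limits until userspace consumes it) can be sketched as a small userspace model. This is only an illustration: `ring_model`, `ring_switch_subbuf` and `ring_consume` are hypothetical names, not relay or SystemTap functions, and the model tracks counters only, not data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one per-cpu ring of subbufs. */
struct ring_model {
    size_t n_subbufs;  /* total subbufs in the ring */
    size_t produced;   /* subbufs swapped out for the reader so far */
    size_t consumed;   /* subbufs the reader has released so far */
};

/* True when every subbuf is still waiting on the reader. */
static bool ring_full(const struct ring_model *r)
{
    assert(r->n_subbufs > 0);
    return r->produced - r->consumed >= r->n_subbufs;
}

/* Writer side: swap to the next subbuf, or fail if none is free. */
static bool ring_switch_subbuf(struct ring_model *r)
{
    if (ring_full(r))
        return false;  /* the reader has not caught up yet */
    r->produced++;
    return true;
}

/* Reader side: release the oldest swapped subbuf, if there is one. */
static bool ring_consume(struct ring_model *r)
{
    if (r->consumed == r->produced)
        return false;  /* nothing has been swapped out */
    r->consumed++;
    return true;
}
```

In this model a log-spamming writer runs out of subbufs as soon as it gets `n_subbufs` swaps ahead of the reader, which is the situation the test case above appears to hit.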
<kerneltoast>
this would also let us get rid of __stp_relay_wakeup_timer
<fche>
I thought the wake_up_interruptible* goo was just not safe to invoke from _write_commit (arbitrary probe context)
<fche>
that's why we bothered to have timers
<kerneltoast>
ah crap
<kerneltoast>
i fell for the classic blunder
<kerneltoast>
okay we can punt it onto __stp_relay_wakeup_timer
<kerneltoast>
and do the same thing to avoid per-cpu timers
<kerneltoast>
this will work better than per-cpu timers because it takes an arbitrary amount of time after the wakeup from the timer before the logger thread in userspace starts running
<kerneltoast>
with per-cpu timers if you fire too frequently, you might exhaust the subbufs again
<kerneltoast>
and telling how frequent is too frequent depends on the environment
<kerneltoast>
what if I'm using stap on my amd geode
orivej has quit [Ping timeout: 272 seconds]
mjw has quit [Ping timeout: 240 seconds]
derek0883 has joined #systemtap
mjw has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek088_ has joined #systemtap
derek0883 has quit [Ping timeout: 264 seconds]
mjw has quit [Quit: Leaving]
derek088_ has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
<fche>
kerneltoast, still around? still wondering about the timers
<kerneltoast>
yeah
<kerneltoast>
wazzap
<fche>
I'm trying to process your objection to per-cpu wakeup timers
<kerneltoast>
yes
<kerneltoast>
what's got you confused?
<kerneltoast>
other than me
<fche>
istm we'd want per-cpu wakeup timers
<fche>
ditch the central one
<fche>
just make one per cpu
<fche>
and it could implement the policy decision of how frequently to wake up userspace, at what levels of subbuf filledness
<fche>
i.e., rapidly if there are full subbufs (in the case of a data dump)
<fche>
or slower if there are nonempty nonfull subbufs (in the case of probe-begin dribble)
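The wakeup policy fche outlines above could look roughly like the following. It is a minimal sketch, not SystemTap code: the function name and the interval values are hypothetical, picked only to show the fast/slow split.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intervals, chosen only for illustration. */
enum {
    WAKE_NONE_MS = 0,    /* nothing buffered: no wakeup needed */
    WAKE_FAST_MS = 10,   /* full subbufs pending (data dump) */
    WAKE_SLOW_MS = 200   /* only a partly filled subbuf (probe-begin dribble) */
};

/* Pick the delay before the next userspace wakeup for one cpu's buffer. */
static unsigned next_wake_delay_ms(size_t full_subbufs, size_t partial_bytes)
{
    if (full_subbufs > 0)
        return WAKE_FAST_MS;
    if (partial_bytes > 0)
        return WAKE_SLOW_MS;
    return WAKE_NONE_MS;
}
```

A per-cpu timer would call something like this with local counters only; a single central timer would have to evaluate it once per cpu.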
<kerneltoast>
that might work but it'd be easy to break
<fche>
how?
<kerneltoast>
because the subbufs can be exhausted in the window between the timer deciding to wake staprun and staprun actually waking up and consuming
<kerneltoast>
the scenario i'm thinking of:
<kerneltoast>
1. your timer does the wake_up_interruptible() to tell staprun to start consuming
<kerneltoast>
2. some milliseconds go by and the subbufs get filled
<kerneltoast>
3. staprun is now awake and starts consuming
<kerneltoast>
and the "some milliseconds" varies depending on hardware speed
<fche>
yes, ok, that as opposed to what?
<kerneltoast>
my alternative:
<kerneltoast>
1. there's a print flush. a subbuf is partially filled but has lots of empty space. we don't swap the subbuf, but we still do wake_up_interruptible()
<kerneltoast>
2. the partially filled subbuf can keep getting filled until staprun wakes up and consumes
<kerneltoast>
3. staprun is now awake and starts consuming
derek0883 has quit [Remote host closed the connection]
<kerneltoast>
this will require some userspace cooperation though
<fche>
I thought we can't do a wake_up* from a print_flush for the same reason (general probe context)
<kerneltoast>
because when staprun wakes up, it will need to tell the kernel module "hey i'm here now, swap your partially filled subbuf"
<kerneltoast>
yes, so instead we do the wakeup from the relay timer
<fche>
yes - so how is that substantially different?
<fche>
except there being one timer vs. one per cpu ?
<kerneltoast>
the main difference: the kernel module is not swapping a partially filled subbuf on its own. instead it waits for staprun to tell it to swap a partially filled subbuf
<kerneltoast>
what you proposed was having per-cpu timers swap out the subbuf every X amount of time
<kerneltoast>
so that data won't linger in the subbuf
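kerneltoast's alternative (flush only requests a wakeup, the timer performs it, and the partial subbuf is swapped only when staprun asks) can be modelled in userspace as below. All names here are hypothetical and the model covers a single subbuf, so it is a sketch of the ordering, not of the real transport.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one partially filled subbuf. */
struct subbuf_model {
    size_t capacity;    /* bytes the subbuf can hold */
    size_t used;        /* bytes written since the last swap */
    bool wake_pending;  /* set by a flush, cleared by the timer */
};

/* Print flush: append data and request a wakeup, but do not swap. */
static bool model_print_flush(struct subbuf_model *s, size_t len)
{
    assert(s->used <= s->capacity);
    if (len > s->capacity - s->used)
        return false;  /* the model has no spare subbuf to move on to */
    s->used += len;
    s->wake_pending = true;
    return true;
}

/* Timer: the only place a wakeup is issued. Returns true if one is due. */
static bool model_wakeup_timer(struct subbuf_model *s)
{
    bool wake = s->wake_pending;
    s->wake_pending = false;
    return wake;
}

/* Reader, once awake: ask for the swap and take whatever has accumulated. */
static size_t model_reader_request_swap(struct subbuf_model *s)
{
    size_t n = s->used;
    s->used = 0;
    return n;
}
```

Data written between the timer's wakeup and the reader's swap request lands in the same subbuf and is picked up by that one swap, so the length of the delay does not cost extra subbufs in this model.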
<fche>
I don't care specifically about the swapping aspect - if userspace threads can cause that once they wake up, fine with me
<kerneltoast>
then you don't need percpu timers
<kerneltoast>
because you're not using them to swap
<kerneltoast>
swapping must occur on the cpu that owns the subbuf
<kerneltoast>
if all you're doing is waking the userspace threads then there's no reason to have percpu timers
<fche>
no locality advantage?
<kerneltoast>
locality advantage for what?
<fche>
instead of one thread that must scan N buffers and notify N userspace threads/fds
<fche>
(and must fault all that stuff across cpus)
<fche>
we could have N threads, one per cpu, looking at local data only
<kerneltoast>
i don't see how that's helpful when there's already shared data used in the print path
<kerneltoast>
and you don't need to scan N buffers
<fche>
is it? thought we were per-cpu quite a lot
<kerneltoast>
we still have the global lock
<kerneltoast>
to avoid racing with print unregister
<kerneltoast>
in the write commit function we can do this: cpumask_set_cpu(cpu, subbuf_flush_mask);
<kerneltoast>
and then have __stp_relay_wakeup_timer go through every cpu in the mask
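The flush-mask idea above can be tried out in userspace with a plain atomic bitmask standing in for the kernel cpumask. This is a sketch under that assumption: `flush_mask_model`, `model_write_commit` and `model_relay_wakeup_timer` are hypothetical names, and recording a cpu number stands in for waking that cpu's reader.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 64

/* Stand-in for subbuf_flush_mask: bit N set means cpu N has pending data. */
static _Atomic uint64_t flush_mask_model;

/* Write-commit path: mark this cpu as needing a wakeup. */
static void model_write_commit(unsigned cpu)
{
    assert(cpu < MODEL_MAX_CPUS);
    atomic_fetch_or(&flush_mask_model, UINT64_C(1) << cpu);
}

/*
 * Timer path: atomically take and clear the mask, then visit only the
 * cpus that were marked. Returns how many cpus were recorded in woken[].
 */
static unsigned model_relay_wakeup_timer(unsigned woken[MODEL_MAX_CPUS])
{
    uint64_t pending = atomic_exchange(&flush_mask_model, 0);
    unsigned n = 0;

    for (unsigned cpu = 0; cpu < MODEL_MAX_CPUS; cpu++) {
        if (pending & (UINT64_C(1) << cpu))
            woken[n++] = cpu;
    }
    return n;
}
```

The timer touches per-cpu state only for cpus whose bit is set, which is the point kerneltoast makes about not having to scan N buffers; the mask itself is still one shared word, which is the cross-cpu cost fche is asking about.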
orivej has joined #systemtap
<fche>
yeah and in case of pending data, fetch all that control stuff across numa/cpu
<kerneltoast>
but that already happens when printing
<kerneltoast>
via _stp_print_ctr
<fche>
ok that's one more global, as opposed to all the subbuf counter/etc. stuff
<fche>
anyway, I'm not saying the effect is bound to be large, but maybe some.