fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
amerey has quit [Quit: Leaving]
<fche> hey kerneltoast buddy, that patch makes this test case very happy, now running a larger suite
<kerneltoast> ooh
<kerneltoast> noice
<kerneltoast> still need to check that it won't drop messages upon module unload
<kerneltoast> via a partially filled subbuf
<fche> well, if a module is being unloaded, nothing is listening for messages, so dropping them is fine
<kerneltoast> the scenario i was thinking was filling the subbuf to the brim while the module is running, and then it's unloaded
<kerneltoast> an easy test case would be a single small printf actually
<kerneltoast> I'm not at my laptop right now but you can try that out
<kerneltoast> the printf needs to print something smaller than the size of a subbuf
<kerneltoast> basically, just see if a hello world stap module works
<fche> it does
<kerneltoast> noice
<kerneltoast> to confirm, you're using my second paste, right?
<kerneltoast> noice
<kerneltoast> that went better than expected
<fche> thoug
<fche> though this version appears to result in content being stuck in subbufs or whatnot, without waking up the userspace until later
<kerneltoast> yeah that's hard to deal with
<fche> so probe begin {log("hi")} ... doesn't print anything until a ^C or later
<fche> we dealt with that before
<kerneltoast> gotta deal with it differently now
<kerneltoast> maybe with a timer
<fche> isn't that _stp_relay_wakeup_timer ?
<kerneltoast> (still not at laptop)
<kerneltoast> a different approach would be to scan for empty subbufs
<kerneltoast> instead of quitting after a single swap
<kerneltoast> that would get rid of the pesky fudge aspect that'd come with having a timer enforce timely printing
<kerneltoast> I'm not sure how that would affect cross-subbuf ordering though
<fche> meaning cross-cpu absolute time ordering?
<kerneltoast> no, i mean if you have a single print that gets fragmented across different subbufs
<fche> those would still be sequential,surely
<kerneltoast> i think relay keeps track of this by having the subbufs ordered
<fche> yes
<kerneltoast> so if we just printed half a message in one subbuf, then skipped a subbuf and printed the rest in yet another subbuf, something would go funky i suspect
<kerneltoast> i think relay makes a subbuf unavailable after you swap it
<kerneltoast> and frees it back up once userspace consumes it
<kerneltoast> that must've exacerbated your test case
<kerneltoast> because userspace has to catch up with the log spam
<kerneltoast> fche, can you test the first patch i sent? this one: https://paste.centos.org/view/8da5796d
<kerneltoast> it doesn't suffer from the delay
<kerneltoast> I'm curious if it's good enough to pass your test
<fche> ok stand by
* kerneltoast gets behind blast shield
<fche> wonder why curl $URL | git apply is b0rked
<kerneltoast> might've been because i omitted those index lines git puts at the top
<fche> bad toast
<kerneltoast> otherwise i may reveal which old revision I'm using
<kerneltoast> and get exposed as a phony
<fche> ummm
<fche> ok that one works for 'probe begin {log("hi") } '
<fche> but
<fche> it seems to lose data or loop or something with the ruby bulk test
<fche> yeah no good my friend, this time stapio userspace is unkillable, looping
<fche> oh my
<fche> I'm going to have to do something
<kerneltoast> wowza
<kerneltoast> where did it all go wrong
hpt has joined #systemtap
<fche> maybe the twice-modifed size_request?
<fche> dunno
<kerneltoast> i think the first modification will never happen
<kerneltoast> wanna check if that if-statement gets hit?
<kerneltoast> if (unlikely(buf->offset == buf->chan->subbuf_size)) {
<kerneltoast> ^ that one
<fche> that sounds as though it will hit only if a subbuf magically turns out to be Exactly packed, not over?
<kerneltoast> yeah because the second modification makes sure we only ever exactly pack the subbuf
<kerneltoast> but i think it'll never happen because of the write commit function
<kerneltoast> subbuf is always swapped out after it's written to
<kerneltoast> so there's no way a lingering full subbuf could remain
<kerneltoast> in fact, each available subbuf is empty because of this
<kerneltoast> we never let a subbuf sit with data inside it
khaled has quit [Quit: Konversation terminated!]
<fche> alas, will think about it more tomorrow
<fche> the first patch was nicer, except for the not-enough-wakeups problem
<fche> it was stable & didn't lose data on that test
<kerneltoast> relaaaaayyyyyyy
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has quit [Ping timeout: 256 seconds]
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
orivej has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
orivej has quit [Ping timeout: 264 seconds]
derek0883 has joined #systemtap
orivej has joined #systemtap
derek0883 has quit [Remote host closed the connection]
khaled has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 264 seconds]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
hpt has quit [Ping timeout: 246 seconds]
mjw has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 240 seconds]
orivej has quit [Ping timeout: 272 seconds]
tromey has joined #systemtap
orivej has joined #systemtap
<fche> kerneltoast, hey
<fche> when you're back, wondering if you had any more thoughts about a hybrid between patch1 & 2
<kerneltoast> still thonking about it
<kerneltoast> it's a dilemma for sure
<fche> is it the __stp_relay_switch_subbuf() that flags the timer to wake up userspace?
<kerneltoast> yeah
<fche> can the timer invoke that itself, for a 'dirty' subbuf?
<kerneltoast> the timer needs to be on the same cpu to do so
<fche> hm, so why not have one there
<fche> a timer per cpu
<fche> running just infrequently, say once per second
<kerneltoast> it's fudgy
<kerneltoast> it would work but it's not pretty
<fche> hey I have to look at the mirror every morning
<fche> and I work too
<fche> and I'm not pretty
<kerneltoast> whether or not you work is still under heavy academic debate
<fche> OOF
<fche> big oof
<kerneltoast> :P
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 264 seconds]
amerey has joined #systemtap
<kerneltoast> we could let userspace handle subbuf swapping for subbufs which aren't full
<kerneltoast> if subbuf is full, we swap ourselves
<kerneltoast> if subbuf is partially full, we wake userspace and have userspace do the swap
<fche> whether userspace wakes up via its own timers
<fche> or whether a kernel timer wakes up periodically and wakes up userspace
<fche> doesn't seem to make much difference
<fche> still need one per cpu
<kerneltoast> yeah but it's less ghetto than waking up once per time quantum
<kerneltoast> since userspace needs to be woken up anyway
<fche> if the kernel timer checks frequentlyish, it does not need to wake up the userspace
<fche> if userspace timer needs to check frequentlyish, it's more costly in terms of context switches
<kerneltoast> lemme clarify: we still kick userspace from the kernel
<kerneltoast> but what we do is
<kerneltoast> always kick userspace from the write commit function
<kerneltoast> and only swap in the write commit function if the subbuf is full
<fche> so ISTM we should swap subbufs & kick userspace if EITHER (a) subbuf is full OR (b) on a timed basis even if the subbuf is not full
<kerneltoast> instead of a timed basis we can just use write commit
<fche> -every- write commit? don't we get the same problem from yesterday then? lost traffic due to excessive subbuf switching etc. ?
<kerneltoast> no because we'll only swap from write commit if the subbuf is full
<kerneltoast> if the subbuf isn't full, we leave it but wake up userspace
<fche> what can userspace do with a non-swapped non-full subbuf?
<kerneltoast> it can flush it
<kerneltoast> all printing is protected by disabled irqs
<fche> then we'd be storming userspace with wakeups -> flush requests
<kerneltoast> userspace won't read any incomplete prints
<kerneltoast> no it would be just as many wakeups as stap currently does
<kerneltoast> but we don't swap out the subbuf when we wakeup unless it is full
<fche> ok so again
<fche> if we have a stap probe that produces 10000 small prints per second per cpu
<fche> how many wakeups to userspace would that cause
<fche> how would userspace react
<kerneltoast> it would cause exactly the same number of wakeups as git stap
<kerneltoast> right now we have wakeups bonded together with subbuf swapping
<kerneltoast> i want to decouple them
<kerneltoast> lemme see if i can open up my laptop without waking the missus
<kerneltoast> the cat won't be screaming her into consciousness for another 50 min
<fche> don't tell me you're married to a cat
<fche> none of my business but
<kerneltoast> not married, i'm still a free range toast
<kerneltoast> our cat is an asshole and wakes us up at ~8 every day for food
<kerneltoast> this would also let us get rid of __stp_relay_wakeup_timer
<fche> I thought the wake_up_interruptible* goo was just not safe to invoke from _write_commit (arbitrary probe context)
<fche> that's why we bothered have timers
<kerneltoast> ah crap
<kerneltoast> i fell for the classic blunder
<kerneltoast> okay we can punt it onto __stp_relay_wakeup_timer
<kerneltoast> and do the same thing to avoid per-cpu timers
<kerneltoast> this will work better than per-cpu timers because it takes an arbitrary amount of time after the wakeup from the timer before the logger thread in userspace starts running
<kerneltoast> with per-cpu timers if you fire too frequently, you might exhaust the subbufs again
<kerneltoast> and telling how frequently that may be varies on the environment
<kerneltoast> what if I'm using stap on my amd geode
orivej has quit [Ping timeout: 272 seconds]
mjw has quit [Ping timeout: 240 seconds]
derek0883 has joined #systemtap
mjw has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek088_ has joined #systemtap
derek0883 has quit [Ping timeout: 264 seconds]
mjw has quit [Quit: Leaving]
derek088_ has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
irker485 has joined #systemtap
<irker485> systemtap: fche systemtap.git:master * release-4.4-56-g263f980e2 / NEWS: NEWS: correct arch names for recent tls code
derek0883 has quit [Read error: Connection reset by peer]
derek0883 has joined #systemtap
<irker485> systemtap: smakarov systemtap.git:master * release-4.4-57-g8a62fac08 / bpf-translate.cxx: stapbpf (for PR27030): bugfix the b71d20af bugfix
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
<fche> kerneltoast, still around? still wondering about the timers
<kerneltoast> yeah
<kerneltoast> wazzap
<fche> I'm trying to process your objection to per-cpu wakeup timers
<kerneltoast> yes
<kerneltoast> what's got you confused?
<kerneltoast> other than me
<fche> istm we'd want per-cpu wakeup timers
<fche> ditch the central one
<fche> just make one per cpu
<fche> and it could implement the policy decision of how frequently to wake up userspace, at what levels of subbuf filledness
<fche> i.e., rapidly if there are full subbufs (in the case of a data dump)
<fche> or slower if there are nonempty nonfull subbufs (in the case of probe-begin dribble)
<kerneltoast> that might work but it'd be easy to break
<fche> how?
<kerneltoast> because the subbufs can be exhausted in the time between the timer decides to wake staprun to consume and the time that staprun actually wakes up and consumes
<kerneltoast> the scenario i'm thinking of:
<kerneltoast> 2. some milliseconds go by and the subbufs get filled
<kerneltoast> 3. staprun is now awake and starts consuming
<kerneltoast> 1. your timer does the wake_up_interruptible() to tell staprun to start consuming
<kerneltoast> and the "some milliseconds" varies depending on hardware speed
<fche> yes, ok, that as opposed to what?
<kerneltoast> my alternative:
<kerneltoast> 2. the partially filled subbuf can keep getting filled until staprun wakes up and consumes
<kerneltoast> 1. there's a print flush. a subbuf is partially filled but has lots of empty space. we don't swap the subbuf, but we still do wake_up_interruptible()
<kerneltoast> 3. staprun is now awake and starts consuming
derek0883 has quit [Remote host closed the connection]
<kerneltoast> this will require some userspace cooperation though
<fche> I thought we can't do a wake_up* from a print_flush for the same reason (general probe context)
<kerneltoast> because when staprun wakes up, it will need to tell the kernel module "hey i'm here now, swap your partially filled subbuf"
<kerneltoast> yes so instead we do the wakeup it from the relay timer
<fche> yes - so how is that substantially different?
<fche> except there being one timer vs. one per cpu ?
<kerneltoast> the main difference: the kernel module is not swapping a partially filled subbuf on its own. instead it waits for staprun to tell it to swap a partially filled subbuf
<kerneltoast> what you proposed was having per-cpu timers swap out the subbuf every X amount of time
<kerneltoast> so that data won't linger in the subbuf
<fche> I don't care specifically about the swapping aspect - if userspace threads can cause that once they wake up, fine with me
<kerneltoast> then you don't need percpu timers
<kerneltoast> because you're not using them to swap
<kerneltoast> swapping must occur on the cpu that owns the subbuf
<kerneltoast> if all you're doing is waking the userspace threads then there's no reason to have percpu timers
<fche> no locality advantage?
<kerneltoast> locality advantage for what?
<fche> instead of one thread that must scan N buffers and notify N userspace threads/fds
<fche> (and must fault all that stuff across cpus)
<fche> we could have N threads, one per cpu, looking at local data only
<kerneltoast> i don't see how that's helpful when there's already shared data used in the print path
<kerneltoast> and you don't need to scan N buffers
<fche> is it? thought we were per-cpu quite a lot
<kerneltoast> we still have the global lock
<kerneltoast> to avoid racing with print unregister
<kerneltoast> in the write commit function we can do this: cpumask_set_cpu(cpu, subbuf_flush_mask);
<kerneltoast> and then have __stp_relay_wakeup_timer go through every cpu in the mask
orivej has joined #systemtap
<fche> yeah and in case of pending data, fetch all that control stuff across numa/cpu
<kerneltoast> but that already happens when printing
<kerneltoast> via _stp_print_ctr
<fche> ok that's one more global, as opposed to all the subbuf counter/etc. stuff
<fche> anyway, I'm not saying the effect is bound to be large, but maybe some.
<fche> ok
<fche> so if we were to go your way,
<fche> what would we have to do
<kerneltoast> that should cover it
<kerneltoast> high quality design document right there
<fche> needs more punctuation
<fche> and at least one capital letter
<kerneltoast> and needs to end in .doc
<fche> that's just too far
<fche> xls
<kerneltoast> we need a flowchart too
<kerneltoast> one of my professors in college didn't let us write any code until we made a flowchart
<kerneltoast> it was for the intro to C programming class
<kerneltoast> i suffered
<fche> wait, is this hypothetical in the sense that it requires More changes to staprun or whatnot?
<fche> "send a command ... swap ?"
<kerneltoast> yes it does
<kerneltoast> staprun is going to wake up and there won't be any subbufs to consume
<kerneltoast> so it needs to tell the kernel to give it something to consume
<kerneltoast> staprun needs to be the one to invoke __stp_relay_switch_subbuf
<fche> not crazy about having to extend the staprun|kernel abi
<fche> (we've had, for quite a long time, mutual version compatibility)
<kerneltoast> hmm
<fche> how about a per-cpu quasi timer thing
<fche> namely:
<fche> - whenever a write_commit is complete, write a local timestamp into a local var
<fche> - if the subbuf has become full, and we're switching, set the cpumask bit
<fche> - if the subbuf is NOT full, but it has not been switched (prev timestamp too old), switch anyway and set the cpumask bit
tromey has quit [Quit: ERC (IRC client for Emacs 27.1)]
<kerneltoast> that's no different than what you were thinking of before
<fche> well, it is, still only one timer Thread
<fche> and no staprun change
<kerneltoast> that still wastes a subbuf that can take more data in the time between the write commit and staprun waking up
<fche> that's ok, that second clause should trigger literally rarely
<kerneltoast> cooperation from staprun would allow maximum utilization of the subbufs
<fche> I can accept less than Maximum utilization
<fche> just the current git code appears to have much less than maximum
<kerneltoast> i dunno if that'll make your ruby test case pass
<kerneltoast> the patch you tested which lets non-full subbufs linger is really max utilization
<kerneltoast> can't get any better than that
<kerneltoast> another option could be tacking on a task worker to staprun
<kerneltoast> so that staprun code itself doesn't need to handle the cooperation
<kerneltoast> and instead the task worker can do it
derek0883 has joined #systemtap
<kerneltoast> the subbuf would get swapped right before the context switch to staprun code
<kerneltoast> and then staprun would stapconsume our stapbufs
<fche> can see why taht could be a way of eking out every last bit of storage, but don't think that's necessary
<kerneltoast> can you reduce the subbuf count a bit with my max utilization patch and see where the ruby tester breaks?
<kerneltoast> cut subbuf count in half, see what happens
<kerneltoast> if you can reduce the subbuf count very much then we'll need moar subbufs to cope with *not* eking out every last bit of storage
<kerneltoast> *if you can't reduce subbuf count very much
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
amerey has quit [Remote host closed the connection]