fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
<kerneltoast> because they use prints and then read the log buffer directly
<kerneltoast> it's hacky
<fche> yes it's hacky
<kerneltoast> but they're reentrant anyway because they require a context* pointer
<fche> and I thought you had a variant that prints into the stap context string directly rather than the regular print buffer
<fche> but even apart from that
<kerneltoast> no i never did
<fche> ok
hpt has joined #systemtap
<fche> but still that doesn't seem like an exception from normal reentrancy policy ("prevent reentrancy")
<fche> if a backtracing function messes with the print buffer, it should not matter as it's run from a non-reentrant (architecturally intended) context
<fche> so no one else should be able to start mucking with that particular buffer
<kerneltoast> yeah exactly
<kerneltoast> so we could skip the reentrancy lock for them
<kerneltoast> and make them work again
<fche> as long as -something- is reliably preventing reentrancy!
<kerneltoast> yeah, which the context is
<kerneltoast> unless it's not and that's why the probe lock soft lockup happens :p
<fche> ok
<fche> well, at least as a part of this discussion, we both got clear that reentrancy is not a desirable capability really
<fche> and I don't think it ever was
<fche> so if we can reliably exclude that possibility, and simplify logic as we go, that would be good
<kerneltoast> yeah I'd probably implement it by adding a context pointer argument to the lock functions
<kerneltoast> if you have a context pointer, you don't need the reentrancy protection we're selling
<fche> yeah, maybe something like that
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
_whitelogger has joined #systemtap
<kerneltoast> there is a way to check if we're in an NMI but only on x86...
<kerneltoast> who needs stap on arm anyway
derek0883 has quit [Remote host closed the connection]
khaled has quit [Quit: Konversation terminated!]
khaled has joined #systemtap
derek0883 has joined #systemtap
_whitelogger has joined #systemtap
<agentzh> kerneltoast: we need aarch64 at least :)
_whitelogger has joined #systemtap
_whitelogger has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
<kerneltoast> heh yeah
<kerneltoast> this is just very painful to get around without some way of telling if we're inside an NMI
<kerneltoast> factoring out the reentrancy used for prints would be very messy
<kerneltoast> and easy to break
<kerneltoast> i might try and check to see if only the perf tracepoint is used inside an NMI
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has quit [Ping timeout: 260 seconds]
derek0883 has quit [Remote host closed the connection]
hpt has quit [Ping timeout: 272 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 260 seconds]
mjw has joined #systemtap
<fche> I don't see why we need to check whether we're in an nmi, as opposed to checking whether a cpu-exclusive lock is already taken - i.e., reentrancy of any sort, softirq exception whatever
amerey has joined #systemtap
<kerneltoast> fche, because factoring out the reentrancy that already exists is really messy
<kerneltoast> I tried it yesterday
<kerneltoast> It starts off okay
<kerneltoast> Then it kind of explodes
<kerneltoast> especially with the _stp_vlog usage
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 264 seconds]
<kerneltoast> fche, i think i have a better idea
<kerneltoast> most prints are inside probes which have the context locked
<kerneltoast> there are a few outside of there for module exit n stuff
<kerneltoast> we can take the log's reentrancy lock if the context isn't held
<kerneltoast> and the prints outside of the probes are simple and don't need reentrancy
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
<fche> sounds plausible
derek0883 has quit [Ping timeout: 264 seconds]
derek0883 has joined #systemtap
<kerneltoast> fche, testsuite is running with my new idea
<kerneltoast> it depends on another patch i made, which i think you'll like
<kerneltoast> ouch, the preliminary patch failed quickly
<kerneltoast> percpu: allocation failed, size=6768 align=32 atomic=0, alloc from reserved chunk failed
<kerneltoast> fche, i want to check if the runtime context is held, but _stp_runtime_get_context() can return NULL. if the context is already held, is _stp_runtime_get_context() guaranteed to return non-NULL?
<kerneltoast> my assumption is no
<fche> if it's already held, it should return NULL
<fche> it should preclude reentrancy
<fche> that's part of its purpose.
<kerneltoast> no that's a different function
<kerneltoast> all i want to do is read &c->busy
<kerneltoast> i don't want to modify it
<fche> why not call _stp_runtime_entryfn_get_context
<kerneltoast> because then we'll modify the runtime context state and potentially block a probe inside an IRQ from working
<kerneltoast> i can fixup my preliminary patch a bit
<fche> entryfn_get_context should do that already
<kerneltoast> yeah i mean i don't want to block anything by grabbing the runtime context myself
<fche> it's okay to do so, saves you from a TOCTOU anyway
<kerneltoast> but what if i grab it and then before i release it, an NMI comes flying outta nowhere and tries to run some probes
<kerneltoast> TOCTOU isn't an issue here i think because this is all running on the same CPU
<fche> fine, the nmi ones will fail, no problem.
<kerneltoast> haha
<kerneltoast> but it can be fixed!
<kerneltoast> i can make my first patch better
<kerneltoast> with the 'ol read_trylock
<kerneltoast> whaddya say?
<fche> well lemme see the patches.
<kerneltoast> ok lemme write it up
<fche> de-rcu-ification of the context stuff is good in general, /me is eager
<fche> yeah that first patch looks good
<fche> and I don't see a reason why nmi- or whatever code wouldn't use the entryfn_get/put_context pair to not just check but reserve the context for the duration of some critical section
<fche> BY THE WAY
<fche> the context could also be a place to put a print buffewr
<fche> just saying
<kerneltoast> yeah the only exception to the context being held for NMI stuff is if there's a print occuring outside the context protection, and then an NMI strikes and corrupts the log data
<kerneltoast> there are some prints in stap that occur without the context being held
<kerneltoast> like in systemtap_module_exit
<fche> ok, that's super late, but yeah we could grab a context there too
<kerneltoast> i suppose i could leave in the reentrancy lock as a backup measure then
<fche> if we use the _get_context gadget, it can BE the reentrancy lock
<kerneltoast> but i dunno about all the weird places a print could occur
<kerneltoast> also, what if we try to acquire a context in systemtap_module_exit and it fails?
<kerneltoast> do we spin?
<fche> then we don't print stuff
<kerneltoast> ah
<fche> but by that time I think the probes are shut down or shutting down, in practice it probably can't happen
<kerneltoast> do you know of any other prints that occur outside of a probe?
<fche> not off the top of my head
<kerneltoast> that's my worry :)
<fche> but we could let the compiler find them for us ... changing the printing api to take a context* as a parameter
<kerneltoast> i thought about that too and worried that we'd break stap scripts
<fche> how?
<kerneltoast> wouldn't all the prints inside stap scripts need to have the context passed as an argument?
<fche> stap scripts don't know about contexts
<fche> embedded-c code inside stap scripts can't call arbitrary runtime functions generally
<fche> there is a STAP_PRINT macro, which we could arrange to get the c* propagated.
<kerneltoast> so i guess we'd break some embedded C
<fche> I think probably not
<kerneltoast> fche, i'm pumping this rcu cleanup through the testsuite: https://gist.github.com/kerneltoast/89935e70335a200e51c2033b044a32b3
<kerneltoast> refactoring the print api to take a context pointer will take a bit of time
<fche> I am not sure we need that stp_context_lock / context_stop doodaad
<fche> it's probably harmless
<kerneltoast> if we don't need it then RCU was never needed
<fche> that stp_runtime_context_free bit is only called when the system knows that no more probes are running, IIRC
<fche> could be that RCU was always unnecessary in this particular context
<fche> er
<kerneltoast> * _stp_ctl_work_callback may still be running and looking for contexts.
<kerneltoast> /* We should be free of all probes by this time, but for example the timer for
<fche> excuse the context context pun
<kerneltoast> that's the comment on top of _stp_runtime_contexts_free
<fche> hmmmmmmm
<fche> I'd think we free context way way down the line, when probes and timers and tracepoints are all already cleansed
<kerneltoast> the timer in question is stopped via a file write from userspace
<kerneltoast> but yeah if you wanna YOLO it and nuke the synchronization we can do that
<kerneltoast> i can also test it from my end with printks to see if the lock doodaad is ever contented
<agentzh> fche: ah, i just found stap does not support the perl regex syntax...pity.
derek0883 has quit [Remote host closed the connection]
<fche> I'm not opposed to it as a belt-and-suspenders measure, but generally I'd expect it not to fire like that
derek0883 has joined #systemtap
mjw has quit [Quit: Leaving]
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
orivej has joined #systemtap
amerey has quit [Quit: Leaving]
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 246 seconds]
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 264 seconds]
thibaultcha is now known as chasum
derek0883 has joined #systemtap