fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
khaled has quit [Quit: Konversation terminated!]
_whitelogger has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 246 seconds]
mjw has quit [Read error: Connection reset by peer]
mjw has joined #systemtap
mjw has quit [Remote host closed the connection]
hpt has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 240 seconds]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
wcohen has quit [Ping timeout: 244 seconds]
wcohen has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
tonyj has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has quit [Ping timeout: 265 seconds]
orivej has joined #systemtap
khaled has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 244 seconds]
orivej has quit [Ping timeout: 260 seconds]
lijunlong has quit [Ping timeout: 265 seconds]
lijunlong has joined #systemtap
fLiPr3VeRsE has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 240 seconds]
hpt has quit [Ping timeout: 260 seconds]
orivej has joined #systemtap
fLiPr3VeRsE has joined #systemtap
amerey has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
tonyj has joined #systemtap
derek0883 has joined #systemtap
irker972 has joined #systemtap
<irker972>
systemtap: fche systemtap.git:azhang/pr13838 * release-4.3-89-g995846f21 / tapset/floatingpoint.stp: PR13838: drop unneeded formatting #defines from floatingpoint.stp
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
tromey has joined #systemtap
<kerneltoast>
good morning fche. i think i've got to the bottom of some of the issues agentzh and i are facing, and i wanted to ask you a bit about what we've found
<fche>
wow
<kerneltoast>
it only took a few weeks :D
<kerneltoast>
so, i noticed that in every reporting pass, every engine attached to a process is iterated over, and the callback for each engine is run via start_callback()
<kerneltoast>
start_callback() takes a pointer to a report struct
<kerneltoast>
everywhere start_callback() is used in a loop on every engine, it is passed a pointer to a single on-stack report struct
<kerneltoast>
a prime example is in utrace_resume()
<kerneltoast>
a single report struct allocated on the stack is used for every engine's callback, but this causes some weird behavior
<kerneltoast>
each callback modifies report->action to indicate what it wants the target to do
<kerneltoast>
so what ends up happening is that each subsequent engine callback overwrites report->action with what it wants
<kerneltoast>
and the engine at the tail of the list gets the final say on what report->action is
<kerneltoast>
the problem is that not all engines agree on what they want the target to do
<kerneltoast>
some engines may want UTRACE_INTERRUPT, others may want UTRACE_STOP
<fche>
(keep going: dude, we should attach this whole observation set as a block comment into stp_utrace.c!!! please)
<kerneltoast>
when an engine is first attached in stap_start_task_finder(), the action it requests by default is UTRACE_STOP
<kerneltoast>
the engine will then request UTRACE_INTERRUPT, but things get hairy when a reporting pass occurs before that
<kerneltoast>
when a brand new engine is attached, and a reporting pass occurs before it can request UTRACE_INTERRUPT, it will end up requesting UTRACE_STOP for the target process during the reporting pass
<kerneltoast>
if this new engine is not at the tail of the attached list, it will just be ignored
<kerneltoast>
but when such an engine is at the tail of the list, its UTRACE_STOP request will go through
<fche>
(ISTR the kernel utrace had some sort of algebra system to combine the various engines' UTRACE_* judgements)
<kerneltoast>
and the target process will enter utrace_stop(), which it will never exit until it receives a SIGKILL
<fche>
(maybe the stp_utrace emulation doesn't do that part the same way as the original)
<kerneltoast>
so there are three issues here i think
<kerneltoast>
1. the last engine in the attached list decides what action the target process will take, because report->action gets overwritten
<kerneltoast>
2. the default action for an attached engine is UTRACE_STOP. i don't think we want the target process to stop when an engine gets attached to it
<kerneltoast>
3. the UTRACE_STOP state currently has no natural way of being exited. I think UTRACE_STOP as a whole is an incomplete feature, and i'm not sure what needs it
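Issue 1 above can be reproduced in a few lines of standalone C. This is only a minimal sketch of the pattern as described in the conversation, not code from stp_utrace.c: the `lw_` types, the enum values, and `lw_report_pass()` are all hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the UTRACE_* actions; names and values
 * are assumptions for this sketch, not the real definitions. */
enum lw_action { LW_STOP, LW_INTERRUPT, LW_RESUME };

/* Hypothetical report struct: one field that every callback writes. */
struct lw_report { enum lw_action action; };

/* Hypothetical engine: it only remembers what it wants the target to do. */
struct lw_engine { enum lw_action wanted; };

/* Each callback stores its own request, discarding the previous one. */
static void lw_callback(const struct lw_engine *e, struct lw_report *r)
{
    r->action = e->wanted;
}

/* A single on-stack report is shared by every engine's callback,
 * so whichever engine runs last decides the final action. */
static enum lw_action lw_report_pass(const struct lw_engine *engines, size_t n)
{
    struct lw_report report = { LW_RESUME };
    size_t i;

    assert(engines != NULL || n == 0);
    for (i = 0; i < n; i++)
        lw_callback(&engines[i], &report);
    return report.action;
}
```

With engines wanting {INTERRUPT, STOP} the pass returns STOP, and with {STOP, INTERRUPT} it returns INTERRUPT: the outcome depends only on list order, which matches the "tail of the list gets the final say" behavior described above.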
<fche>
UTRACE_INTERRUPT instead?
<kerneltoast>
either that, or we have the engine do nothing in the reporting passes until it actually requests something
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
<fche>
kerneltoast,
<fche>
er we just kind of stopped there
<kerneltoast>
i'm not sure what to do
<fche>
aha
<kerneltoast>
there are a lot of different ways to address these
<kerneltoast>
and i have no idea which would be preferred :)
<fche>
aha
<fche>
re. 2, yeah I'm pretty sure a default of UTRACE_STOP is counterproductive
<fche>
lemme check ancient rhel6 original-utrace code
<kerneltoast>
UTRACE_STOP is also used in __stp_utrace_attach_match_filename()
<kerneltoast>
not sure why
<kerneltoast>
i also tried going through the git history to find the motivation for these things, but all the old commits are just big code drops
<kerneltoast>
from that original patch: "An attached engine does nothing by default."
<kerneltoast>
so we should ignore newly attached engines in the reporting passes, until they request something, at least to follow the original intention
<kerneltoast>
that can be achieved with just another bit in the flags field of the engine
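One way the flag-bit idea could look, as a rough sketch only. `FLG_ENGINE_REQUESTED`, `struct flg_engine`, `flg_request()` and `flg_report_pass()` are hypothetical names invented here, and the real engine struct and reporting loop may differ considerably.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit: set once the engine has explicitly asked
 * for an action. Not an existing stp_utrace.c flag. */
#define FLG_ENGINE_REQUESTED (1UL << 0)

/* Hypothetical engine with a flags field and a requested action. */
struct flg_engine {
    unsigned long flags;
    int wanted;
};

/* Record an explicit request and mark the engine as having spoken. */
static void flg_request(struct flg_engine *e, int action)
{
    e->wanted = action;
    e->flags |= FLG_ENGINE_REQUESTED;
}

/* Reporting pass that ignores engines which have not requested
 * anything yet; 'fallback' is returned if no engine has. */
static int flg_report_pass(const struct flg_engine *engines, size_t n,
                           int fallback)
{
    int action = fallback;
    size_t i;

    assert(engines != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!(engines[i].flags & FLG_ENGINE_REQUESTED))
            continue; /* freshly attached engine: does nothing by default */
        action = engines[i].wanted;
    }
    return action;
}
```

This only addresses the default-action problem. Engines that have made a request still overwrite each other in list order, so it would need to be paired with some rule for combining conflicting requests.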
<fche>
they may not be able to request something unless they are reported to, not sure
<kerneltoast>
ah
<fche>
anyway
<fche>
it may be worth reading through that for more background info before deciding on a suggestion
<kerneltoast>
yeah. any idea where to find the old utrace flag algebra you mentioned?
<fche>
been trying to find that with a quick glance, lemme try harder :)
<fche>
it might be as simple as ... "the lowest number enum-value wins"
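A "lowest enum value wins" rule could be sketched as below. This is a guess at the shape of such a rule, not the original utrace code: the `min_` names are hypothetical, and the ordering STOP < INTERRUPT < RESUME is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action enum; the numeric ordering (most intrusive
 * action has the lowest value) is an assumption of this sketch. */
enum min_action { MIN_STOP = 0, MIN_INTERRUPT = 1, MIN_RESUME = 2 };

/* Combine two requests: the numerically lower one wins. */
static enum min_action min_combine(enum min_action a, enum min_action b)
{
    return a < b ? a : b;
}

/* Fold every engine's request into one result instead of letting
 * the last engine overwrite the others. */
static enum min_action min_report_pass(const enum min_action *wanted, size_t n)
{
    enum min_action result = MIN_RESUME; /* least intrusive starting point */
    size_t i;

    assert(wanted != NULL || n == 0);
    for (i = 0; i < n; i++)
        result = min_combine(result, wanted[i]);
    return result;
}
```

Under this rule the result no longer depends on list order. Note that, with the assumed ordering, a single engine asking for STOP would always win regardless of its position, so a default of UTRACE_STOP for newly attached engines would become more visible rather than less.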