fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
<fche>
in your cpumask variant of the patch, is it proper for cpumask_copy & cpumask_clear to come in that sequence?
<fche>
as opposed to clear first?
<kerneltoast>
yeah, if we clear it first then we'll just see nothing
<fche>
oh wait
<fche>
you're clearing the input not the output, got it
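The ordering discussed above (copy the input mask to the output, then clear the input) can be sketched in userspace C. This is only an illustration: `mask_t`, `mask_set`, `mask_test` and `mask_take` are hypothetical names, a single 64-bit word stands in for a kernel cpumask, and the sketch assumes the caller already holds whatever protects the pending mask.

```c
#include <stdint.h>

/* Hypothetical stand-in for a kernel cpumask: one bit per CPU, CPUs 0..63 only. */
typedef struct { uint64_t bits; } mask_t;

static void mask_set(mask_t *m, unsigned cpu)
{
    m->bits |= UINT64_C(1) << cpu;
}

static int mask_test(const mask_t *m, unsigned cpu)
{
    return (int)((m->bits >> cpu) & 1u);
}

/*
 * Take a private snapshot of the pending set, then empty the pending set.
 * Clearing first would leave nothing to copy, which is the point made in
 * the conversation. Not atomic: assumes the caller serializes access.
 */
static mask_t mask_take(mask_t *pending)
{
    mask_t snapshot = *pending; /* analogous to cpumask_copy(&snapshot, pending) */
    pending->bits = 0;          /* analogous to cpumask_clear(pending) */
    return snapshot;
}
```

The snapshot is what the flusher would then walk, while new CPUs can mark themselves in the emptied pending mask.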
<kerneltoast>
fche, so I'm guessing after reading the bug report you don't think we can dodge the mutex?
<agentzh>
kerneltoast: it'll be great if you can redo the stress tests using the sample .stp script in that bugzilla ticket with your latest patch to make sure it is still working fine.
<kerneltoast>
hmm but the inode mutex hasn't been removed
<kerneltoast>
but i'll try that .stp anyway
<agentzh>
i know. just to make sure it has no other side effects in such extreme cases.
<agentzh>
like a specific stress test case.
<agentzh>
thanks
<kerneltoast>
agentzh, do you know where stap does the relayfs read in userspace?
<agentzh>
in staprun
<kerneltoast>
k
<agentzh>
iirc, staprun/mainloop.c
<fche>
kerneltoast, yeah
<agentzh>
func stp_main_loop()
<agentzh>
kerneltoast: and also in staprun/relay.c
<agentzh>
in the reader_thread() func
<agentzh>
the latter is more related i think.
<kerneltoast>
lol we don't need the inode mutex
<kerneltoast>
we just need to disable irqs on the local cpu when flushing
<kerneltoast>
i am going to cry :)
<kerneltoast>
i should've looked at relayfs earlier...
<agentzh>
wow
<kerneltoast>
staprun properly pins a reader thread to each cpu
<kerneltoast>
and only has that thread read that cpu's buffer
<agentzh>
there are two different modes?
<kerneltoast>
relayfs stores a buffer per cpu
<agentzh>
one pinned, one not?
<agentzh>
just my vague impression.
<kerneltoast>
there are two modes in stap for some reason actually
<kerneltoast>
STP_BULKMODE is pinned mode
<agentzh>
yeah
<agentzh>
BULKMODE is not the default.
<agentzh>
and we don't use it for simplicity.
<agentzh>
because it requires an extra stap-merge step to collect the per-cpu output files.
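A rough sketch of what a post-processing merge over per-cpu output could look like. The chat does not describe the actual stap-merge record format, so this assumes (purely for illustration) that each record carries a globally increasing sequence number and that each per-cpu stream is already ordered. `struct record`, `struct stream` and `merge_next` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: an assumed global sequence number plus its payload. */
struct record {
    uint32_t seq;
    const char *text;
};

/* One per-cpu stream, assumed to be sorted by seq within itself. */
struct stream {
    const struct record *recs;
    size_t len;
    size_t pos; /* index of the next unread record */
};

/*
 * Return the unread record with the smallest seq across all streams and
 * advance that stream. Returns NULL once every stream is drained.
 */
static const struct record *merge_next(struct stream *streams, size_t nstreams)
{
    struct stream *best = NULL;

    for (size_t i = 0; i < nstreams; i++) {
        struct stream *s = &streams[i];
        if (s->pos == s->len)
            continue; /* this cpu's stream is exhausted */
        if (best == NULL || s->recs[s->pos].seq < best->recs[best->pos].seq)
            best = s;
    }
    return best ? &best->recs[best->pos++] : NULL;
}
```

Calling `merge_next` in a loop yields the records in sequence order. The linear scan over stream heads is fine for a handful of CPUs; a heap would be the usual choice for many streams.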
<kerneltoast>
so BULKMODE would need to be the default and then the print flush function should have its own local irq save
<kerneltoast>
that is the alternative.
<agentzh>
it'll be a tough call for fche :)
<kerneltoast>
non-bulkmode is a clear abuse of relayfs
<agentzh>
since it seems to break backward compatibility.
<agentzh>
or you have ways to do it otherwise?
<kerneltoast>
which backward compatibility are you thinking of?
<kerneltoast>
bulkmode doesn't work on old kernels?
<agentzh>
bulkmode requires stap-merge to post-process the output.
<agentzh>
iirc
<agentzh>
it changes how users would normally use stap.
<kerneltoast>
the ALTERNATIVE alternative would be to have a single buffer for all CPUs protected by a single lock. not so great
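For comparison, a minimal userspace sketch of the "single buffer for all CPUs protected by a single lock" idea. In the kernel this would presumably be a spinlock taken with irqs disabled; here a C11 `atomic_flag` spinlock stands in for it. `struct shared_buf`, `shared_buf_init` and `shared_buf_write` are hypothetical names, not anything in systemtap.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SHARED_BUF_SIZE 256

/* One buffer shared by every writer, guarded by one spinlock. */
struct shared_buf {
    atomic_flag lock;
    size_t used;
    char data[SHARED_BUF_SIZE];
};

static void shared_buf_init(struct shared_buf *b)
{
    atomic_flag_clear(&b->lock);
    b->used = 0;
}

/*
 * Append len bytes as one unit. Returns 0 on success, or -1 if the record
 * does not fit (it is dropped whole rather than truncated).
 */
static int shared_buf_write(struct shared_buf *b, const char *src, size_t len)
{
    int rc = -1;

    /* Every writer on every CPU serializes here. */
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ; /* spin until the lock is free */

    if (len <= SHARED_BUF_SIZE - b->used) {
        memcpy(b->data + b->used, src, len);
        b->used += len;
        rc = 0;
    }

    atomic_flag_clear_explicit(&b->lock, memory_order_release);
    return rc;
}
```

Output from all CPUs lands in arrival order with no merge step, but every print contends on the same lock, which is the cost being objected to.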
<agentzh>
sounds tricky.
<agentzh>
the non-bulk mode indeed may overload cpu 0.
<agentzh>
i also complained about it in the past.
<kerneltoast>
we're technically doing the output merging already
<kerneltoast>
with this dance that my patch does
<agentzh>
the current default way is not pretty indeed.
<agentzh>
hopefully we can have something better.
<fche>
hmmmm
<agentzh>
and without forcing the user to always use stap-merge.
<fche>
now that I think about it (again), yeah it's weird that the kernel->user isn't the normal 1-buffer-per-cpu thing
<fche>
and then let userspace merge or not merge (depending on -b)
<kerneltoast>
i guess if the user wants non-bulkmode, we'd have a single buffer for all cpus
<fche>
at some point
<fche>
but that point does not have to be at the relayfs level
<fche>
it can be at the staprun/stapio STDOUT level
<kerneltoast>
it does need to be at the relayfs level. relayfs is designed to use per-cpu buffers
<kerneltoast>
non-bulkmode abuses relayfs by reading per-cpu data from different cpus
<kerneltoast>
which is why we end up needing the inode mutex
<fche>
yes I understand
<fche>
and I agree it doesn't smell right
<fche>
my point is we can emulate non-bulk mode by making stapio/staprun still receive all the per-cpu buffers
<fche>
but merge them (well, pipe them to stdout as fast as possible, probably on a per-line buffered basis)
<kerneltoast>
how would we do that though
<fche>
well we already have N threads in stap* reading the buffers
<fche>
the trace$N files
<fche>
instead of writing to a separate file, they can each write to stdout in non "-b" mode
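One way the per-line idea could look in userspace: each per-cpu reader keeps its own carry buffer and only forwards complete lines, so output from different CPUs never interleaves mid-line. This is a sketch, not staprun code. `struct line_carry`, `feed_chunk`, `struct sink` and `sink_emit` are hypothetical names, and a real version would write to stdout under a lock instead of into a string.

```c
#include <stddef.h>
#include <string.h>

#define CARRY_MAX 1024

/* Per-reader state: bytes of a trailing partial line not yet forwarded. */
struct line_carry {
    char buf[CARRY_MAX];
    size_t len;
};

typedef void (*emit_fn)(const char *line, size_t len, void *ctx);

/*
 * Feed one chunk read from a per-cpu buffer. Complete lines are emitted
 * immediately; a trailing partial line is kept until its newline arrives.
 * An over-long line is flushed in CARRY_MAX-sized pieces.
 */
static void feed_chunk(struct line_carry *c, const char *chunk, size_t n,
                       emit_fn emit, void *ctx)
{
    for (size_t i = 0; i < n; i++) {
        c->buf[c->len++] = chunk[i];
        if (chunk[i] == '\n' || c->len == CARRY_MAX) {
            emit(c->buf, c->len, ctx);
            c->len = 0;
        }
    }
}

/* Example sink that collects output in memory (stand-in for stdout). */
struct sink {
    char text[256];
    size_t len;
};

static void sink_emit(const char *line, size_t len, void *ctx)
{
    struct sink *s = ctx;
    size_t room = sizeof s->text - 1 - s->len;

    if (len > room)
        len = room; /* truncate rather than overflow the demo buffer */
    memcpy(s->text + s->len, line, len);
    s->len += len;
    s->text[s->len] = '\0';
}
```

With one `struct line_carry` per reader thread, a half-written line from one CPU stays in that reader's carry buffer while other readers' full lines go out.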
<kerneltoast>
the N threads are only pinned to each cpu in bulkmode though
<kerneltoast>
err are there really N threads in non-bulkmode?
<kerneltoast>
because there's this check inside relay.c:
<kerneltoast>
_err("This is inconsistent! Please file a bug report. Exiting now.\n");
<kerneltoast>
return -1;