fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
<agentzh>
re: "that is how any pent-up messages from a probe-end / probe-error would end up being brought to staprun userspace attention"
<agentzh>
fche: will you elaborate here?
<agentzh>
it seems relay_close() will also defer the deallocation of the bufs when ref count reaches zero.
<agentzh>
and it also calls 'irq_work_sync(&buf->wakeup_work);' to wake up readers like relay_flush().
<fche>
I don't have any deep insight into this code. That's the only place where relay_flush() is being called, so maybe it doesn't actually fulfill an important role, seeing that staprun works fine long before shutdown.
<agentzh>
after reading a lot of code in relay.c, i start to wonder if relay_flush() is necessary in the first place. but the race condition in relay_flush() is still something worth fixing on the kernel side.
<fche>
if we don't suffer data loss (such as truncation of output from a probe-end message), we probably don't need the flush
<agentzh>
i'm more worried about the race in relay_switch_subbuf(). it is called in other contexts too, not just relay_flush().
<agentzh>
it is not protected by mutex_lock(&relay_channels_mutex);
<fche>
would expect a sleepy lock like mutex_lock to be unsafe from the atomic contexts stap probes run in
<agentzh>
hmm, it makes sense.
<agentzh>
do you have any test cases which may actually need the relay_flush()?
<agentzh>
i don't mind extreme ones :)
<agentzh>
btw, what exactly do you mean by "probe-end message"?
<agentzh>
is it "probe end {}"?
<fche>
yes
<fche>
(= probe oneshot as in your case)
<fche>
if it works reliably without the flush, let's do away with the flush.
<agentzh>
probe oneshot is probe begin { ... exit() }, right?
<fche>
yes.
<agentzh>
so i should test probe end instead, huh?
<fche>
same thing, doesn't matter
derek0883 has quit [Remote host closed the connection]
<agentzh>
k
<agentzh>
i'll emit a lot of output in such contexts.
derek0883 has joined #systemtap
<fche>
hm actually it's not exactly the same thing
<fche>
might as well test both
<agentzh>
sure
<fche>
probe begin runs early; probe end runs ... at shutdown
<agentzh>
something like probe end { for (i = 0; i < N; i++) println("...") } ?
<fche>
yeah
<agentzh>
okay, working on it.
<fche>
or a hexdump printf("%*M",length,addr) kind of thing
derek0883 has quit [Read error: Connection reset by peer]
derek0883 has joined #systemtap
derek088_ has joined #systemtap
derek0883 has quit [Remote host closed the connection]
<agentzh>
fche: tried printing a total of 526KB of strings in a single probe end {}, running the ko with 20 threads and checking the output file size. so far so good (260+ seconds now).
<agentzh>
(without relay_flush).
<fche>
nice
<agentzh>
when using relay_flush, the output file size test is actually failing due to the extra \0 bytes.
<agentzh>
but that's another story.
<agentzh>
will keep it running and try the same test on centos 7 and fedora 30.
<agentzh>
currently i'm on fedora 28.
<agentzh>
and then i'll try generating output in probe oneshot directly.
<fche>
probe oneshot = probe begin
<fche>
so a probe oneshot {} probe end {log("something else")} should be a decent test of both
_whitelogger has joined #systemtap
higgins has quit [Quit: Leaving]
higgins has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
irker726 has quit [Quit: transmission timeout]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has joined #systemtap
<agentzh>
fche: okay
orivej has quit [Ping timeout: 258 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 258 seconds]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
khaled has joined #systemtap
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
rongzhj` has joined #systemtap
rongzhj` has quit [Ping timeout: 256 seconds]
rongzhj` has joined #systemtap
sscox has joined #systemtap
orivej has quit [Ping timeout: 260 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 265 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 256 seconds]
orivej_ has joined #systemtap
orivej_ has quit [Ping timeout: 260 seconds]
orivej has joined #systemtap
<fche>
agentzh, wow, nice work dude.
rongzhj`` has joined #systemtap
rongzhj` has quit [Ping timeout: 260 seconds]
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej has quit [Ping timeout: 258 seconds]
orivej has joined #systemtap
hpt has quit [Ping timeout: 264 seconds]
orivej has quit [Ping timeout: 265 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 246 seconds]
orivej has joined #systemtap
mjw has joined #systemtap
orivej has quit [Ping timeout: 246 seconds]
orivej has joined #systemtap
rongzhj`` has quit [Ping timeout: 258 seconds]
orivej has quit [Ping timeout: 246 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
orivej_ has joined #systemtap
sscox has quit [Ping timeout: 246 seconds]
Amy1 has quit [Ping timeout: 256 seconds]
Amy1 has joined #systemtap
irker513 has joined #systemtap
<irker513>
systemtap: smakarov systemtap.git:master * release-4.3-9-gdb91f0291 / testsuite/systemtap.bpf/bpf.exp: bpf.exp: tentative fix for bigmap1.stp hang on RHEL8
orivej_ has quit [Ping timeout: 258 seconds]
orivej has joined #systemtap
sapatel has joined #systemtap
sscox has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
orivej_ has joined #systemtap
orivej_ has quit [Ping timeout: 260 seconds]
tromey has joined #systemtap
orivej has joined #systemtap
khaled has quit [Ping timeout: 264 seconds]
<irker513>
systemtap: wcohen systemtap.git:master * release-4.3-10-g2b2b6a622 / testsuite/systemtap.examples/general/sizeof.stp: Fix sizeof.stp to explicitly use kernel debuginfo if one not specified
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
sscox has quit [Ping timeout: 244 seconds]
khaled has joined #systemtap
derek0883 has joined #systemtap
<irker513>
systemtap: wcohen systemtap.git:master * release-4.3-11-g717b7dddd / testsuite/systemtap.examples/lwtools/fslatency-nd.stp testsuite/systemtap.examples/lwtools/fsslower-nd.stp: Use explicit @cast() operators to fslatency-nd.stp and fsslower-nd.stp
<irker513>
systemtap: wcohen systemtap.git:master * release-4.3-12-g7a28529f6 / : Remove the unneeded test_support check the lwtools meta info
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
<irker513>
systemtap: wcohen systemtap.git:master * release-4.3-13-g9eb37102d / testsuite/systemtap.examples/process/pfiles.stp testsuite/systemtap.examples/profiling/ioctl_handler.stp: Use explicit @cast() operators for pfiles.stp and ioctl_handler.stp
<irker513>
systemtap: wcohen systemtap.git:master * release-4.3-14-g3040d4e8d / testsuite/systemtap.examples/stapgames/tapset/gmtty.stp: Use explicit @cast() operators for stapgames/pingpong.stp tapset.
orivej_ has joined #systemtap
orivej has quit [Ping timeout: 246 seconds]
orivej has joined #systemtap
orivej_ has quit [Ping timeout: 264 seconds]
orivej has quit [Ping timeout: 260 seconds]
orivej has joined #systemtap
<agentzh>
fche: i successfully managed to make the race window as small as possible, but i cannot eliminate it completely since setting buf's offset/data/subbufs_produced/padding fields needs several instructions. should we use a spinlock there? but the relay_file_read() function (or any functions it calls) may also need to take the same lock. what do you think?
<agentzh>
it happens every time relay writers need to switch subbufs.
<agentzh>
the stress test you suggested for testing the need for relay_flush() is actually very good at reproducing this race. my simpler small-data .stp script can no longer reproduce this race after i minimized the race window.
<agentzh>
the 250+ KB example can reproduce it after about half an hour of 30-thread load testing.
<agentzh>
i think removing the relay_flush() call still makes sense since none of my load tests lose data in the end. the race only leads to corrupted data with \0 in the middle or extra \0 data at the end.
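[editor's sketch] The two symptoms described above (NUL bytes in the middle of the output versus extra NUL bytes at the end) can be told apart mechanically once the output file has been read into memory. The following is only a minimal illustration in C; `struct nul_report` and `scan_nuls()` are hypothetical names invented for this sketch and are not part of systemtap or relay.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical report type: NUL bytes found in captured output. */
struct nul_report {
    size_t interior;  /* NULs that are followed by some non-NUL byte */
    size_t trailing;  /* NULs in the final run at the end of the data */
};

/* Count interior and trailing NUL bytes in data[0..len). */
static struct nul_report scan_nuls(const unsigned char *data, size_t len)
{
    struct nul_report r = {0, 0};
    size_t end = len;

    assert(data != NULL || len == 0);

    /* Walk back over the trailing run of NUL bytes. */
    while (end > 0 && data[end - 1] == 0)
        end--;
    r.trailing = len - end;

    /* Any NUL before that point sits in the middle of real output. */
    for (size_t i = 0; i < end; i++)
        if (data[i] == 0)
            r.interior++;

    return r;
}
```

A load test could call `scan_nuls()` on the contents of each output file and treat any nonzero count as a reproduction of the race, which is stricter than comparing file sizes alone.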
<agentzh>
because relay_flush() also calls relay_switch_subbuf(), reducing the number of relay_switch_subbuf() calls definitely helps reduce the rate of data corruption or garbage production.
<agentzh>
i think a spinlock should be good enough for this since the critical section does not sleep and the lock contention does not sleep either.
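[editor's sketch] The idea above is that the writer's multi-field update during a sub-buffer switch and the reader's view of those fields go through the same non-sleeping lock. The following is a userspace analogy only, using a C11 `atomic_flag` as a toy spinlock. Every name here (`toy_buf`, `toy_write`, `toy_read_state`, `TOY_BUF_INIT`) is hypothetical; this is not the relay.c code, and a kernel patch would presumably use the kernel's own spinlock primitives (for example `spin_lock_irqsave()`) rather than anything shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-in for a relay buffer; only the bookkeeping fields. */
struct toy_buf {
    atomic_flag lock;         /* toy spinlock guarding the fields below */
    size_t subbuf_size;       /* capacity of one sub-buffer */
    size_t offset;            /* write offset in the current sub-buffer */
    size_t subbufs_produced;  /* number of sub-buffers closed so far */
    size_t padding;           /* unused bytes in the last closed sub-buffer */
};

#define TOY_BUF_INIT(sz) { ATOMIC_FLAG_INIT, (sz), 0, 0, 0 }

/* Consistent copy of the fields, as seen by a reader. */
struct toy_snapshot {
    size_t offset;
    size_t subbufs_produced;
    size_t padding;
};

static void toy_lock(struct toy_buf *b)
{
    /* Busy-wait: never sleeps, so it suits contexts that must not block. */
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ;
}

static void toy_unlock(struct toy_buf *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Writer: reserve n bytes, switching sub-buffers if they do not fit. */
static void toy_write(struct toy_buf *b, size_t n)
{
    assert(n <= b->subbuf_size);
    toy_lock(b);
    if (b->offset + n > b->subbuf_size) {
        /* The multi-field update that must not be observed half-done. */
        b->padding = b->subbuf_size - b->offset;
        b->subbufs_produced++;
        b->offset = 0;
    }
    b->offset += n;
    toy_unlock(b);
}

/* Reader: takes the same lock, so it never sees a partial switch. */
static struct toy_snapshot toy_read_state(struct toy_buf *b)
{
    struct toy_snapshot s;

    toy_lock(b);
    s.offset = b->offset;
    s.subbufs_produced = b->subbufs_produced;
    s.padding = b->padding;
    toy_unlock(b);
    return s;
}
```

The point of the sketch is only the pairing: whatever lock protects the switch on the write side also has to be taken on the read side, which is the concern raised about relay_file_read().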
mjw has quit [Quit: Leaving]
sscox has joined #systemtap
derek0883 has quit [Remote host closed the connection]