fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
orivej has quit [Ping timeout: 244 seconds]
wcohen has joined #systemtap
<fche> agentzh, that purported behaviour doesn't make sense to me, are you sure?
sscox has quit [Ping timeout: 245 seconds]
sscox has joined #systemtap
vbernat has quit [Quit: The future belongs to those who believe in the beauty of their dreams.]
vbernat has joined #systemtap
_whitelogger has joined #systemtap
agentzh has quit [Remote host closed the connection]
slowfranklin has joined #systemtap
gila has joined #systemtap
gila has quit [Quit: Textual IRC Client: www.textualapp.com]
gila has joined #systemtap
orivej has joined #systemtap
sscox has quit [Ping timeout: 268 seconds]
wcohen has quit [Ping timeout: 252 seconds]
sscox has joined #systemtap
slowfranklin has quit [Quit: slowfranklin]
slowfranklin has joined #systemtap
tromey has joined #systemtap
gila has quit [Quit: Textual IRC Client: www.textualapp.com]
wcohen has joined #systemtap
slowfranklin has quit [Quit: slowfranklin]
orivej has quit [Ping timeout: 246 seconds]
orivej has joined #systemtap
brolley has joined #systemtap
agentzh has joined #systemtap
<agentzh> fche: quite sure. can reproduce it with a minimal example. i'll create a PR then.
<agentzh> fche: btw, sprint_ustack() seems buggy. will you please have a quick look at this PR? https://sourceware.org/bugzilla/show_bug.cgi?id=23799
tromey has quit [Ping timeout: 252 seconds]
<fche> odd bug
<fche> unwinding related problems might be diagnosed with -DDEBUG_UNWIND=1
<fche> does print_ustack(ubacktrace()) work any better - i.e., skip the printf("...%s", sprintf(...))
wcohen has quit [Ping timeout: 260 seconds]
brolley has left #systemtap [#systemtap]
<agentzh> fche: thanks for the tip. will try that macro.
<agentzh> print_ustack(ubacktrace()) works.
<fche> how about log(sprint_ustack(ubacktrace())) ?
<agentzh> still empty
<agentzh> fche: okay, i see what's going on here...
<agentzh> function sprint_ustack:string(stk:string) { sprint_usyms(stk) }
<agentzh> it lacks a return statement...
<agentzh> seems like the existing test suite does not cover this tapset func.
<agentzh> i'll submit a patch for this little thing then.
<agentzh> stap is not perl anyway :)
sscox has quit [Ping timeout: 268 seconds]
<agentzh> fche: is this sprint_ustack patch good to commit? https://sourceware.org/ml/systemtap/2018-q4/msg00041.html
<fche> sure. wow :)
<agentzh> great
<fche> hm, it was broken back in 2012
<fche> commit ec12f84f
<agentzh> *nod*
<agentzh> seems like i'm the first one actually using it :)
<fche> sometimes that happens :)
<fche> thanks for your quick fix!
<agentzh> sure thing.
<agentzh> still investigating some weird stap assertion failures and segfaults. not sure if it's our own patches' faults.
<fche> stap assertion failures are relatively simple usually
<fche> gdb -args stap -p4 ....
<mjw> urgh, that was silly. Thanks for the testcase!
<agentzh> mjw: you implemented sprint_ustack? ;)
<agentzh> fche: any hint on the assert(values.empty()) in ~update_visitor()? i'm seeing it fails on my side for a big thing.
<agentzh> not sure what it's asserting.
<agentzh> and how it can fail.
<mjw> agentzh, yes, and I only added a buildok testcase...
<mjw> it did build ok...
<agentzh> mjw: i see. thanks for adding that in the first place! fixing is easy :)
<fche> hm that's usually some sort of staptree node that turned into 0 during a rewrite pass
<agentzh> fche: so there shouldn't be any such nodes?
<fche> depends; usually if so, only very temporarily. but yeah the asserts of course should never be able to fire
<agentzh> okay
<agentzh> i'll try making sense of the whole update_visitor class.
<fche> you'll want to look a few levels higher
<fche> do you have an error reproduction recipe?
<agentzh> no minimal example yet. the reproducible example is huge.
<agentzh> i'll try collecting more info in the assertion failure point in gdb.
<agentzh> fche: for a separate issue, can e->tok be NULL?
<agentzh> i'm seeing const_folder::visit_binary_expression() tries to dereference e->tok in some of its code paths where e->tok is actually NULL, thus leading to a segfault.
<fche> what does a (gdb) bt look like?
<agentzh> a min
<agentzh> fche: this is the bt for the e->tok NULL deref bug: https://gist.github.com/agentzh/55c74510f751d659a04397afb29cd5ca
<agentzh> thanks for having a look!
<agentzh> it's only reproducible with -vvv. it's not a normal code path.
<agentzh> wondering if we should just avoid it or it is something deeper (that is, e->tok should never be NULL in the first place).
<fche> yeah, we've had -vvv-only bugs like that, those are usually not too bad
<fche> lemme find another one like that
<fche> commit ad7ba27ae783211790751add8887b8d01b00b51b
<agentzh> fche: so e->tok *might* be NULL? it's normal?
<agentzh> this patch seems to fix my problem.
<fche> works for me
<fche> er, I mean, sure
<fche> curious why e->tok = 0 though
<agentzh> same here.
<fche> even synthetic nodes usually get -some- tok*
<agentzh> i'll try digging deeper then. it will make me sleep better :)
<fche> thanks
<agentzh> sure
<agentzh> i'll try reproducing the assertion failure now...2 min...