#systemtap on 2018-10-19 — irc logs at freenode.irclog.whitequark.org

2015-11-12 23:18 fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged

00:25 orivej has quit [Ping timeout: 244 seconds]

01:30 wcohen has joined #systemtap

01:53 <fche> agentzh, that purported behaviour doesn't make sense to me, are you sure?

02:04 sscox has quit [Ping timeout: 245 seconds]

02:31 sscox has joined #systemtap

06:07 vbernat has quit [Quit: The future belongs to those who believe in the beauty of their dreams.]

06:07 vbernat has joined #systemtap

06:31 _whitelogger has joined #systemtap

07:01 agentzh has quit [Remote host closed the connection]

08:21 slowfranklin has joined #systemtap

08:22 gila has joined #systemtap

09:07 gila has quit [Quit: Textual IRC Client: www.textualapp.com]

09:13 gila has joined #systemtap

09:48 orivej has joined #systemtap

10:13 sscox has quit [Ping timeout: 268 seconds]

12:39 wcohen has quit [Ping timeout: 252 seconds]

13:08 sscox has joined #systemtap

13:30 slowfranklin has quit [Quit: slowfranklin]

14:07 slowfranklin has joined #systemtap

14:14 tromey has joined #systemtap

14:40 gila has quit [Quit: Textual IRC Client: www.textualapp.com]

16:03 wcohen has joined #systemtap

16:53 slowfranklin has quit [Quit: slowfranklin]

17:12 orivej has quit [Ping timeout: 246 seconds]

17:37 orivej has joined #systemtap

18:05 brolley has joined #systemtap

19:05 agentzh has joined #systemtap

19:07 <agentzh> fche: quite sure. can reproduce it with a minimal example. i'll create a PR then.

19:07 <agentzh> fche: btw, sprint_ustack() seems buggy. will you please have a quick look at this PR? https://sourceware.org/bugzilla/show_bug.cgi?id=23799

19:20 tromey has quit [Ping timeout: 252 seconds]

19:25 <fche> odd bug

19:25 <fche> unwinding related problems might be diagnosed with -DDEBUG_UNWIND=1

19:26 <fche> does print_ustack(ubacktrace()) work any better - i.e., skip the printf("...%s", sprintf(...))

20:21 wcohen has quit [Ping timeout: 260 seconds]

21:13 brolley has left #systemtap [#systemtap]

21:15 <agentzh> fche: thanks for the tip. will try that macro.

21:15 <agentzh> print_ustack(ubacktrace()) works.

21:16 <fche> how about log(sprint_ustack(ubacktrace())) ?

21:19 <agentzh> still empty

21:28 <agentzh> fche: okay, i see what's going on here...

21:28 <agentzh> function sprint_ustack:string(stk:string) { sprint_usyms(stk) }

21:29 <agentzh> it lacks a return statement...

21:29 <agentzh> seems like the existing test suite does not cover this tapset func.

21:29 <agentzh> i'll submit a patch for this little thing then.

21:29 <agentzh> stap is not perl anyway :)
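
The bug described above can be modelled in a few lines of C++. This is only a sketch: `call_frame`, `sprint_ustack_buggy`, `sprint_ustack_fixed` and the stand-in `sprint_usyms` are hypothetical names, and the assumption that a function's return slot starts out as an empty string is inferred from the "still empty" result reported in the log, not taken from the generated code.

```cpp
#include <string>

// Hypothetical stand-in for the tapset's sprint_usyms(); not the real implementation.
static std::string sprint_usyms(const std::string& stk) {
    return "symbols for: " + stk;
}

// Models a script function call whose return slot starts out empty (assumption).
struct call_frame {
    std::string retvalue;  // stays "" unless a return statement stores into it
};

// Body is a bare expression statement: the value is computed, then discarded.
static std::string sprint_ustack_buggy(const std::string& stk) {
    call_frame f;
    (void) sprint_usyms(stk);
    return f.retvalue;
}

// Body uses an explicit return: the value is stored in the return slot.
static std::string sprint_ustack_fixed(const std::string& stk) {
    call_frame f;
    f.retvalue = sprint_usyms(stk);
    return f.retvalue;
}
```

In this model the buggy variant always hands back an empty string, which matches what the `log(sprint_ustack(ubacktrace()))` experiment showed, while the fixed variant passes the formatted text through.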

21:36 sscox has quit [Ping timeout: 268 seconds]

22:06 <agentzh> fche: is this sprint_ustack patch good to commit? https://sourceware.org/ml/systemtap/2018-q4/msg00041.html

22:07 <fche> sure. wow :)

22:07 <agentzh> great

22:09 <fche> hm, it was broken back in 2012

22:09 <fche> commit ec12f84f

23:03 <agentzh> *nod*

23:03 <agentzh> seems like i'm the first one actually using it :)

23:11 <fche> sometimes that happens :)

23:11 <fche> thanks for your quick fix!

23:13 <agentzh> sure thing.

23:14 <agentzh> still investigating some weird stap assertion failures and segfaults. not sure if it's our own patches' faults.

23:14 <fche> stap assertion failures are relatively simple usually

23:14 <fche> gdb -args stap -p4 ....

23:15 <mjw> urgh, that was silly. Thanks for the testcase!

23:16 <agentzh> mjw: you implemented sprint_ustack? ;)

23:16 <agentzh> fche: any hint on the assert(values.empty()) in ~update_visitor()? i'm seeing it fail on my side for a big thing.

23:16 <agentzh> not sure what it's asserting.

23:17 <agentzh> and how it can fail.

23:17 <mjw> agentzh, yes, and I only added a buildok testcase...

23:17 <mjw> it did build ok...

23:18 <agentzh> mjw: i see. thanks for adding that in the first place! fixing is easy :)

23:18 <fche> hm that's usually some sort of staptree node that turned into 0 during a rewrite pass

23:19 <agentzh> fche: so there shouldn't be any such nodes?

23:19 <fche> depends; usually if so, only very temporarily. but yeah the asserts of course should never be able to fire

23:20 <agentzh> okay
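
One way to picture what an `assert(values.empty())` in a destructor guards is a rewriting visitor that hands results back through a stack. The sketch below is a toy model under that assumption; `toy_rewriter`, `node`, `provide`, `require` and `drop_nodes` are hypothetical names chosen for illustration and should not be read as SystemTap's actual classes.

```cpp
#include <cassert>
#include <stack>
#include <stdexcept>

struct node { int value; };

// Toy model only; not SystemTap's update_visitor.
class toy_rewriter {
    std::stack<node*> values;   // results handed back by visit calls
public:
    bool drop_nodes = false;    // when true, a visit "rewrites" the node to null

    // A visit reports its rewritten node by pushing it.
    void provide(node* n) { values.push(n); }

    void visit(node* n) { provide(drop_nodes ? nullptr : n); }

    // Visit a child and take exactly one result off the stack.
    node* require(node* src) {
        if (src == nullptr) return nullptr;
        visit(src);
        if (values.empty())
            throw std::runtime_error("visit provided no value");
        node* dst = values.top();
        values.pop();
        if (dst == nullptr)
            throw std::runtime_error("visit provided a null node");
        return dst;
    }

    bool balanced() const { return values.empty(); }

    // Invariant checked at teardown: every provided value was consumed.
    ~toy_rewriter() { assert(values.empty()); }
};
```

In this model the destructor assertion can only fire if some path calls `provide` without a matching `require`, so a failure points at the caller that left a value behind rather than at the destructor itself.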

23:27 <agentzh> i'll try making sense of the whole update_visitor class.

23:27 <fche> you'll want to look a few levels higher

23:27 <fche> do you have an error reproduction recipe?

23:28 <agentzh> no minimal example yet. the reproducible example is huge.

23:29 <agentzh> i'll try collecting more info at the assertion failure point in gdb.

23:29 <agentzh> fche: for a separate issue, can e->tok be NULL?

23:30 <agentzh> i'm seeing const_folder::visit_binary_expression() tries to dereference e->tok in some of its code paths where e->tok is actually NULL, thus leading to a segfault.

23:30 <fche> what does a (gdb) bt look like?

23:31 <agentzh> a min

23:49 <agentzh> fche: this is the bt for the e->tok NULL deref bug: https://gist.github.com/agentzh/55c74510f751d659a04397afb29cd5ca

23:49 <agentzh> thanks for having a look!

23:49 <agentzh> it's only reproducible with -vvv. it's not a normal code path.

23:50 <agentzh> wondering if we should just avoid it or it is something deeper (that is, e->tok should never be NULL in the first place).

23:50 <fche> yeah, we've had -vvv-only bugs like that, those are usually not too bad

23:50 <fche> lemme find another one like that

23:51 <fche> commit ad7ba27ae783211790751add8887b8d01b00b51b

23:52 <agentzh> fche: so e->tok *might* be NULL? it's normal?

23:53 <agentzh> is this patch good enough? https://gist.github.com/agentzh/03d6f9e9137ed269cbab46d18aacd0b7

23:53 <agentzh> this patch seems to fix my problem.

23:53 <fche> works for me

23:53 <fche> er, I mean, sure

23:54 <fche> curious why e->tok = 0 though

23:54 <agentzh> same here.

23:54 <fche> even synthetic nodes usually get -some- tok*

23:54 <agentzh> i'll try digging deeper then. it will make me sleep better :)

23:54 <fche> thanks

23:54 <agentzh> sure
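
A defensive guard for the verbose-only crash discussed above might look like the following. The contents of the linked patch are not shown in the log, so this is a generic sketch and not that patch: `token`, `expression` and `describe` are hypothetical stand-ins, and the message text is invented for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-ins for the translator's token and expression types.
struct token { std::string content; };
struct expression { const token* tok = nullptr; };

// Build a verbose-mode message without dereferencing a null token.
static std::string describe(const expression& e) {
    std::ostringstream msg;
    msg << "collapsing expression at ";
    if (e.tok)
        msg << e.tok->content;
    else
        msg << "<unknown token>";   // fallback when no token was attached
    return msg.str();
}
```

A guard like this only hides the symptom. As noted in the conversation, synthetic nodes usually carry some token, so finding where the null one was created remains the more informative fix.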

23:55 <agentzh> i'll try reproducing the assertion failure now...2 min...