#systemtap on 2020-08-26 — irc logs at freenode.irclog.whitequark.org

2015-11-12 23:18 fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged

00:45 khaled has quit [Ping timeout: 264 seconds]

00:47 mjw has quit [Ping timeout: 265 seconds]

00:52 mjw has joined #systemtap

01:21 hpt has joined #systemtap

01:23 <agentzh> fche: cool

01:26 <agentzh> thanks

01:54 sscox has joined #systemtap

02:03 <agentzh> fche: will you shed some light on this issue? https://sourceware.org/bugzilla/show_bug.cgi?id=26537

02:03 <agentzh> it happens from time to time, especially in our stress test.

02:23 <agentzh> it looks like a deadlock to me.

02:43 sscox has quit [Ping timeout: 256 seconds]

02:44 sscox has joined #systemtap

03:08 linus2 has quit [Ping timeout: 260 seconds]

03:12 linus2 has joined #systemtap

04:45 derek0883 has quit [Remote host closed the connection]

04:45 derek0883 has joined #systemtap

06:19 derek0883 has quit [Remote host closed the connection]

06:42 khaled_ has joined #systemtap

07:26 orivej has quit [Ping timeout: 260 seconds]

08:16 khaled_ has quit [Quit: Konversation terminated!]

08:18 khaled has joined #systemtap

09:29 orivej has joined #systemtap

10:05 <fche> agentzh, well, wouldn't call that part a deadlock; the loop is intentional

10:06 <fche> but something among the writers of that stp_task_work_callbacks atomic is not getting cleared - i.e., one of the related callbacks seems stuck

10:06 <fche> a systemwide stack traceback may help explain

10:08 derek0883 has joined #systemtap

10:08 derek0883 has quit [Remote host closed the connection]

10:14 hpt has quit [Ping timeout: 258 seconds]

10:17 orivej has quit [Ping timeout: 258 seconds]

10:36 zhuizhuhaomeng has joined #systemtap

10:48 <zhuizhuhaomeng> hello @fche I found an error 'ERROR: probe process("/usr/local/openresty/nginx/sbin/nginx").begin registration error (rc -114)' which is -EALREADY. I looked through the code and found it may be returned by __stp_utrace_attach with the __STP_TASK_FINDER_EVENTS and UTRACE_RESUME args.

10:49 <zhuizhuhaomeng> is it a real error or we can ignore?

10:50 <fche> like the error# agentzh was talking about yesterday, I suspect this relates to very short-lived processes

10:57 <zhuizhuhaomeng> how can i debug this problem?

11:03 orivej has joined #systemtap

11:29 <fche> zhuizhuhaomeng, not sure, sorry. there are no debugging prints in stp_utrace.c; maybe adding some is a start

11:48 zhuizhuhaomeng has quit [Quit: Leaving]

12:37 khaled has quit [Quit: Konversation terminated!]

12:39 khaled has joined #systemtap

13:58 sscox has quit [Ping timeout: 240 seconds]

13:59 amerey has joined #systemtap

16:06 sscox has joined #systemtap

16:57 derek0883 has joined #systemtap

17:04 <agentzh> fche: okay, something like 'foreach bt' in the crash session?

17:05 <fche> think so

17:05 <fche> don't remember the kgdb rune but yeah

17:05 <fche> if you have access to /proc/$$/stack or whatever you'd used in the bz, then yeah that but for other threads in the system

17:06 <agentzh> okay, will try.

17:11 <agentzh> fche: found that we can reproduce kernel panic relatively consistently in __stp_tf_clone_worker() on a 32c/64t system using kernel 4.14.

17:12 <fche> any idea about the cause/mechanism ?

17:12 <agentzh> the crash bt is like this: https://gist.github.com/agentzh/934ff6090bed25b3c1383d2837fbed18

17:13 <agentzh> just added more info to this gist.

17:13 <agentzh> it seems like this callback is called with NULL 'work' argument.

17:14 <agentzh> the dmesg buffer contains this (using the crash cmd 'log'): https://gist.github.com/agentzh/7cd62681ee7c304afde80bfe16da193b

17:15 <fche> ok, don't have a theory as to why work would be null

17:15 <agentzh> the crashing source line is /usr/local/openresty-stap/share/systemtap/runtime/linux/task_finder2.c:956

17:15 <agentzh> struct utrace_engine_ops *ops = \

17:15 <agentzh> (struct utrace_engine_ops *)tf_work->data;

17:15 <agentzh> this line

17:15 <fche> but could add some defensive coding into the clone_worker gadget to skip such things
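The defensive check fche mentions might look roughly like the sketch below. All type and function names here are hypothetical stand-ins, not the actual definitions in task_finder2.c, and the struct layout is assumed only for illustration. The idea is simply to bail out before dereferencing anything derived from a NULL 'work' pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's task work / callback_head type. */
struct work_sketch {
    struct work_sketch *next;
    void (*func)(struct work_sketch *);
};

/* Hypothetical stand-in for the wrapper struct that embeds the work item. */
struct tf_work_sketch {
    void *data;               /* e.g. the utrace engine ops pointer */
    struct work_sketch work;  /* embedded work item handed to the callback */
};

/* Returns 0 if the work item looks usable, -1 if it was skipped. */
int clone_worker_sketch(struct work_sketch *work)
{
    struct tf_work_sketch *tf_work;

    /* Skip instead of computing a bogus container address from NULL. */
    if (work == NULL)
        return -1;

    /* Equivalent of container_of(work, struct tf_work_sketch, work). */
    tf_work = (struct tf_work_sketch *)
        ((char *)work - offsetof(struct tf_work_sketch, work));

    /* Also skip if the payload was cleared behind our back. */
    if (tf_work->data == NULL)
        return -1;

    return 0;
}
```

This only hides the symptom; it does not explain how a NULL 'work' reached the callback in the first place.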

17:16 <fche> is this git/master systemtap?

17:16 <agentzh> it's not master, but should be recent enough.

17:17 <agentzh> yeah, the strange thing is that it seems impossible to feed a NULL work arg.

17:18 <fche> is the work arg itself null or some content of the struct is? (as though it was freed/memset(0)'d ?)

17:20 <agentzh> i checked the assembly, the work arg should be the rdi reg which is 0.

17:20 <agentzh> and tf_work->data is (char*) work - 0x8 after the gcc optimization.

17:20 <agentzh> which matches the offending addr fffffffffffffff8
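The address arithmetic agentzh describes can be reproduced in a small sketch. The struct below is hypothetical: it places 'data' one pointer before the embedded work item purely to mirror the 0x8 offset mentioned above, and the real layout in task_finder2.c may differ. Integer arithmetic on uintptr_t is used so the sketch never does pointer arithmetic on NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's callback_head. */
struct callback_head_sketch {
    struct callback_head_sketch *next;
    void (*func)(struct callback_head_sketch *);
};

/* Hypothetical wrapper: 'data' sits directly before the embedded work item. */
struct tf_work_sketch {
    void *data;
    struct callback_head_sketch work;
};

/* Given the address of the embedded 'work' member, compute the address
 * that tf_work->data would be loaded from. */
uintptr_t data_addr_from_work(uintptr_t work_addr)
{
    /* container_of(): step back from the member to the start of the struct. */
    uintptr_t container = work_addr - offsetof(struct tf_work_sketch, work);

    /* Then step forward to the 'data' member. */
    return container + offsetof(struct tf_work_sketch, data);
}
```

With this assumed layout on a typical 64-bit build, data_addr_from_work(0) wraps around to 0xfffffffffffffff8, which is the faulting address reported above when the work pointer itself is NULL.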

17:21 <agentzh> i looked at the kernel func task_work_run() and it does not seem possible to call work->func(work) with a NULL work...

17:21 <agentzh> not sure if there are any other places which would call this callback func.

17:22 <agentzh> brb

17:23 <fche> try collecting thread start/stop timestamp data, maybe someone else is clearing the work queue behind our backs

18:36 <agentzh> hmm, but task_work_run() has proper task level locks so it shouldn't happen in the first place?

18:37 <agentzh> we also got panics when loading the 3.10 kernel of centos 7.

18:37 <agentzh> but the panic call stack is kinda different.

18:38 <agentzh> for centos 7's 3.10 kernel, the backtrace is like this: https://gist.github.com/agentzh/acf3fee9c03eeb390418d26a91341a1c

18:40 <agentzh> fche: it's the work arg pointer itself being NULL in the kdump for the 4.14 kernel.

18:41 <agentzh> the thread start/stop timestamps would be huge, since we are doing stress testing here and lots of processes in the system coming and going.

18:42 <agentzh> i've checked the current process running the fatal task_work job, which is a bash process which is still running (task->state == 0).

18:43 <agentzh> and its task->task_works pointer is not NULL: task_works = 0xffff8a948cd77a60, and i've checked it still has 3 callback_head entries in the list whose ->func all point to utrace_resume.
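The kind of check agentzh did by hand on task->task_works (walking the pending list and looking at each entry's ->func) can be sketched as below. The struct mirrors the two fields of the kernel's callback_head (next and func), but the names with a _sketch suffix, including the stand-in callback, are hypothetical and exist only for this example.

```c
#include <stddef.h>

/* Same shape as the kernel's callback_head: a singly linked list node
 * carrying a function pointer. */
struct callback_head_sketch {
    struct callback_head_sketch *next;
    void (*func)(struct callback_head_sketch *head);
};

/* Hypothetical stand-in for a callback such as utrace_resume. */
void utrace_resume_sketch(struct callback_head_sketch *head)
{
    (void)head;
}

/* Count the entries in the list whose ->func equals the given callback. */
size_t count_entries_with_func(const struct callback_head_sketch *head,
                               void (*func)(struct callback_head_sketch *))
{
    size_t n = 0;

    for (; head != NULL; head = head->next)
        if (head->func == func)
            n++;

    return n;
}
```

In a crash session the same walk is done interactively by following ->next from the task_works pointer; this sketch only shows the traversal logic.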

18:44 <agentzh> though those 3 callbacks do belong to other concurrent stap modules, not the current one.

19:57 derek0883 has quit [Remote host closed the connection]

19:58 derek0883 has joined #systemtap

19:58 orivej has quit [Ping timeout: 240 seconds]

20:14 derek0883 has quit [Remote host closed the connection]

20:14 derek0883 has joined #systemtap

20:38 derek0883 has quit [Remote host closed the connection]

20:48 derek0883 has joined #systemtap

21:39 derek0883 has quit [Ping timeout: 240 seconds]

22:03 sscox has quit [Ping timeout: 260 seconds]

22:07 derek0883 has joined #systemtap

22:44 orivej has joined #systemtap

23:49 orivej_ has joined #systemtap

23:50 orivej has quit [Quit: No Ping reply in 180 seconds.]