fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
khaled has quit [Ping timeout: 264 seconds]
mjw has quit [Ping timeout: 265 seconds]
mjw has joined #systemtap
hpt has joined #systemtap
<agentzh> fche: cool
<agentzh> thanks
sscox has joined #systemtap
<agentzh> fche: will you shed some light on this issue? https://sourceware.org/bugzilla/show_bug.cgi?id=26537
<agentzh> it happens from time to time, especially in our stress test.
<agentzh> it looks like a deadlock to me.
sscox has quit [Ping timeout: 256 seconds]
sscox has joined #systemtap
linus2 has quit [Ping timeout: 260 seconds]
linus2 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
khaled_ has joined #systemtap
orivej has quit [Ping timeout: 260 seconds]
khaled_ has quit [Quit: Konversation terminated!]
khaled has joined #systemtap
orivej has joined #systemtap
<fche> agentzh, well, wouldn't call that part a deadlock; the loop is intentional
<fche> but something among the writers of that stp_task_work_callbacks atomic is not getting cleared - i.e., one of the related callbacks seems stuck
<fche> a systemwide stack traceback may help explain
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
hpt has quit [Ping timeout: 258 seconds]
orivej has quit [Ping timeout: 258 seconds]
zhuizhuhaomeng has joined #systemtap
<zhuizhuhaomeng> hello @fche I found an error 'ERROR: probe process("/usr/local/openresty/nginx/sbin/nginx").begin registration error (rc -114)' which is -EALREADY. I looked up the code and found it may be returned by __stp_utrace_attach with __STP_TASK_FINDER_EVENTS UTRACE_RESUME args.
<zhuizhuhaomeng> is it a real error, or can we ignore it?
<fche> like the error# agentzh was talking about yesterday, I suspect this relates to very short-lived processes
<zhuizhuhaomeng> how can i debug this problem?
orivej has joined #systemtap
<fche> zhuizhuhaomeng, not sure, sorry. there are no debugging prints in stp_utrace.c; maybe adding some is a start
zhuizhuhaomeng has quit [Quit: Leaving]
khaled has quit [Quit: Konversation terminated!]
khaled has joined #systemtap
sscox has quit [Ping timeout: 240 seconds]
amerey has joined #systemtap
sscox has joined #systemtap
derek0883 has joined #systemtap
<agentzh> fche: okay, something like 'foreach bt' in the crash session?
<fche> think so
<fche> don't remember the kgdb rune but yeah
<fche> if you have access to /proc/$$/stack or whatever you'd used in the bz, then yeah that but for other threads in the system
<agentzh> okay, will try.
<agentzh> fche: found that we can reproduce a kernel panic relatively consistently in __stp_tf_clone_worker() on a 32c/64t system using kernel 4.14.
<fche> any idea about the cause/mechanism ?
<agentzh> just added more info to this gist.
<agentzh> it seems like this callback is called with NULL 'work' argument.
<agentzh> the dmesg buffer contains this (using the crash cmd 'log'): https://gist.github.com/agentzh/7cd62681ee7c304afde80bfe16da193b
<fche> ok, don't have a theory as to why work would be null
<agentzh> the crashing source line is /usr/local/openresty-stap/share/systemtap/runtime/linux/task_finder2.c:956
<agentzh> struct utrace_engine_ops *ops = \
<agentzh> (struct utrace_engine_ops *)tf_work->data;
<agentzh> this line
<fche> but could add some defensive coding into the clone_worker gadget to skip such things
<fche> is this git/master systemtap?
<agentzh> it's not master, but should be recent enough.
<agentzh> yeah, the strange thing is that it seems impossible to feed a NULL work arg.
<fche> is the work arg itself null or some content of the struct is? (as though it was freed/memset(0)'d ?)
<agentzh> i checked the assembly, the work arg should be the rdi reg which is 0.
<agentzh> and tf_work->data is (char*) work - 0x8 after the gcc optimization.
<agentzh> which matches the offending addr fffffffffffffff8
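[note] The arithmetic agentzh describes can be checked in userspace. This is only a sketch: the struct names and field layout below are hypothetical stand-ins, chosen so that 'data' sits one pointer before the embedded 'work' member as the disassembly suggests. They are not the real kernel or systemtap definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's callback_head. */
struct fake_callback_head {
    struct fake_callback_head *next;
    void (*func)(struct fake_callback_head *);
};

/* Hypothetical stand-in for the task-work wrapper used by the runtime.
 * Assumption: 'data' is placed immediately before the embedded 'work'. */
struct fake_tf_task_work {
    void *list_next;                 /* stand-in for a two-pointer list head */
    void *list_prev;
    void *task;
    void *data;                      /* field read by the crashing line */
    struct fake_callback_head work;  /* pointer to this is passed to the callback */
};

/* Address that reading tf_work->data would touch for a given 'work' pointer.
 * Integer arithmetic is used so that a zero 'work' is not undefined behaviour. */
uintptr_t data_field_address(uintptr_t work)
{
    /* container_of step: step back from the member to the enclosing struct */
    uintptr_t tf_work = work - offsetof(struct fake_tf_task_work, work);

    /* then step forward again to the 'data' field */
    uintptr_t addr = tf_work + offsetof(struct fake_tf_task_work, data);

    /* with this layout the net effect is "work minus one pointer" */
    assert(addr == work - sizeof(void *));
    return addr;
}
```

[note] With 8-byte pointers, data_field_address(0) wraps around to 0xfffffffffffffff8, which is the offending address in the oops. That is consistent with 'work' itself being NULL rather than with a freed or zeroed struct.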
<agentzh> i looked at the kernel func task_work_run() and it does not seem possible to call work->func(work) with a NULL work...
<agentzh> not sure if there are any other places that would call this callback func.
<agentzh> brb
<fche> try collecting thread start/stop timestamp data, maybe someone else is clearing the work queue behind our backs
<agentzh> hmm, but task_work_run() has proper task level locks so it shouldn't happen in the first place?
<agentzh> we also got panics when loading the 3.10 kernel of centos 7.
<agentzh> but the panic call stack is kinda different.
<agentzh> for centos 7's 3.10 kernel, the backtrace is like this: https://gist.github.com/agentzh/acf3fee9c03eeb390418d26a91341a1c
<agentzh> fche: it's the work arg pointer itself being NULL in the kdump for the 4.14 kernel.
<agentzh> the thread start/stop timestamps would be huge, since we are doing stress testing here and lots of processes in the system coming and going.
<agentzh> i've checked the current process running the fatal task_work job, which is a bash process which is still running (task->state == 0).
<agentzh> and its task->task_works pointer is not NULL: task_works = 0xffff8a948cd77a60, and i've checked it still has 3 callback_head entries in the list whose ->func all point to utrace_resume.
<agentzh> though those 3 callbacks do belong to other concurrent stap modules, not the current one.
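[note] The defensive coding fche suggested earlier for the clone_worker gadget might look roughly like the following userspace sketch. Everything here is hypothetical: the struct layout, the function names, the skip counter and the int return value exist only for illustration (the real callback returns void and its body is not shown in this log). A guard like this would only hide the symptom; it does not explain why the callback is invoked with a NULL argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's callback_head. */
struct fake_callback_head {
    struct fake_callback_head *next;
    void (*func)(struct fake_callback_head *);
};

/* Hypothetical stand-in for the runtime's task-work wrapper. */
struct fake_tf_task_work {
    void *task;
    void *data;
    struct fake_callback_head work;
};

/* Illustrative counter so skipped callbacks could be reported later. */
int skipped_null_work;

/* Returns 0 if the work item was processed, -1 if it was skipped. */
int clone_worker_sketch(struct fake_callback_head *work)
{
    struct fake_tf_task_work *tf_work;

    /* Guard 1: never apply container_of arithmetic to a NULL pointer. */
    if (work == NULL) {
        skipped_null_work++;
        return -1;
    }

    /* container_of: recover the wrapper from its embedded member. */
    tf_work = (struct fake_tf_task_work *)
        ((char *)work - offsetof(struct fake_tf_task_work, work));
    assert(&tf_work->work == work);

    /* Guard 2: skip wrappers whose payload was cleared. */
    if (tf_work->data == NULL) {
        skipped_null_work++;
        return -1;
    }

    /* ... the real work on tf_work->data would go here ... */
    return 0;
}

/* Small helper that exercises the normal (non-NULL) path. */
int demo_valid_work(void)
{
    int payload = 0;
    struct fake_tf_task_work tf = { NULL, &payload, { NULL, NULL } };
    return clone_worker_sketch(&tf.work);
}
```

[note] Calling clone_worker_sketch(NULL) takes the first guard and returns -1 instead of dereferencing an address near the top of the address space; demo_valid_work() takes the normal path and returns 0.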
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has quit [Ping timeout: 240 seconds]
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 240 seconds]
sscox has quit [Ping timeout: 260 seconds]
derek0883 has joined #systemtap
orivej has joined #systemtap
orivej_ has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]