fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
<extra_>
Hey fche, I was trying to follow up on your response in the ticket but had some questions
<extra_>
regarding multiple threads, in the case of the first issue the breakpoint is at the program entrypoint, so it hasn't had time to spawn any additional threads or anything. For case two, this could be a concern, although I was reading that by default gdb stops all threads when one hits a breakpoint
<extra_>
I tried running stap2perf with the strace.stp script but wasn't successful in having it add any recognized trace points... Also, running stap with -DALIBI didn't seem to produce any additional output or prevent the segfault in the first issue described
<extra_>
what is the -DALIBI flag supposed to do?
<fche>
stap -DALIBI turns off the body of the probes within systemtap
<fche>
so even if they could somehow break the system, they will be compiled out
<fche>
but really that is too far fetched a thing to explain what you're seeing
tromey has quit [Read error: Connection reset by peer]
tromey has joined #systemtap
<extra_>
I wonder if starting SystemTap generates a signal to either gdb or the child process that causes it to modify program state in a way that triggers errors in the Go runtime
<fche>
it should not
<fche>
when stap starts a child process via stap -c CMD then there is a brief ptrace-y synchronization at the startup, but that's brief and happens before the golang runtime gets a chance to start
<fche>
with stap -x PID, there's nothing
<extra_>
I wonder why the SystemTap output shows activity associated with the traced process, then... Looking at that process with ps, before SystemTap runs it's in the 't' state, which the man page says means "stopped by debugger during the tracing"
<fche>
yeah, that doesn't make sense - I'm guessing gdb didn't actually stop the target program
<fche>
or not all of it
<extra_>
without ever running SystemTap you can set the breakpoint on the entry point and continue without any problem... Also, at that point the program itself shouldn't have been able to execute any of its instructions, so it hasn't had a chance to spawn new threads, install new signal handlers, etc.
<extra_>
you'd think that if it was just an issue with gdb not stopping the full program then it would always have trouble
<extra_>
thinking about ways to see whether gdb does receive a signal, I tried using strace against gdb, but it looks like you can't ptrace-attach to a process that is ptrace-attached to another process
<extra_>
I could try using the strace.stp script against the gdb process to catch this output... Is it possible to run two instances of the same SystemTap script at the same time, specifying different pids for the -x value?
orivej has joined #systemtap
<fche>
sure is
<fche>
(and by the way we are working on a version of strace that can share a single target process with gdb)
<fche>
(real strace, not systemtap strace)
<extra_>
oh nice, that would be incredibly useful
<fche>
yeah, will work through a smarter gdbserver process that can multiplex strace (or multiple strace ... remote even) and gdbs
<fche>
scox has been working on that for some time
<extra_>
is there a blog or something I can follow so that I know when that gets released?
<fche>
hm good question ... we'll mention it on systemtap@sourceware.org even though it's not directly connected
<fche>
it should also show up on the gdb & strace releases before too long
<extra_>
ok awesome, thank you
gila has quit [Ping timeout: 264 seconds]
gila has joined #systemtap
<extra_>
it looks like when the strace.stp SystemTap script starts, gdb gets two SIGCHLD signals from the process being debugged (no signals appear to be sent to the process being debugged)
<extra_>
Also, strace.stp on the gdb process shows this in terms of system call activity:
<extra_>
Fri Jun 8 19:22:23 2018.502985 rt_sigreturn() = -4 (EINTR)
<extra_>
Fri Jun 8 19:22:23 2018.502995 rt_sigreturn() = -4 (EINTR)