fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
<fche> agentzh, re. that hash table, sure
<fche> not sure all our uses of the table are rcu-compatible, so the spinlock might be unavoidable, but will consider
<agentzh> fche: okay, cool, will proceed accordingly and propose patches here soon.
<fche> great
<agentzh> and there's another thing, what do you think of adding other kinds of filters to the task finder?
<fche> such as?
<agentzh> currently there's only the pid filter and the exe path filter.
<agentzh> such as the gpid filter and mount file system filters.
<agentzh> sorry, i mean mount namespace id filter
<agentzh> there seems to be a pid namespace filter already?
<fche> interesting, so processes within a given container?
<agentzh> yep
<agentzh> mount namespace is better than pid for containers.
<fche> plausible
<agentzh> especially when the file path is relative to the container.
<agentzh> gpid is for process group id.
<agentzh> it's also common for apache and nginx.
<agentzh> and postgresql
<agentzh> many multi-process apps.
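
As a rough illustration of the two proposed filters, the sketch below selects processes by process group ID and reads a process's mount namespace ID from user space. It is not SystemTap's task finder code. The helper names `pids_in_group`, `parse_ns_id`, and `mnt_ns_id` are hypothetical, and `mnt_ns_id` assumes Linux with a readable `/proc/PID/ns/mnt` link.

```python
import os
import re


def pids_in_group(pids, pgid):
    """Return the PIDs from `pids` whose process group ID equals `pgid`."""
    matched = []
    for pid in pids:
        try:
            if os.getpgid(pid) == pgid:
                matched.append(pid)
        except (ProcessLookupError, PermissionError):
            # The process exited or is not visible to us; skip it.
            continue
    return matched


def parse_ns_id(link_target):
    """Extract the numeric ID from a namespace link such as 'mnt:[4026531840]'."""
    m = re.fullmatch(r"\w+:\[(\d+)\]", link_target)
    if m is None:
        raise ValueError(f"unexpected namespace link: {link_target!r}")
    return int(m.group(1))


def mnt_ns_id(pid):
    """Read the mount namespace ID of `pid` (Linux only, via /proc)."""
    return parse_ns_id(os.readlink(f"/proc/{pid}/ns/mnt"))
```

A group filter of this kind would match an nginx master and all its workers with one ID. Two processes in the same container would normally report the same `mnt_ns_id`.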
<fche> would you couple this with new variants of stap -x PID ?
<agentzh> yes, i'm also thinking along these lines.
<agentzh> adding new options to stap and staprun.
<agentzh> so you're good with that?
<agentzh> we can propose patches for these too.
<agentzh> currently stap's vma tracker is tracking too many processes at the same time, which leads to high CPU contention on that vma hash table's spinlocks.
<agentzh> it's crazy.
<agentzh> very obvious cpu spikes when running the stap tool.
<agentzh> the kernel cpu flame graph shows 80% of the CPU is spent there.
<agentzh> more than 80%
<fche> that's crazy, is there a lot of fork/mmap activity going on?
<agentzh> just a lot of live processes and threads.
<agentzh> thousands
<agentzh> in production servers.
<agentzh> not many forks or mmap activities though.
<fche> hm, then i'm surprised there's so much ongoing vma tracker activity
<agentzh> i can show you the flame graph
<agentzh> a sec
<fche> (by the way, over here we were talking about possibly exposing the vma tracker's contents to stap tapsets,
<fche> so we could query the dso's mapped into processes, etc.)
<agentzh> is it different from @vma and @cast?
<fche> yeah to get an actual list of shared libraries / base addresses methinks
<fche> not sure it's needed but may be
<agentzh> i once proposed the @vma() operator patch to you.
<agentzh> but it seems i never got the chance to commit it...
<agentzh> @vma(addr, module)
<agentzh> it converts relative address in a module to the absolute address.
<agentzh> when there are no symbols or the dwarf data is incorrect.
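
The address conversion described for `@vma(addr, module)` can be sketched in user space as "load base of the module plus the relative address". This is only an illustration of the arithmetic, not the proposed patch. The function names and the sample `/proc/PID/maps` text are made up, and it assumes the module's lowest mapping starts at file offset 0 with a zero ELF base address, which is typical for shared libraries and PIE executables.

```python
# Hypothetical /proc/PID/maps excerpt, used only for illustration.
SAMPLE_MAPS = """\
55d0c7a00000-55d0c7a21000 r--p 00000000 08:01 1234 /usr/sbin/nginx
55d0c7a21000-55d0c7b0f000 r-xp 00021000 08:01 1234 /usr/sbin/nginx
7f3a10000000-7f3a10022000 r--p 00000000 08:01 5678 /usr/lib/libc.so.6
"""


def module_base(maps_text, module):
    """Return the lowest mapped start address of `module` in maps-format text."""
    bases = []
    for line in maps_text.splitlines():
        # Fields: range, perms, offset, dev, inode, pathname (may be absent).
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].rstrip() == module:
            start = fields[0].split("-")[0]
            bases.append(int(start, 16))
    if not bases:
        raise LookupError(f"{module} is not mapped")
    return min(bases)


def to_absolute(maps_text, rel_addr, module):
    """Convert a module-relative address into an absolute virtual address."""
    return module_base(maps_text, module) + rel_addr


print(hex(to_absolute(SAMPLE_MAPS, 0x1234, "/usr/sbin/nginx")))
```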
<fche> aha yeah
<agentzh> if you're still open to it, we are happy to commit.
<agentzh> with some docs and tests too.
<fche> how about posting it, let others take a look
<agentzh> sure, will post it here
<agentzh> with a link
<agentzh> email is more troublesome.
<fche> or you could push the code to a new branch, and discuss it in email
<fche> that's how some of us do feature work here
<agentzh> yeah, that's fine too.
<agentzh> will do.
<agentzh> we don't really want to maintain these patches in our own branch.
<agentzh> every time we want to sync with the upstream repo, it's painful :)
<agentzh> especially when you like to do big code refactoring from time to time.
<agentzh> it's a nightmare for us ;)
<fche> understood
<agentzh> btw, kerneltoast is preparing a Dl stapio hang bugfix patch these days.
<agentzh> will be ready very soon as well.
<agentzh> we'll definitely need your review.
<fche> nice, you guys are doing good helpful work, thank you
<agentzh> it's tricky.
<agentzh> of course. and we're hiring more kernel developers to do more work :)
<agentzh> we'd like to move faster here.
khaled has quit [Quit: Konversation terminated!]
<agentzh> fche: the flame graph: https://pasteboard.co/Jw8xb0A.png
<agentzh> the middle 3 frames are:
<agentzh> 7a88b0: _raw_spin_lock[0]
<agentzh> 7277: adjustStartLoc[15]
<agentzh> 7277: adjustStartLoc[15]
<agentzh> because there's a chicken-and-egg issue with -d MODULE for the current ko, we don't have symbols for the current ko.
<agentzh> so i also wonder if it's possible to inject dwarf-derived data into the ko after the ko is generated...
<agentzh> right now we have to do symbolization via post-processing and it's not always possible.
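
The post-processing symbolization mentioned here usually amounts to a sorted lookup: find the last symbol whose start address is not above the sampled address. The sketch below shows that lookup only. The `symbolize` name and the symbol table are hypothetical, and it is not the tooling used in this conversation.

```python
import bisect


def symbolize(addr, symbols):
    """Map `addr` to 'name+0xoff'.

    `symbols` is a list of (start_address, name) tuples sorted by address.
    Symbol sizes are not tracked, so an address past the end of the last
    symbol is still attributed to it.
    """
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        # Below the first known symbol: leave the raw address unresolved.
        return hex(addr)
    start, name = symbols[i]
    return f"{name}+{addr - start:#x}"


# Made-up symbol table for illustration.
SYMBOLS = [(0x1000, "foo"), (0x1400, "bar")]
print(symbolize(0x1010, SYMBOLS))
```

A lookup like this only works when the symbol table for the sampled module is still available afterwards, which is one reason post-processing is not always possible.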
<agentzh> ah, the image paste service is slow in the US.
<agentzh> hopefully it's not that slow for you.
<fche> this is for backtracing ?
<agentzh> yes, sprint_backtrace()
<agentzh> just that.
<agentzh> in timer.profile probe
<agentzh> very simple script.
<fche> yeah, this too has come up in the past as ideas in the queue
<fche> so like we can pass STP_RELOCATION messages containing run-time addresses from staprun to the kernel module
<agentzh> we'll verify the effectiveness of our patches by sampling new flame graphs for comparison.
<fche> we could in principle send over giant messages containing .eh_frame* / .symtab* extracts of relevant processes
<agentzh> oh wow
<fche> (assuming we can identify relevancy, and the traffic rate is reasonable)
<agentzh> that's...bloody :)
<fche> well, it's taking the concept of in-situ backtracing etc. to its limits: try to keep the kernel module up with unforeseen userspace
<agentzh> i'm always worried about sending large data over the channels.
<agentzh> yeah, the direction is definitely good.
<agentzh> now every time the userland changes, we have to regenerate the ko.
<agentzh> maybe eventually we just make stap's kernel runtime a reusable ko.
<agentzh> like ebpf.
<agentzh> and everything is sent over to the ko to run.
<agentzh> that would be the ultimate extreme :)
<agentzh> in the meantime, it's definitely useful to have some middle ground.
<agentzh> like sending over the unwinding data.
<agentzh> that would already be useful for bootstrapping the ko itself.
<agentzh> i was thinking about a less dynamic approach, like attaching a special elf section to the ko, etc...
<agentzh> so it's still static, just after the ko is generated.
<agentzh> maybe this is easier...
<agentzh> not sure.