fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
derek0883 has quit [Remote host closed the connection]
derek088_ has joined #systemtap
orivej_ has joined #systemtap
orivej has quit [Ping timeout: 265 seconds]
beauty2 has quit [Ping timeout: 256 seconds]
beauty2 has joined #systemtap
hpt has joined #systemtap
orivej_ has quit [Ping timeout: 260 seconds]
orivej has joined #systemtap
khaled has quit [Quit: Konversation terminated!]
orivej has quit [Ping timeout: 265 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
derek088_ has quit [Ping timeout: 265 seconds]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
khaled has joined #systemtap
mjw has joined #systemtap
hpt has quit [Ping timeout: 265 seconds]
orivej has joined #systemtap
wcohen has quit [Remote host closed the connection]
wcohen has joined #systemtap
derek0883 has joined #systemtap
ema has quit [Quit: reboot]
ema has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 246 seconds]
tromey has joined #systemtap
amerey has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 260 seconds]
derek0883 has joined #systemtap
tonyj has quit [Quit: leaving]
tonyj has joined #systemtap
derek0883 has quit [Ping timeout: 264 seconds]
derek0883 has joined #systemtap
_whitelogger has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
<kerneltoast> fche, we meet again
<kerneltoast> ideas?
<kerneltoast> maybe i nuked too much of the print_flush function wrt those ifdefs
<agentzh> fche: are the bpf developers of stap around here? i just filed a PR for a bug in the bpf runtime of stap: https://sourceware.org/bugzilla/show_bug.cgi?id=27030
<agentzh> fche: i wonder why stap decides to generate bpf directly instead of using a C compiler like clang/llvm. maybe it's just for minimalism? but then we cannot use defs in kernel header files in the bpf backend?
<fche> agentzh, correcto
<fche> well, kernel headers/defs shouldn't be needed beyond dwarf, in theory
<agentzh> that'll be sad since bcc does allow kernel headers in the bpf C programs.
<fche> serhei has been the poor sucker ^W faithful developer of the bpf bits
<fche> yes, well bcc FORCES you to code in C so it must let you use their headers
<agentzh> serhei: howdy!
<agentzh> right
<agentzh> bcc is very strict.
<agentzh> believing in one true way of doing things.
<agentzh> a monolithic thing.
<agentzh> kerneltoast: sorry to hear that.
derek0883 has quit [Remote host closed the connection]
<kerneltoast> agentzh, but you lured fche out
<kerneltoast> now he has to respond to me
<agentzh> ah, right
<agentzh> my bad
<fche> bye
<kerneltoast> so all is well
<kerneltoast> nooooooooo
<fche> hm a systemic bunch of errors like that is a good sign
<fche> check out the .log file for corresponding segments
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
<kerneltoast> is it all just timeouts
<kerneltoast> wait a second
<fche> maybe missing error messages?
<kerneltoast> the testsuite took 2.5x longer to run than usual
<kerneltoast> agentzh, were you loading glass on friday night?
<kerneltoast> i'm gonna re-run this test
<kerneltoast> *re-run the testsuite
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
<agentzh> kerneltoast: loading? nothing unusual.
<kerneltoast> hm maybe the per-cpu log thread was more intense than expected?
<agentzh> i didn't run the fuzz testing that day.
<agentzh> serhei: another PR regarding bpf filed: https://sourceware.org/bugzilla/show_bug.cgi?id=27031
<agentzh> fche serhei: so the bpf backend will solely rely on dwarf and hard-coded offsets for complex data structure handling?
<agentzh> kernel data structures i mean.
<agentzh> fche: we have some customers with their kernels locked down, that's why i've been looking at the bpf solution.
<agentzh> when no ko can be loaded, bpf and ptrace are the only choice...
<agentzh> but it seems like stap's bpf support still has some crucial TODOs to be really useful for our purposes.
<fche> bpf is severely limited at the kernel level, so that also impacts things
<fche> but we are not quite up to the full level of the kernel bpf limitations in all respects
<agentzh> serhei: one more bugzilla PR: https://sourceware.org/bugzilla/show_bug.cgi?id=27032
<agentzh> fche: yeah, at this moment, we only need the following for userland tracing: 1. @var(), 2. user_long_error() and its friends, 3. -c CMD
<agentzh> they should be doable with the current bpf support in the kernel.
<agentzh> oh and 4. vma tracker
<agentzh> serhei: will you help us?
<agentzh> thanks in advance! :)
<agentzh> i've already filed PRs for the first 3 items.
<agentzh> regarding 4), i never got that far to be able to test.
<agentzh> the bpf backend of stap is indeed nice when it works.
<agentzh> much faster than the kernel runtime at least.
<agentzh> i mean the startup hit.
<agentzh> i have to say generating bpf instructions directly is quite brave...
<fche> hey we had some compiler experts on staff at the time who preferred to do it that way
<fche> and yeah not relying on the whole llvm enchilada is lovely
<agentzh> i really like the stapbpf utility, just as i like staprun :)
<agentzh> fche: yeah, of course. it's redhat.
<agentzh> it took me 21min to compile llvm/clang on my box.
<agentzh> with an 8c16t machine.
<agentzh> and then i found my llvm is too new to work with bcc.
<agentzh> alas.
<agentzh> and bcc requires llvm and kernel headers and everything every time.
<agentzh> no separate compilation phase.
<agentzh> the only thing that be precompiled is the python script...
<agentzh> *that can be
<agentzh> just to the py bytecode.
<fche> yeah
<agentzh> the good thing about bcc is that i succeeded in reading virtual memory in the userland process via its bpf_probe_read_user() helper. but that's it.
<agentzh> stapbpf does not get me that far due to the broken user_long_error() tapset and @var() operator.
<agentzh> fche: is serhei in the same time zone as you?
<fche> yup
<agentzh> k
<fche> he doesn't decloak here very frequently :)
derek0883 has joined #systemtap
<agentzh> k
tromey has quit [Quit: ERC (IRC client for Emacs 27.1)]
derek0883 has quit [Ping timeout: 272 seconds]
derek0883 has joined #systemtap
lindi- has quit [*.net *.split]
<serhei> agentzh, spotted those bpf PRs in my email
* serhei remembers something vaguely with user_string() and why I procrastinated with other user_ helpers
<serhei> ah it was the bpf_probe_read_{kernel,user}() helpers and how they were the same function in pre-recent Linux kernels
<serhei> agentzh, in any case, the first three wishlist items look doable
<serhei> less sure about vma tracker
<serhei> that may depend on stap runtime special sauce
* serhei almost considers the usefulness of writing our own bpf interpreter and linking it with the stap runtime
<agentzh> serhei: maybe we can implement the vma tracker in bpf.
<agentzh> the major motivation of using ebpf for us is to avoid loading ko modules.
<agentzh> if you link with the stap runtime, then it'll defeat our motivation ;)
<agentzh> so please don't...
<serhei> makes sense
<agentzh> serhei: also, i've noted that kernel 5.3+ allows simple loops like "for (i = 0; i < 50; i++)" in the ebpf verifier. so maybe stap's bpf backend can make use of it to implement the looping control flow structures in the stap language.
<agentzh> at least in the form of bounded loops.
<agentzh> before 5.3, loop unrolling was the only choice.
<serhei> or even implicitly bounded loops, sure
<agentzh> yeah, that was what i mean.
<agentzh> implicit bounded loops that is.
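A minimal sketch of the two loop shapes contrasted above, written as plain user-space C rather than as an actual eBPF program. It assumes nothing about how stap's bpf backend emits loops; `MAX_ITERS`, `sum_bounded`, and `sum_unrolled4` are hypothetical names used only for illustration. The first function caps its trip count with a compile-time constant, which is the kind of simple bounded loop the discussion says the verifier accepts from kernel 5.3 on. The second shows the hand-unrolled form that was the only option before that.

```c
#include <stddef.h>

/* Hypothetical compile-time bound on the number of iterations. */
#define MAX_ITERS 50

/*
 * Sum at most MAX_ITERS entries of vals. The loop condition compares
 * the counter against a constant, so the trip count has a fixed upper
 * bound regardless of n.
 */
long sum_bounded(const long *vals, size_t n)
{
    long total = 0;
    for (int i = 0; i < MAX_ITERS; i++) {
        if ((size_t)i >= n)
            break;              /* stop early when the input is shorter */
        total += vals[i];
    }
    return total;
}

/*
 * Hand-unrolled equivalent for a bound of four entries: no backward
 * jump at all, at the cost of one copy of the body per iteration.
 */
long sum_unrolled4(const long *vals, size_t n)
{
    long total = 0;
    if (n > 0) total += vals[0];
    if (n > 1) total += vals[1];
    if (n > 2) total += vals[2];
    if (n > 3) total += vals[3];
    return total;
}
```

The unrolled form grows linearly with the bound, which matters given the instruction count limit raised later in this conversation.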
<agentzh> serhei: btw, how do you transfer the warnings and prints to stapbpf?
<agentzh> are you using the relay api like staprun?
<serhei> perf_event buffers
<agentzh> ah, okay, just like bcc.
<serhei> it's the only known sane way
<agentzh> bcc's outputs are crazy.
<agentzh> so much boring boilerplate code.
<agentzh> just for doing simple prints.
<agentzh> that makes sense if we can abstract it away.
<agentzh> like in stap.
<agentzh> serhei: i wonder if you guys would consider adding a parallel clang backend to the existing bpf emitter.
<agentzh> i kinda miss the capability of using kernel headers...
<agentzh> like in the kernel runtime of stap or in bcc.
<agentzh> many times the kernel debuginfo packages are missing anyway.
<agentzh> so the dwarf approach usually means finding the same kernel source and rebuilding the kernel ourselves.
<agentzh> which is quite painful.
<fche> hm, stapbpf can still use kernel headers via @cast() etc
<fche> and generate dwarf from them via gcc
<agentzh> oh really? there is such magic?
<agentzh> @cast() is a dwarf thing iirc.
<fche> yes
<fche> what do you want from kernel headers?
<agentzh> then we still need dwarf.
<fche> this dwarf is -generated- on the fly tho
<agentzh> oh that's interesting.
<agentzh> stap calls gcc to compile kernel headers into dwarf?
<agentzh> that sounds crazy.
<agentzh> (in a good way)
<agentzh> if that's the case, then i don't have any problems with that.
<agentzh> eager to try it out.
<agentzh> i'm very worried about the ebpf's 4k instruction count limit.
<agentzh> the 32-level tail-call may be helpful but is also tricky to do.
<serhei> (is it still 4k? I thought they were talking about raising it)
<agentzh> i hope they did!
<agentzh> did they?
<agentzh> i didn't check the git branch.
* serhei was following bpf-next lkml a few months ago but gave up on the info overload
<serhei> let me look it up
<agentzh> serhei: once you get user_long_error() etc. implemented, we have large stap scripts to test it out :)
<agentzh> cool
lindi- has joined #systemtap
<serhei> oh. It's still 4096 :/ they just raised the limit on how long their verifier is willing to chase around loops to 1 million instructions
<serhei> > The only way to know that the program is going to be accepted by the verifier is to try to load it.
<serhei> and my corollary to that is
<serhei> > The only way to know why the program wasn't accepted by the verifier is to read the verifier source code.
khaled has quit [Remote host closed the connection]
khaled has joined #systemtap
<agentzh> serhei: yeah, that's very lame.
<agentzh> pity.
<agentzh> thanks for checking it out.
amerey has quit [Quit: Leaving]
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap