fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
orivej has joined #systemtap
mjw has quit [Quit: Leaving]
khaled has quit [Quit: Konversation terminated!]
orivej has quit [Ping timeout: 252 seconds]
<irker573> systemtap: wcohen systemtap.git:master * release-4.4-148-gca8656afc / testsuite/systemtap.examples/index.html testsuite/systemtap.examples/index.txt testsuite/systemtap.examples/keyword-index.html testsuite/systemtap.examples/keyword-index.txt testsuite/systemtap.examples/metadatabase.db testsuite/systemtap.examples/security-band-aids/cve-2011-4127.meta testsuite/systemtap.examples/security-band-aids/cve-2011-4127.stp: example
<agentzh> serhei: i'm thinking a bit more about the abort() function. it seems tricky to do exceptions or longjmp in the ebpf context? i remember stap's kernel runtime uses horrible emulation hacks to do those. very inefficient hacks. not sure we have anything more clever in ebpf...
orivej has joined #systemtap
fdalleau_away is now known as fdalleau
khaled has joined #systemtap
orivej has quit [Ping timeout: 260 seconds]
ema has quit [Remote host closed the connection]
amerey has quit [Ping timeout: 240 seconds]
amerey has joined #systemtap
irker573 has quit [Quit: transmission timeout]
mjw has joined #systemtap
orivej has joined #systemtap
orivej has quit [Ping timeout: 252 seconds]
<serhei> the main thing abort does is immediately exit the probe -- not hard to do when we're building the CFG for the entire probe
<serhei> the second thing it does is set session_state to STAP_SESSION_STOPPING -- that may be more fiddly to replicate exactly in terms of how soon we stop other probes from running
<fche> serhei, bpf has try/catch now, that's all abort does in the lkm case, throw an exception right?
<serhei> in bpf, try/catch is just a fancy term for "find the enclosing catch block and insert a branch to it" fwiw
<fche> yes
<fche> same as in lkm
<serhei> yeah
<serhei> abort() is the same thing, but exits the probe unconditionally, skipping the catch block
<serhei> ah, don't even need to fiddle with CFG for that. Just call exit()
<fche> ah yes using that other context flag
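A rough C sketch of the control flow described above, offered as an illustration rather than as the code stap actually emits. It models one probe body containing a try/catch: error() becomes a branch to the enclosing catch block, while abort() marks the session as stopping and leaves the probe without running the catch block. The names run_probe, probe_result and the two enum values are hypothetical stand-ins, not the stap runtime's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the session states mentioned in the chat. */
enum session_state { SESSION_RUNNING, SESSION_STOPPING };

struct probe_result {
    bool catch_ran;            /* the catch block executed */
    bool tail_ran;             /* statements after the try/catch executed */
    enum session_state state;  /* session state when the probe returned */
};

/* Models: probe { try { <error or abort or nothing> } catch { } <tail> } */
struct probe_result run_probe(bool raise_error, bool call_abort)
{
    struct probe_result r = { false, false, SESSION_RUNNING };

    /* try body */
    if (call_abort) {
        r.state = SESSION_STOPPING;  /* ask the session to wind down */
        goto probe_exit;             /* skip the catch block and the tail */
    }
    if (raise_error)
        goto catch_block;            /* error(): branch to the enclosing catch */
    goto after_catch;                /* normal completion skips the catch */

catch_block:
    r.catch_ran = true;

after_catch:
    r.tail_ran = true;

probe_exit:
    return r;
}
```

Calling run_probe(true, false) runs the catch block and then the tail; run_probe(false, true) runs neither and reports SESSION_STOPPING.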
sscox has quit [Quit: sscox]
sscox has joined #systemtap
sscox has quit [Quit: sscox]
sscox has joined #systemtap
tromey has joined #systemtap
irker216 has joined #systemtap
<irker216> systemtap: wcohen systemtap.git:master * release-4.4-149-g3ca13a712 / tapset/linux/netfilter.stp: Use the skb_frag_size accessor function rather than directly reading field
orivej has joined #systemtap
khaled_ has joined #systemtap
khaled has quit [Ping timeout: 268 seconds]
mjw has quit [Quit: Leaving]
<agentzh> fche: wow, ebpf supports try/catch already? that's news to me. would you mind sharing the lkm thread's link? google is not smart enough to find it, it seems. thanks!
<fche> lkm = linux kernel module, not lkml = mailing list
<agentzh> serhei: oh, i didn't know stap's bpf backend builds a full CFG. my concern here is it would be tricky if we want to print out a backtrace for exceptions in the future?
<agentzh> fche: okay...it means the kernel runtime of stap.
<fche> 'course it will be tricky, everything is tricky with bpf :)
<agentzh> fche mentions "bpf has try/catch now". is that a builtin support or just in stap's bpf variant?
<fche> try/catch is a language level thing
<fche> not a bpf bytecode level thing
<fche> so .... 'no' ?
<agentzh> any pointer to that feature?
<agentzh> that's exciting!
<fche> well, .... stap script language try/catch is in multiple docs
<serhei> iirc try { error("woops") } catch {}
<fche> the new bit is that more of that works on the bpf backend than used to.
<agentzh> oh, you mean the stap language...
<serhei> yeah
<agentzh> i thought you mean the ebpf C language feature.
<fche> there is no bpf c language :-) it's just c
<agentzh> yeah, bcc's dialect does not really count.
<agentzh> so it seems like bpf native would still need stap's kernel module's way to emulate exceptions...like checking a flag after every func call and shortcut the current func's execution flow and then propagate upwards the calling chain?
<fche> sure
<agentzh> that would be slow though. longjmp would be nice for bpf.
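A minimal C sketch of the flag-checking style of exception emulation agentzh describes here: every call site tests an error flag and returns early, so the error propagates up the call chain without longjmp. The struct and function names (err_ctx, inner, middle, outer) are made up for illustration and are not taken from the stap runtime.

```c
#include <stddef.h>

/* Hypothetical per-probe context; a non-NULL last_error means "exception pending". */
struct err_ctx {
    const char *last_error;
};

/* Innermost function: raises an "exception" by setting the flag. */
static int inner(struct err_ctx *c, int x)
{
    if (x < 0) {
        c->last_error = "negative input";
        return 0;
    }
    return x * 2;
}

/* Each caller checks the flag right after the call and unwinds one level. */
static int middle(struct err_ctx *c, int x)
{
    int v = inner(c, x);
    if (c->last_error)
        return 0;        /* propagate upwards, skipping the rest of this function */
    return v + 1;
}

/* Top level: plays the role of the catch block. */
int outer(struct err_ctx *c, int x)
{
    int v = middle(c, x);
    if (c->last_error)
        return -1;       /* error handled here */
    return v;
}
```

The cost agentzh worries about is visible in the shape of the code: one extra test and branch after every call, at every level.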
<serhei> bpf native doesn't really have a calling chain
<serhei> just a blob of bytecode that calls out to a restricted list of kernel helper functions
<agentzh> i mean the bpf functions, not bpf helpers.
<agentzh> bpf already supports functions.
<agentzh> user-defined functions in the same bpf file.
<fche> yes, we have to manage the control flow through bpf (forward) jump insns
<fche> like a normal compiler
<fche> speed schpeed, if you wanted fast you wouldn't be using bpf :)
<agentzh> i see. so this is the secret weapon of stap to tackle the bpf verifier?
<agentzh> bpf verifier makes bpf programming a nightmare.
<fche> don't think this affects that at all
<agentzh> at least using the std tool chain.
<fche> it is still a nightmare
<agentzh> heh
<serhei> ah, found a patch thread for bpf's function feature (call bytecode pointing to other bpf code). We don't really use it
<agentzh> yep, we're already using it through the clang tool chain.
<agentzh> we're limited by 5-arg limitation in the func calls though.
<agentzh> thinking about workarounds like pushing extra args to bpf maps or something.
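One way the workaround could look, sketched in plain C rather than real BPF code: the callee keeps five register arguments and the caller spills the rest into a scratch slot first. The static struct below only stands in for a map value (for example a one-entry per-CPU map); extra_args, scratch, sum7 and sum7_callee are all hypothetical names.

```c
/* Hypothetical spill area; in BPF this would be a map value, not a C global. */
struct extra_args {
    long a6;
    long a7;
};

static struct extra_args scratch;

/* Callee stays within the five-argument limit and reads the rest from the slot. */
static long sum7_callee(long a1, long a2, long a3, long a4, long a5)
{
    return a1 + a2 + a3 + a4 + a5 + scratch.a6 + scratch.a7;
}

/* Caller-side wrapper: spill arguments six and seven, then make the call. */
long sum7(long a1, long a2, long a3, long a4, long a5, long a6, long a7)
{
    scratch.a6 = a6;
    scratch.a7 = a7;
    return sum7_callee(a1, a2, a3, a4, a5);
}
```

In a real BPF program the spill and reload would each be a map lookup, which is the extra cost referred to in the next messages.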
<serhei> hmm, another thing to evaluate
<agentzh> that won't be fast either.
<agentzh> but at least we can have backtraces (emulated ones at least).
<serhei> wonder if exiting the bpf 'function' by branching would work as a longjmp
<agentzh> like a direct goto?
<agentzh> across function boundaries?
<serhei> make different copies of the function if they appear inside different catch blocks
<serhei> it would just save some duplication for programs that call tapset functions repeatedly
<agentzh> hmm, i'm a bit lost here. will you elaborate?
<agentzh> what do you mean by exiting by branching?
<serhei> currently all stap function calls are inlined
<agentzh> yep
<serhei> (for the bpf backend)
<agentzh> aye
<serhei> but that means that if we call a complicated tapset function 15 times
<serhei> we have 15 copies of the code
<agentzh> right, that's exactly what inlines do.
<serhei> and probably hit the insn limit
<agentzh> true
<serhei> that makes handling try/catch very easy since we know the location of the catch block statically
<agentzh> indeed
<serhei> if we make one copy of the tapset function and call it with the bpf call opcode, we don't know which catch block to branch to in case of an error
<agentzh> makes sense
<serhei> if we did know, we could probably fudge a longjmp by just doing a goto to the catch block
<agentzh> i see.
<serhei> at least in the case where the catch block is at the top level and isn't followed by a return
<agentzh> so bpf functions are still involved.
<agentzh> *not involved
<serhei> in any case
<serhei> the scheme I just came up with in my head is rather ugly
<agentzh> but it should work.
<agentzh> another concern i have with the all-inlining scheme is that the bpf verifier is very bad at verifying large functions.
<agentzh> we had to introduce functions to help the verifier.
<serhei> the all-inlining scheme is what we have right now
<agentzh> otherwise it would frequently give up saying the control flow is too complex...
<agentzh> every time i see a verifier error, i'll start throwing things on my desk...
<serhei> I don't think switching to functions would help much because handling try/catch will require us to be very judicious about what to inline and what to put into a function
<serhei> but I've never seen a 'control flow too complex' error with stap generated code
<agentzh> yeah, throw/catch would be much harder if we put bpf functions into the mix.
<agentzh> i'm just thinking along another line.
<serhei> I usually see an 'out of stack space' error
<agentzh> yeah, stack space is also common.
<agentzh> for us.
<agentzh> gotta run for a therapy. brb.
<serhei> thanks for the brainstorming
mjw has joined #systemtap
irker216 has quit [Quit: transmission timeout]
irker697 has joined #systemtap
<irker697> systemtap: amerey systemtap.git:master * release-4.4-150-g439fb4cc4 / dwflpp.h loc2stap.h session.h: Make declarations consistent with corresponding definitions
<agentzh> serhei: another quick question: how does stap handle string creations like "a" . "b" please? it's not obvious by reading the disassembly code of the .bo files emitted. i'm trying to understand the string length limits as mentioned by fche previously.
<agentzh> if we use a bpf map or a bpf ringbuf for string allocations, we no longer suffer from the current 64-byte or 128 byte limits?
<agentzh> is stapbpf allocating strings on the kernel C stack right now?
<fche> kernel c stack is not accessible to bpf
<fche> bpf bytecodes must alloc from bpf stack
<agentzh> sorry, i mean bpf stack.
<agentzh> the bpf stack seems to only allow static allocations like 'const char buf[] = "xxxx"'.
<agentzh> not much here.
<agentzh> no fancy toys like alloca().
<agentzh> afaik
<fche> correct
<agentzh> so stap bpf is allocating strings on bpf stack's static buffers?
tromey has quit [Quit: ERC (IRC client for Emacs 27.1)]
<agentzh> fche: is that a yes?
<fche> stack != static but yes
<fche> the stack is AIUI the only place where one can allocate things in bpf land
<fche> (other than the kernel-side map/etc. data structures)
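To make the stack-only constraint concrete, here is a small C sketch of string concatenation into a fixed-size buffer with truncation, which is roughly what a "a" . "b" expression has to become when there is no alloca() or heap. The 64-byte size is taken from the limit agentzh asks about and is only an assumption; STR_MAX and str_concat are illustrative names, not stapbpf internals.

```c
#include <stddef.h>

#define STR_MAX 64  /* assumed buffer size, including the terminating NUL */

/*
 * Concatenate a and b into the caller-provided fixed-size buffer.
 * Anything that does not fit is silently dropped. Returns the result length.
 */
size_t str_concat(char out[STR_MAX], const char *a, const char *b)
{
    size_t n = 0;

    while (*a && n < STR_MAX - 1)
        out[n++] = *a++;
    while (*b && n < STR_MAX - 1)
        out[n++] = *b++;
    out[n] = '\0';
    return n;
}
```

The caller declares the destination as an ordinary local array such as char buf[STR_MAX], so every live string costs STR_MAX bytes of the small BPF stack regardless of its actual length.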
<agentzh> gotcha
<agentzh> thanks for the info
<agentzh> we're trying to achieve most of the stap kernel runtime capabilities in the ebpf route. it's a very bumpy road :)
<agentzh> as i said, the ebpf world is still in stone age as compared to the stap flagship runtime.
<agentzh> but i still have faith in ebpf just like serhei. since it shows a lot of promise.
<agentzh> the stock kernel ebpf is lame though, with all due respect :)
<agentzh> *stock kernel ebpf implementation
<fche> you might not believe it, but in another thread, .... ummm.... kernel ebpf is held up as the magical standard-bearer of capability
<agentzh> lol, i know there's a lot of hype around ebpf nowadays.
<agentzh> i wonder if they ever go deep enough.
<agentzh> or just happy enough with trivial things.
<fche> trivial things are still useful
<fche> but still.
<agentzh> true.
<agentzh> but our use cases definitely go way beyond trivial.
<agentzh> even beyond what stap's flagship runtime can do.
<fche> yup
<agentzh> :)
<fche> IMPOSSIBLE
<fche> inconceivable
<agentzh> :D
<fche> un not impossible
<agentzh> well, we already got rid of the ebpf verifier mostly in our own kernel.
<agentzh> we only keep the necessary info collection work in the verifier. the info is needed by jit compiler and interpreter.
<agentzh> so we cannot kill the whole verifier.
<agentzh> and we're adopting the same actions counter mechanism for safety.
<agentzh> as the stap kernel runtime.
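A minimal C sketch of an actions-counter safety check in the spirit of what is described here: each executed statement is charged against a budget, and the handler bails out with an error once the budget runs out, so no static proof of termination is needed. The names budget_ctx, actions_left and bounded_sum are hypothetical, and the budget values are arbitrary.

```c
#include <stddef.h>

/* Hypothetical per-probe context carrying the remaining statement budget. */
struct budget_ctx {
    int actions_left;
    const char *last_error;
};

/* Sums 0..n-1, charging one action per loop iteration. */
long bounded_sum(struct budget_ctx *c, long n)
{
    long total = 0;

    for (long i = 0; i < n; i++) {
        if (--c->actions_left < 0) {
            c->last_error = "action budget exceeded";
            break;                    /* stop the handler instead of looping on */
        }
        total += i;
    }
    return total;
}
```

With a budget of 5, a request to sum a million values stops after five iterations and leaves last_error set, which is the run-time counterpart of the verifier's static loop checks.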
<agentzh> we shall see how far we can go down this route :)
<agentzh> the stock ebpf verifier is to be damned...
<agentzh> that's all the source of troubles and pains.
<fche> ehhehhehehe
<agentzh> i can finally stop throwing things from my desk :D
<fche> brings back memories from, what, 2004, when stap had to choose a direction based on assumptions of what the kernel folks could live with
<fche> little virtual machines were on the table back then too, but looked hopeless
<agentzh> dtrace uses little in-kernel VM.
<fche> yes.
<agentzh> i believe in the in-kernel VM.
<agentzh> i don't like dynamic .ko loading and unloading...
<agentzh> especially when our customers require ko signing...
<fche> plenty of reasons not to Prefer it, but you might Need it anyway
<fche> you know stap has some signature capability, right?
<agentzh> why hopeless?
<fche> hopeless in 2004ish
<agentzh> through stap server?
<agentzh> that's still dynamic signing, no?
<fche> yes
<agentzh> our customers don't allow dynamic signing...
<fche> dynamic signing, yes, but with a key loaded into the mok/efi
<agentzh> they require manual auditing and signing process.
<agentzh> manual code auditing
<agentzh> ah
<fche> but they don't mind a frankenkernel with disabled bpf verifier? neato :)
<agentzh> no, they don't and they don't care.
<agentzh> just process.
<agentzh> you know.
<agentzh> *grin*
<fche> many such cases
<agentzh> most of our customers are not tech gurus like you guys.
<agentzh> they just want transparency but never really understand it.
<agentzh> it's the reality we live in.
<fche> yeah, we all make/design compromises
<agentzh> back to in-kernel VMs. do they look hopeful *nowadays*?
<fche> well dunno, there is one now, so obviously
<agentzh> in your opinion?
<agentzh> okay, but they hook it up with a hopeless verifier.
<fche> but you see what's done with the verifier
<agentzh> right
fdalleau is now known as fdalleau_away
<agentzh> well, we will still embrace stap's kernel runtime for the foreseeable future. we're just trying something new for use cases the existing stap runtime does not handle very well.
<agentzh> like android, like kernel module (static) signing.
irker697 has quit [Quit: transmission timeout]
orivej has quit [Ping timeout: 240 seconds]
<serhei> wait, I have faith in ebpf?
<serhei> I just do things with it :/ it goes relatively smoothly because, on the contrary, I expect nothing from it
<serhei> so I can only be surprised in a positive direction