#systemtap on 2021-05-05 — irc logs at freenode.irclog.whitequark.org

2015-11-12 23:18 fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged

00:05 orivej has joined #systemtap

00:17 mjw has quit [Quit: Leaving]

00:27 khaled has quit [Quit: Konversation terminated!]

02:23 orivej has quit [Ping timeout: 252 seconds]

03:00 <irker573> systemtap: wcohen systemtap.git:master * release-4.4-148-gca8656afc / testsuite/systemtap.examples/index.html testsuite/systemtap.examples/index.txt testsuite/systemtap.examples/keyword-index.html testsuite/systemtap.examples/keyword-index.txt testsuite/systemtap.examples/metadatabase.db testsuite/systemtap.examples/security-band-aids/cve-2011-4127.meta testsuite/systemtap.examples/security-band-aids/cve-2011-4127.stp: example

06:03 <agentzh> serhei: i'm thinking a bit more about the abort() function. it seems tricky to do exceptions or longjmp in the ebpf context? i remember stap's kernel runtime uses horrible emulation hacks to do those. very inefficient hacks. not sure we have anything more clever in ebpf...

06:24 orivej has joined #systemtap

06:55 fdalleau_away is now known as fdalleau

07:01 khaled has joined #systemtap

07:13 orivej has quit [Ping timeout: 260 seconds]

08:34 ema has quit [Remote host closed the connection]

09:16 amerey has quit [Ping timeout: 240 seconds]

09:17 amerey has joined #systemtap

11:35 irker573 has quit [Quit: transmission timeout]

11:41 mjw has joined #systemtap

11:46 orivej has joined #systemtap

12:27 orivej has quit [Ping timeout: 252 seconds]

12:57 <serhei> the main thing abort does is immediately exit the probe -- not hard to do when we're building the CFG for the entire probe

12:58 <serhei> the second thing it does is set session_state to STAP_SESSION_STOPPING -- that may be more fiddly to replicate exactly in terms of how soon we stop other probes from running

13:03 <fche> serhei, bpf has try/catch now, that's all abort does in the lkm case, throw an exception right?

13:13 <serhei> in bpf, try/catch is just a fancy term for "find the enclosing catch block and insert a branch to it" fwiw

13:13 <fche> yes

13:14 <fche> same as in lkm

13:14 <serhei> yeah

13:14 <serhei> abort() is the same thing, but exits the probe unconditionally, skipping the catch block

13:15 <serhei> ah, don't even need to fiddle with CFG for that. Just call exit()

13:15 <fche> ah yes using that other context flag

13:31 sscox has quit [Quit: sscox]

13:32 sscox has joined #systemtap

13:38 sscox has quit [Quit: sscox]

13:39 sscox has joined #systemtap

13:55 tromey has joined #systemtap

14:05 irker216 has joined #systemtap

14:05 <irker216> systemtap: wcohen systemtap.git:master * release-4.4-149-g3ca13a712 / tapset/linux/netfilter.stp: Use the skb_frag_size accessor function rather than directly reading field

14:07 orivej has joined #systemtap

14:47 khaled_ has joined #systemtap

14:47 khaled has quit [Ping timeout: 268 seconds]

16:39 mjw has quit [Quit: Leaving]

17:22 <agentzh> fche: wow, ebpf supports try/catch already? that's news to me. would you mind sharing the lkm thread's link? google is not smart enough to find it, it seems. thanks!

17:23 <fche> lkm = linux kernel module, not lkml = mailing list

17:23 <agentzh> serhei: oh, i didn't know stap's bpf backend builds a full CFG. my concern here is it would be tricky if we want to print out a backtrace for exceptions in the future?

17:24 <agentzh> fche: okay...it means the kernel runtime of stap.

17:24 <fche> 'course it will be tricky, everything is tricky with bpf :)

17:25 <agentzh> serhei mentions "bpf has try/catch now". is that a builtin support or just in stap's bpf variant?

17:25 <fche> try/catch is a language level thing

17:25 <fche> not a bpf bytecode level thing

17:25 <fche> so .... 'no' ?

17:25 <agentzh> any pointer to that feature?

17:25 <agentzh> that's exciting!

17:25 <fche> well, .... stap script language try/catch is in multiple docs

17:25 <serhei> iirc try { error("woops") } catch {}

17:25 <fche> the new bit is that more of that works on the bpf backend than used to.

17:25 <agentzh> oh, you mean the stap language...

17:26 <serhei> yeah

17:26 <agentzh> i thought you mean the ebpf C language feature.

17:26 <fche> there is no bpf c language :-) it's just c

17:26 <agentzh> yeah, bcc's dialect does not really count.

17:27 <agentzh> so it seems like bpf native would still need the stap kernel module's way of emulating exceptions...like checking a flag after every func call, short-circuiting the current func's execution flow, and then propagating up the call chain?

17:28 <fche> sure

17:28 <agentzh> that would be slow though. longjmp would be nice for bpf.
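
The flag-checking scheme agentzh describes can be sketched in plain C. This is only an illustration of the general technique, not stap's actual runtime code: `struct probe_ctx`, `raise_error`, `inner`, `middle` and `run_probe` are hypothetical names invented for the example.

```c
#include <stddef.h>

/* Hypothetical per-probe context; stap's real context struct differs. */
struct probe_ctx {
    const char *last_error;   /* non-NULL once an error has been raised */
};

/* "Throw": only records the error. Callers must notice it and unwind. */
static void raise_error(struct probe_ctx *c, const char *msg) {
    c->last_error = msg;
}

static int inner(struct probe_ctx *c, int x) {
    if (x < 0) {
        raise_error(c, "negative input");
        return 0;             /* return value is meaningless once the flag is set */
    }
    return x * 2;
}

static int middle(struct probe_ctx *c, int x) {
    int v = inner(c, x);
    if (c->last_error)        /* the check repeated after every call */
        return 0;             /* cut this function short and propagate upward */
    return v + 1;
}

/* Top level plays the role of the catch block: -1 means an error was caught. */
static int run_probe(struct probe_ctx *c, int x) {
    int v = middle(c, x);
    if (c->last_error)
        return -1;
    return v;
}
```

Every call site pays for one extra load and branch, which is the cost agentzh is worried about; a real longjmp would skip the intermediate frames entirely.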

17:28 <serhei> bpf native doesn't really have a calling chain

17:28 <serhei> just a blob of bytecode that calls out to a restricted list of kernel helper functions

17:28 <agentzh> i mean the bpf functions, not bpf helpers.

17:28 <agentzh> bpf already supports functions.

17:28 <agentzh> user-defined functions in the same bpf file.

17:29 <fche> yes, we have to manage the control flow through bpf (forward) jump insns

17:29 <fche> like a normal compiler

17:29 <fche> speed schpeed, if you wanted fast you wouldn't be using bpf :)

17:29 <agentzh> i see. so this is the secret weapon of stap to tackle the bpf verifier?

17:29 <agentzh> bpf verifier makes bpf programming a nightmare.

17:30 <fche> don't think this affects that at all

17:30 <agentzh> at least using the std tool chain.

17:30 <fche> it is still a nightmare

17:30 <agentzh> heh

17:30 <serhei> ah, found a patch thread for bpf's function feature (call bytecode pointing to other bpf code). We don't really use it

17:30 <agentzh> yep, we're already using it through the clang tool chain.

17:31 <agentzh> we're limited by 5-arg limitation in the func calls though.

17:31 <agentzh> thinking about workarounds like pushing extra args to bpf maps or something.

17:31 <serhei> hmm, another thing to evaluate

17:31 <agentzh> that won't be fast either.

17:31 <agentzh> but at least we can have backtraces (emulated ones at least).
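
One common way around a cap on the number of call arguments is to fold the extras into a single struct and pass a pointer to it. The sketch below is ordinary C, not BPF code, and it uses a struct on the caller's stack rather than the bpf map agentzh mentions; `struct sum7_args`, `sum7` and `caller` are hypothetical names. Whether the verifier accepts the equivalent pattern in a given BPF program is not something the log covers.

```c
#include <stdint.h>

/* Hypothetical argument bundle: seven logical arguments behind one pointer. */
struct sum7_args {
    int64_t a, b, c, d, e, f, g;
};

/* Takes a single argument, well under a five-argument cap. */
static int64_t sum7(const struct sum7_args *args) {
    return args->a + args->b + args->c + args->d
         + args->e + args->f + args->g;
}

static int64_t caller(void) {
    /* The bundle lives on the caller's stack for the duration of the call. */
    struct sum7_args args = { 1, 2, 3, 4, 5, 6, 7 };
    return sum7(&args);
}
```

The trade-off is the same one raised in the chat: each packed argument becomes a store in the caller and a load in the callee instead of a register move.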

17:32 <serhei> wonder if exiting the bpf 'function' by branching would work as a longjmp

17:32 <agentzh> like a direct goto?

17:32 <agentzh> across function boundaries?

17:32 <serhei> make different copies of the function if they appear inside different catch blocks

17:32 <serhei> it would just save some duplication for programs that call tapset functions repeatedly

17:33 <agentzh> hmm, i'm a bit lost here. will you elaborate?

17:33 <agentzh> what do you mean by exiting by branching?

17:33 <serhei> currently all stap function calls are inlined

17:33 <agentzh> yep

17:33 <serhei> (for the bpf backend)

17:33 <agentzh> aye

17:34 <serhei> but that means that if we call a complicated tapset function 15 times

17:34 <serhei> we have 15 copies of the code

17:34 <agentzh> right, that's exactly what inlines do.

17:34 <serhei> and probably hit the insn limit

17:34 <agentzh> true

17:34 <serhei> that makes handling try/catch very easy since we know the location of the catch block statically

17:34 <agentzh> indeed

17:35 <serhei> if we make one copy of the tapset function and call it with the bpf call opcode, we don't know which catch block to branch to in case of an error

17:36 <agentzh> makes sense

17:37 <serhei> if we did know, we could probably fudge a longjmp by just doing a goto to the catch block

17:37 <agentzh> i see.

17:37 <serhei> at least in the case where the catch block is at the top level and isn't followed by a return
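
The point serhei makes about inlining can be illustrated in plain C: when the "tapset function" body is expanded at each call site, the error path is just a forward branch to a catch label that is known at compile time. This is a sketch of the idea only, not what the stap bpf backend emits; `CHECKED_DIV` and `probe_body` are hypothetical names, and a macro stands in for inlining.

```c
#include <stddef.h>

/* Hypothetical tapset body, written as a macro to stand in for inlining.
   On failure it branches forward to a label supplied by the call site. */
#define CHECKED_DIV(result, num, den, catch_label) \
    do {                                           \
        if ((den) == 0)                            \
            goto catch_label;                      \
        (result) = (num) / (den);                  \
    } while (0)

/* Returns how many divisions completed; *caught is 1 if the catch block ran. */
static int probe_body(int a, int b, int c, int *caught) {
    int done = 0, r = 0;
    *caught = 0;

    /* try { */
    CHECKED_DIV(r, 100, a, on_error); done++;
    CHECKED_DIV(r, 100, b, on_error); done++;
    CHECKED_DIV(r, 100, c, on_error); done++;
    /* } */
    (void)r;
    return done;

on_error:  /* catch { } */
    *caught = 1;
    return done;
}
```

Each expansion carries its own copy of the body and its own branch target, which is why the approach is simple but grows the instruction count. A single shared function would have no way to name `on_error`, since the right label depends on the call site.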

17:37 <agentzh> so bpf functions are still involved.

17:37 <agentzh> *not involved

17:38 <serhei> in any case

17:38 <serhei> the scheme I just came up with in my head is rather ugly

17:38 <agentzh> but it should work.

17:38 <agentzh> another concern i have with the all-inlining scheme is that the bpf verifier is very bad at verifying large functions.

17:39 <agentzh> we had to introduce functions to help the verifier.

17:39 <serhei> the all-inlining scheme is what we have right now

17:39 <agentzh> otherwise it would frequently give up saying the control flow is too complex...

17:39 <agentzh> every time i see a verifier error, i'll start throwing things on my desk...

17:40 <serhei> I don't think switching to functions would help much because handling try/catch will require us to be very judicious about what to inline and what to put into a function

17:40 <serhei> but I've never seen a 'control flow too complex' error with stap generated code

17:40 <agentzh> yeah, throw/catch would be much harder if we put bpf functions into the mix.

17:40 <agentzh> i'm just thinking along another line.

17:40 <serhei> I usually see an 'out of stack space' error

17:41 <agentzh> yeah, stack space is also common.

17:41 <agentzh> for us.

17:42 <agentzh> gotta run for a therapy. brb.

17:42 <serhei> thanks for the brainstorming

18:40 mjw has joined #systemtap

19:25 irker216 has quit [Quit: transmission timeout]

19:38 irker697 has joined #systemtap

19:38 <irker697> systemtap: amerey systemtap.git:master * release-4.4-150-g439fb4cc4 / dwflpp.h loc2stap.h session.h: Make declarations consistent with corresponding definitions

19:39 <agentzh> serhei: another quick question: how does stap handle string creations like "a" . "b" please? it's not obvious by reading the disassembly code of the .bo files emitted. i'm trying to understand the string length limits as mentioned by fche previously.

19:39 <agentzh> if we use a bpf map or a bpf ringbuf for string allocations, we no longer suffer from the current 64-byte or 128 byte limits?

19:39 <agentzh> is stapbpf allocating strings on the kernel C stack right now?

19:40 <fche> kernel c stack is not accessible to bpf

19:40 <fche> bpf bytecodes must alloc from bpf stack

19:40 <agentzh> sorry, i mean bpf stack.

19:41 <agentzh> the bpf stack seems to only allow static allocations like 'const char buf[] = "xxxx"'.

19:41 <agentzh> not much here.

19:44 <agentzh> no fancy toys like alloca().

19:44 <agentzh> afaik

19:45 <fche> correct

20:11 <agentzh> so stap bpf is allocating strings on bpf stack's static buffers?

20:18 tromey has quit [Quit: ERC (IRC client for Emacs 27.1)]

20:26 <agentzh> fche: is that a yes?

20:53 <fche> stack != static but yes

20:53 <fche> the stack is AIUI the only place where one can allocate things in bpf land

20:53 <fche> (other than the kernel-side map/etc. data structures)
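
A fixed-size stack buffer is enough to see where a string length limit comes from. The sketch below concatenates two strings into a buffer whose size is fixed at compile time and truncates anything that does not fit. It is plain C for illustration, not stapbpf's implementation; `STR_MAX` and `concat_fixed` are hypothetical names, and the value 64 is taken from the 64-byte figure agentzh mentions.

```c
#include <string.h>

#define STR_MAX 64   /* assumed cap, sized at compile time like a stack slot */

/* Concatenate a and b into out, truncating silently at STR_MAX - 1 chars.
   Returns the length of the resulting string. */
static size_t concat_fixed(char out[STR_MAX], const char *a, const char *b) {
    size_t n = 0;
    while (*a && n < STR_MAX - 1)
        out[n++] = *a++;
    while (*b && n < STR_MAX - 1)
        out[n++] = *b++;
    out[n] = '\0';
    return n;
}
```

With no alloca-style dynamic allocation available, the result buffer cannot grow with its inputs, so a longer limit means a larger reservation for every string temporary.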

21:07 <agentzh> gotcha

21:07 <agentzh> thanks for the info

21:07 <agentzh> we're trying to achieve most of the stap kernel runtime capabilities in the ebpf route. it's a very bumpy road :)

21:08 <agentzh> as i said, the ebpf world is still in stone age as compared to the stap flagship runtime.

21:08 <agentzh> but i still have faith in ebpf just like serhei, since it shows a lot of promise.

21:08 <agentzh> the stock kernel ebpf is lame though, with all due respect :)

21:09 <agentzh> *stock kernel ebpf implementation

21:09 <fche> you might not believe it, but in another thread, .... ummm.... kernel ebpf is held up as the magical standard-bearer of capability

21:09 <agentzh> lol, i know there's a lot of hype around ebpf nowadays.

21:10 <agentzh> i wonder if they ever go deep enough.

21:10 <agentzh> or just happy enough with trivial things.

21:10 <fche> trivial things are still useful

21:10 <fche> but still.

21:12 <agentzh> true.

21:12 <agentzh> but our use cases definitely go way beyond trivial.

21:12 <agentzh> even beyond what stap's flagship runtime can do.

21:12 <fche> yup

21:12 <agentzh> :)

21:12 <fche> IMPOSSIBLE

21:12 <fche> inconceivable

21:12 <agentzh> :D

21:12 <fche> un not impossible

21:13 <agentzh> well, we already got rid of the ebpf verifier mostly in our own kernel.

21:13 <agentzh> we only keep the necessary info collection work in the verifier. the info is needed by jit compiler and interpreter.

21:13 <agentzh> so we cannot kill the whole verifier.

21:14 <agentzh> and we're adopting the same actions counter mechanism for safety.

21:14 <agentzh> as the stap kernel runtime.
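
An actions counter bounds execution at run time instead of proving termination up front. The following is a minimal sketch of that idea in plain C, not code from stap or from agentzh's kernel; `struct action_ctx`, `charge`, `bounded_loop` and the budget of 1000 are all assumptions made for the example.

```c
#define MAX_ACTIONS 1000   /* assumed per-probe budget */

struct action_ctx {
    int actions_left;      /* remaining budget */
    int aborted;           /* set to 1 once the budget runs out */
};

/* Charge one unit of work; returns 0 when the budget is exhausted. */
static int charge(struct action_ctx *c) {
    if (c->actions_left <= 0) {
        c->aborted = 1;
        return 0;
    }
    c->actions_left--;
    return 1;
}

/* A loop that would otherwise run `iterations` times, cut off by the budget. */
static long bounded_loop(struct action_ctx *c, long iterations) {
    long done = 0;
    while (done < iterations) {
        if (!charge(c))
            break;         /* stop the probe instead of running unbounded */
        done++;
    }
    return done;
}
```

The check costs a decrement and a branch per charged statement, but it lets arbitrary loops through without any static proof that they end.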

21:15 <agentzh> we shall see how far we can go down this route :)

21:15 <agentzh> the stock ebpf verifier is to be damned...

21:15 <agentzh> that's all the source of troubles and pains.

21:15 <fche> ehhehhehehe

21:16 <agentzh> i can finally stop throwing things from my desk :D

21:16 <fche> brings back memories from, what, 2004, when stap had to choose a direction based on assumptions of what the kernel folks could live with

21:16 <fche> little virtual machines were on the table back then too, but looked hopeless

21:16 <agentzh> dtrace uses little in-kernel VM.

21:17 <fche> yes.

21:17 <agentzh> i believe in the in-kernel VM.

21:17 <agentzh> i don't like dynamic .ko loading and unloading...

21:17 <agentzh> especially when our customers require ko signing...

21:17 <fche> plenty of reasons not to Prefer it, but you might Need it anyway

21:17 <fche> you know stap has some signature capability, right?

21:17 <agentzh> why hopeless?

21:18 <fche> hopeless in 2004ish

21:18 <agentzh> through stap server?

21:18 <agentzh> that's still dynamic signing, no?

21:18 <fche> yes

21:18 <agentzh> our customers don't allow dynamic signing...

21:18 <fche> dynamic signing, yes, but with a key loaded into the mok/efi

21:18 <agentzh> they require manual auditing and signing process.

21:19 <agentzh> manual code auditing

21:19 <agentzh> ah

21:19 <fche> but they don't mind a frankenkernel with disabled bpf verifier? neato :)

21:19 <agentzh> no, they don't and they don't care.

21:19 <agentzh> just process.

21:19 <agentzh> you know.

21:19 <agentzh> *grin*

21:19 <fche> many such cases

21:19 <agentzh> most of our customers are not tech gurus like you guys.

21:20 <agentzh> they just want transparency but never really understand them.

21:20 <agentzh> it's the reality we live in.

21:20 <fche> yeah, we all make/design compromises

21:21 <agentzh> back to in-kernel VMs. do they look hopeful *nowadays*?

21:21 <fche> well dunno, there is one now, so obviously

21:21 <agentzh> in your opinion?

21:21 <agentzh> okay, but they hook it up with a hopeless verifier.

21:21 <fche> but you see what's done with the verifier

21:22 <agentzh> right

21:22 fdalleau is now known as fdalleau_away

21:23 <agentzh> well, we will still embrace stap's kernel runtime for the foreseeable future. we're just trying something new for use cases the existing stap runtime does not handle very well.

21:24 <agentzh> like android, like kernel module (static) signing.

22:37 irker697 has quit [Quit: transmission timeout]

23:14 orivej has quit [Ping timeout: 240 seconds]

23:47 <serhei> wait, I have faith in ebpf?

23:47 <serhei> I just do things with it :/ it goes relatively smoothly because, on the contrary, I expect nothing from it

23:47 <serhei> so I can only be surprised in a positive direction