fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
orivej has joined #systemtap
derek0883 has quit [Remote host closed the connection]
hpt has joined #systemtap
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has quit [Ping timeout: 260 seconds]
fdalleau_away has quit [Quit: Coyote finally caught me]
jistone has quit [Ping timeout: 246 seconds]
derek0883 has quit [Remote host closed the connection]
orivej has joined #systemtap
derek0883 has joined #systemtap
jistone has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
fdalleau has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
orivej has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
tonyj has quit [Remote host closed the connection]
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
wcohen|lunch is now known as wcohen
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
baotiao has joined #systemtap
baotiao has quit [Client Quit]
mjw has quit [Quit: Leaving]
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
fdalleau is now known as fdalleau_away
derek0883 has quit [Ping timeout: 272 seconds]
derek0883 has joined #systemtap
<agentzh>
fche: i remember stap would build fake ko modules just for extracting dwarf about kernel data types. but i cannot find any way to trigger it. will you kindly give some hints and clues? thanks!
<agentzh>
or temp ko modules that are never to be loaded into the kernel but just for dwarf data.
<agentzh>
this trick is indeed very useful when the kernel debuginfo is missing (and only kernel headers are present).
<agentzh>
we'd like to have a closer look at it since sometimes it won't work when kernel debuginfo is missing.
derek0883 has quit [Remote host closed the connection]
tromey has quit [Quit: ERC (IRC client for Emacs 27.1)]
<fche>
look up typequery .ko's (from @cast() expressions)
<fche>
also something similar for kernel tracepoints: tracequery .ko's
<agentzh>
great. thanks!
derek0883 has joined #systemtap
<agentzh>
i was searching for "dwarf query" and "dwarf_query". definitely the wrong words :)
<fche>
righto
<agentzh>
btw, we've been fighting with ebpf ourselves lately and it was indeed a struggle.
<agentzh>
programming in ebpf is much harder than even kernel programming...
<agentzh>
i can never understand why they make the ebpf verifier such a mess and such a nightmare...
<agentzh>
its memory type inference is also buggy.
<agentzh>
like whether a register holds an address on stack or on the map.
<agentzh>
it could go wrong...
<agentzh>
serhei definitely has a lot of expertise here.
<agentzh>
even converting an integer to a string can be a struggle :)
<fche>
getting a strong sense that it wasn't so much designed overall as grown from a small initial scope, via lumps and bumps and growths for other applications
<fche>
yeah serhei would be the team expert; others @ RH know it much better
<kerneltoast>
ebpf verifier feels like what would happen if someone made a compiler using only regex
<fche>
hahaha good metaphor
<kerneltoast>
it's also got this strange dynamic of being super inefficient yet trying to shave off cycles wherever possible
<kerneltoast>
the verifier not only verifies
<fche>
I'm not sure whether the kernel has or avoids the tradition of premature optimization
<kerneltoast>
it collects a bunch of necessary info for running programs
<kerneltoast>
hah, the (linux) kernel loves optimization
derek0883 has quit [Remote host closed the connection]
<serhei>
oh yeah, theoretically I could port the stapregex dfa widget to generate bpf bytecode, now that they allow looping to ~1million-insns
<serhei>
I always assumed the design decisions of bpf made more sense in its original context of juggling network packets
<serhei>
but I'm not a network expert
<serhei>
and that's not the context stapbpf uses it in
<agentzh>
serhei: i tried porting the ubpf interpreter to ebpf myself.
<agentzh>
it was a nightmare...
<agentzh>
ubpf == userland ebpf
<agentzh>
and it seems that the ebpf verifier does not really know how to count instructions (it's counting in the worst ever way).
<agentzh>
and the resulting ubpf on ebpf thing can only execute 5 ~ 7 ebpf instructions.
<agentzh>
before hitting the 1m insn limit.
<agentzh>
it's crazy.
<agentzh>
and i had to avoid using stack memory since the verifier does not really know how to track the stack memory usage as compared to other things like maps.
<agentzh>
so maybe you're just smarter than me :)
<agentzh>
i almost gave up several times when doing this ubpf-on-ebpf project.
<agentzh>
it finally passed the verifier but can just interpret 5 ~ 7 instructions in the ubpf vm. no joy...
<agentzh>
porting dfa would be much harder since ebpf verifier does not allow general back-edges.
<agentzh>
and jump table is also out of reach (the ebpf jump table is for separate ebpf programs).
<agentzh>
serhei: or do you have special magic that we're not aware of?
<agentzh>
design patterns and clever programming paradigms, i mean.
<agentzh>
i guess, if we store the whole dfa into ebpf maps, then it might be doable.
<agentzh>
but not the goto-style dfa implementation, i believe?
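The table-driven alternative agentzh describes can be sketched in plain userland C. This is only an illustration under stated assumptions: the `struct dfa` array stands in for data that would live in an eBPF map, the pattern `ab*c` and all names (`dfa_init`, `dfa_match`, `DFA_MAX_INPUT`) are hypothetical, and nothing here has been run through the eBPF verifier.

```c
#include <stdint.h>
#include <stddef.h>

#define DFA_NSTATES   4
#define DFA_ACCEPT    2    /* accepting state */
#define DFA_DEAD      3    /* sink state: no match possible */
#define DFA_MAX_INPUT 256  /* fixed upper bound on the scan loop */

/* Transition table. In an eBPF setting this data would be stored in a map
 * rather than in program text (assumption for illustration). */
struct dfa {
    uint8_t next[DFA_NSTATES][256];
};

/* Fill the table for the hypothetical whole-string pattern "ab*c". */
void dfa_init(struct dfa *d)
{
    for (int s = 0; s < DFA_NSTATES; s++)
        for (int c = 0; c < 256; c++)
            d->next[s][c] = DFA_DEAD;
    d->next[0]['a'] = 1;          /* start --a--> state 1 */
    d->next[1]['b'] = 1;          /* state 1 --b--> state 1 */
    d->next[1]['c'] = DFA_ACCEPT; /* state 1 --c--> accept */
}

/* Returns 1 if the whole input matches, 0 otherwise. The loop has a
 * constant bound and the body is a single table lookup, so there is no
 * per-state goto label and no switch statement. */
int dfa_match(const struct dfa *d, const char *s, size_t len)
{
    uint8_t state = 0;

    if (len > DFA_MAX_INPUT)
        return 0;
    for (size_t i = 0; i < DFA_MAX_INPUT; i++) {
        if (i >= len)
            break;
        state = d->next[state][(unsigned char)s[i]];
    }
    return state == DFA_ACCEPT;
}
```

The design choice is that all control flow is one bounded loop; the DFA itself is pure data, which is what distinguishes it from the goto-style implementation.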
<serhei>
I suppose the giant case statement is what will give it trouble
* agentzh
has been dreaming of a code converter that can convert arbitrary ebpf programs into a form that can always pass the ebpf verifier.
<serhei>
much more so than the loops
<agentzh>
yeah, there is a giant switch/case statement.
<agentzh>
an interpreter is essentially of this form: for (i = 0; i < MAX_INSNS; i++) { switch (pc) { case xxx: case xxx: ... } }
<agentzh>
and when MAX_INSNS is 7 or so, it is already about to hit the 1m insn limit.
<agentzh>
i don't know if there are any tricks to make it better.
<agentzh>
the kernel's ebpf interpreter uses a jump table.
<agentzh>
which is not possible in an ebpf program itself afaik.
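The interpreter shape agentzh describes (a bounded for loop around a switch on the current instruction) can be sketched in plain userland C. The toy instruction set, the names (`toy_run`, `TOY_MAX_STEPS`, and so on) and the step bound are all hypothetical; this is neither uBPF nor real eBPF opcodes, and it says nothing about how the verifier would count it.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical toy instruction set, for illustration only. */
enum toy_op { TOY_LOADI, TOY_ADD, TOY_JNZ, TOY_EXIT };

struct toy_insn {
    uint8_t op;   /* one of enum toy_op */
    uint8_t dst;  /* destination register index */
    uint8_t src;  /* source register index */
    int32_t imm;  /* immediate value, or absolute jump target for TOY_JNZ */
};

#define TOY_MAX_STEPS 64  /* fixed bound on executed instructions */
#define TOY_NREGS     4

/* Runs prog for at most TOY_MAX_STEPS steps.
 * Returns register 0 on TOY_EXIT, or -1 on any error or when the
 * step budget runs out. */
int64_t toy_run(const struct toy_insn *prog, size_t len)
{
    int64_t reg[TOY_NREGS] = {0};
    size_t pc = 0;

    for (int step = 0; step < TOY_MAX_STEPS; step++) {
        if (pc >= len)
            return -1;
        const struct toy_insn *in = &prog[pc];
        if (in->dst >= TOY_NREGS || in->src >= TOY_NREGS)
            return -1;

        switch (in->op) {
        case TOY_LOADI:
            reg[in->dst] = in->imm;
            pc++;
            break;
        case TOY_ADD:
            reg[in->dst] += reg[in->src];
            pc++;
            break;
        case TOY_JNZ:
            if (reg[in->dst] != 0) {
                if (in->imm < 0)
                    return -1;
                pc = (size_t)in->imm;
            } else {
                pc++;
            }
            break;
        case TOY_EXIT:
            return reg[0];
        default:
            return -1;
        }
    }
    return -1; /* step budget exhausted */
}
```

For example, a program that loads 3 into r1 and 5 into r3, then adds r3 into r0 while counting r1 down to zero, exits with 15 after 14 steps. Every case of the switch is reachable on every iteration of the loop, which is the structure agentzh reports running into the instruction limit with.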
<serhei>
well, using bpf_tail_call might slightly increase your limit to 32 insns :-}
<serhei>
I spent a few minutes thinking about ways to minimize the number of comparison insns in a non-jump-table case statement in a dfa. Doesn't apply to a bytecode interpreter however
<serhei>
(can't go wrong with binary search)
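The binary-search idea can be sketched in C as a lookup over a state's outgoing edges sorted by input byte, which needs roughly log2(n) comparisons instead of one per case label. The names `struct edge` and `edge_lookup` are hypothetical, and a code generator would more likely unroll this into a fixed comparison tree than emit a runtime loop; the loop is used here only for brevity.

```c
#include <stddef.h>

/* One outgoing DFA edge: on input byte ch, go to state target. */
struct edge {
    unsigned char ch;
    int target;
};

/* Looks up ch among n edges sorted by ch in ascending order.
 * Returns the target state, or -1 if there is no edge for ch. */
int edge_lookup(const struct edge *edges, size_t n, unsigned char ch)
{
    size_t lo = 0, hi = n; /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (edges[mid].ch == ch)
            return edges[mid].target;
        if (edges[mid].ch < ch)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

With five sorted edges the lookup inspects at most three entries, where a linear chain of case comparisons could inspect all five.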
<serhei>
I'm pretty sure I ran into the 'verifier being tricky about stack memory' issue, since I had to write a hack into the stapbpf translator to start by zeroing out the (known-used) region of the stack before doing anything else
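The zero-first idea can be illustrated at the C level. serhei's hack lives in the stapbpf translator, which emits bytecode, so the snippet below is only an analogy: `make_key` and `KEY_LEN` are hypothetical names, and the assumption is that a fixed-size stack buffer is cleared in full before any part of it is written or read.

```c
#include <stddef.h>
#include <string.h>

#define KEY_LEN 16 /* hypothetical fixed key size */

/* Builds a fixed-size, NUL-padded key from name. The local buffer is
 * zeroed in full before use, so every byte of it is initialized no
 * matter how short name is. Names longer than KEY_LEN - 1 bytes are
 * truncated. */
void make_key(const char *name, unsigned char out[KEY_LEN])
{
    unsigned char key[KEY_LEN];
    size_t n;

    memset(key, 0, sizeof(key)); /* zero the whole stack region first */

    n = strlen(name);
    if (n > KEY_LEN - 1)
        n = KEY_LEN - 1;
    memcpy(key, name, n);

    memcpy(out, key, KEY_LEN);
}
```

Clearing the region up front means no later read can touch bytes that were never stored to, at the cost of a few extra stores at program entry.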
derek0883 has joined #systemtap
* serhei
reads about bpf_iter as well. So many different kludgy ways to implement loops but not really loops