#systemtap on 2020-11-24 — irc logs at freenode.irclog.whitequark.org

2015-11-12 23:18 fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged

00:36 derek0883 has joined #systemtap

00:58 derek0883 has quit [Remote host closed the connection]

00:58 derek0883 has joined #systemtap

01:14 <kerneltoast> because they use prints and then read the log buffer directly

01:14 <kerneltoast> it's hacky

01:15 <fche> yes it's hacky

01:15 <kerneltoast> but they're reentrant anyway because they require a context* pointer

01:15 <fche> and I thought you had a variant that prints into the stap context string directly rather than the regular print buffer

01:15 <fche> but even apart from that

01:15 <kerneltoast> no i never did

01:15 <fche> ok

01:16 hpt has joined #systemtap

01:16 <fche> but still that doesn't seem like an exception from normal reentrancy policy ("prevent reentrancy")

01:16 <fche> if a backtracing function messes with the print buffer, it should not matter as it's run from a non-reentrant (architecturally intended) context

01:16 <fche> so no one else should be able to start mucking with that particular buffer

01:17 <kerneltoast> yeah exactly

01:18 <kerneltoast> so we could skip the reentrancy lock for them

01:18 <kerneltoast> and make them work again

01:18 <fche> as long as -something- is reliably preventing reentrancy!

01:19 <kerneltoast> yeah, which the context is

01:19 <kerneltoast> unless it's not and that's why the probe lock soft lockup happens :p

01:20 <fche> ok

01:20 <fche> well, at least as a part of this discussion, we both got clear that reentrancy is not a desirable capability really

01:20 <fche> and I don't think it ever was

01:21 <fche> so if we can reliably exclude that possibility, and simplify logic as we go, that would be good

01:21 <kerneltoast> yeah I'd probably implement it by adding a context pointer argument to the lock functions

01:22 <kerneltoast> if you have a context pointer, you don't need the reentrancy protection we're selling

01:23 <fche> yeah, maybe something like that
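
A rough userspace sketch of the idea kerneltoast describes above (lock functions that take a context pointer and skip the extra reentrancy lock when the caller already holds a context). All names here (`struct context` as modeled, `print_trylock`, `print_unlock`, `print_fallback_lock`) are hypothetical stand-ins for illustration, not the actual SystemTap runtime API, and C11 atomics stand in for the kernel's primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's per-CPU probe context. */
struct context {
    atomic_int busy;   /* nonzero while a probe handler owns this context */
};

/* Hypothetical fallback lock for prints issued outside any probe. */
static atomic_flag print_fallback_lock = ATOMIC_FLAG_INIT;

/* Returns true if the caller may touch the print buffer. */
static bool print_trylock(struct context *c)
{
    if (c != NULL) {
        /* The caller claims to hold the context, which is assumed to
         * exclude reentrancy already, so no extra lock is taken. */
        assert(atomic_load(&c->busy) != 0);
        return true;
    }
    /* No context: fall back to a separate try-lock. */
    return !atomic_flag_test_and_set(&print_fallback_lock);
}

/* Releases whatever print_trylock() took for the same argument. */
static void print_unlock(struct context *c)
{
    if (c == NULL)
        atomic_flag_clear(&print_fallback_lock);
}
```

In this sketch a caller inside a probe passes its context and pays nothing extra, while a caller with no context (passing NULL) gets a plain try-lock that fails on nested use instead of spinning.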

03:05 derek0883 has quit [Remote host closed the connection]

03:19 derek0883 has joined #systemtap

03:38 derek0883 has quit [Remote host closed the connection]

03:52 _whitelogger has joined #systemtap

03:53 <kerneltoast> there is a way to check if we're in an NMI but only on x86...

03:53 <kerneltoast> who needs stap on arm anyway

04:05 derek0883 has quit [Remote host closed the connection]

04:12 khaled has quit [Quit: Konversation terminated!]

04:16 khaled has joined #systemtap

04:24 derek0883 has joined #systemtap

05:04 _whitelogger has joined #systemtap

05:22 <agentzh> kerneltoast: we need aarch64 at least :)

05:49 _whitelogger has joined #systemtap

06:13 _whitelogger has joined #systemtap

06:15 derek0883 has quit [Remote host closed the connection]

06:17 derek0883 has joined #systemtap

06:27 <kerneltoast> heh yeah

06:27 <kerneltoast> this is just very painful to get around without some way of telling if we're inside an NMI

06:28 <kerneltoast> factoring out the reentrancy used for prints would be very messy

06:28 <kerneltoast> and easy to break

06:28 <kerneltoast> i might try and check to see if only the perf tracepoint is used inside an NMI

06:48 derek0883 has quit [Remote host closed the connection]

06:48 derek0883 has joined #systemtap

06:57 derek0883 has quit [Remote host closed the connection]

06:57 derek0883 has joined #systemtap

07:34 derek0883 has quit [Remote host closed the connection]

07:34 derek0883 has joined #systemtap

07:39 orivej has quit [Ping timeout: 260 seconds]

07:48 derek0883 has quit [Remote host closed the connection]

09:36 hpt has quit [Ping timeout: 272 seconds]

10:37 orivej has joined #systemtap

12:22 orivej has quit [Ping timeout: 260 seconds]

12:26 mjw has joined #systemtap

12:59 <fche> I don't see why we need to check whether we're in an nmi, as opposed to checking whether a cpu-exclusive lock is already taken - i.e., reentrancy of any sort, softirq exception whatever

15:02 amerey has joined #systemtap

17:00 <kerneltoast> fche, because factoring out the reentrancy that already exists is really messy

17:00 <kerneltoast> I tried it yesterday

17:00 <kerneltoast> It starts off okay

17:00 <kerneltoast> Then it kind of explodes

17:03 <kerneltoast> especially with the _stp_vlog usage

17:43 derek0883 has joined #systemtap

17:49 derek0883 has quit [Ping timeout: 264 seconds]

18:03 <kerneltoast> fche, i think i have a better idea

18:04 <kerneltoast> most prints are inside probes which have the context locked

18:04 <kerneltoast> there are a few outside of there for module exit n stuff

18:05 <kerneltoast> we can take the log's reentrancy lock if the context isn't held

18:05 <kerneltoast> and the prints outside of the probes are simple and don't need reentrancy

18:21 derek0883 has joined #systemtap

18:57 derek0883 has quit [Remote host closed the connection]

18:58 derek0883 has joined #systemtap

19:00 <fche> sounds plausible

19:04 derek0883 has quit [Ping timeout: 264 seconds]

19:17 derek0883 has joined #systemtap

19:21 <kerneltoast> fche, testsuite is running with my new idea

19:22 <kerneltoast> it depends on another patch i made, which i think you'll like

19:25 <kerneltoast> fche, here's both patches: https://gist.github.com/kerneltoast/3c5d65e9739e05cbf1aa6372f55673c4

19:25 <kerneltoast> ouch, the preliminary patch failed quickly

19:25 <kerneltoast> percpu: allocation failed, size=6768 align=32 atomic=0, alloc from reserved chunk failed

19:26 <kerneltoast> fche, i want to check if the runtime context is held, but _stp_runtime_get_context() can return NULL. if the context is already held, is _stp_runtime_get_context() guaranteed to return non-NULL?

19:27 <kerneltoast> my assumption is no

19:27 <fche> if it's already held, it should return NULL

19:27 <fche> it should preclude reentrancy

19:27 <fche> that's part of its purpose.

19:27 <kerneltoast> no that's a different function

19:28 <kerneltoast> all i want to do is read &c->busy

19:28 <kerneltoast> i don't want to modify it

19:28 <fche> why not call _stp_runtime_entryfn_get_context

19:29 <kerneltoast> because then we'll modify the runtime context state and potentially block a probe inside an IRQ from working

19:31 <kerneltoast> i can fixup my preliminary patch a bit

19:31 <fche> entryfn_get_context should do that already

19:32 <kerneltoast> yeah i mean i don't want to block anything by grabbing the runtime context myself

19:32 <fche> it's okay to do so, saves you from a TOCTOU anyway

19:32 <kerneltoast> but what if i grab it and then before i release it, an NMI comes flying outta nowhere and tries to run some probes

19:33 <kerneltoast> TOCTOU isn't an issue here i think because this is all running on the same CPU

19:34 <fche> fine, the nmi ones will fail, no problem.
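
A minimal userspace model of the behaviour fche describes here: reserving the context is a try-acquire, so a nested attempt on the same CPU (for example from an NMI) simply fails rather than blocking. The names `try_get_context` and `put_context` and the single `busy` counter are assumptions made for this sketch; the real `_stp_runtime_entryfn_get_context` may differ in detail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of one CPU's probe context. */
struct context {
    atomic_int busy;   /* 0 = free, 1 = owned by a running handler */
};

/* Try to reserve the context; returns NULL if it is already held. */
static struct context *try_get_context(struct context *c)
{
    if (atomic_fetch_add(&c->busy, 1) != 0) {
        /* Someone on this CPU got here first: back out and fail. */
        atomic_fetch_sub(&c->busy, 1);
        return NULL;
    }
    return c;
}

/* Release a context previously returned by try_get_context(). */
static void put_context(struct context *c)
{
    int prev = atomic_fetch_sub(&c->busy, 1);
    assert(prev == 1);   /* only the owner may release */
    (void)prev;          /* silence unused warning when NDEBUG is set */
}
```

Because the check and the reservation happen in one atomic step, there is no window between "is it held?" and "take it", which is the TOCTOU point raised above. A caller that gets NULL back skips its work, which matches the "then we don't print stuff" policy discussed later in the log.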

19:34 <kerneltoast> haha

19:34 <kerneltoast> but it can be fixed!

19:34 <kerneltoast> i can make my first patch better

19:34 <kerneltoast> with the ol' read_trylock

19:34 <kerneltoast> whaddya say?

19:37 <fche> well lemme see the patches.

19:37 <kerneltoast> ok lemme write it up

19:37 <fche> de-rcu-ification of the context stuff is good in general, /me is eager

19:38 <fche> yeah that first patch looks good

19:39 <fche> and I don't see a reason why nmi- or whatever code wouldn't use the entryfn_get/put_context pair to not just check but reserve the context for the duration of some critical section

19:39 <fche> BY THE WAY

19:39 <fche> the context could also be a place to put a print buffer

19:39 <fche> just saying

19:55 <kerneltoast> yeah the only exception to the context being held for NMI stuff is if there's a print occurring outside the context protection, and then an NMI strikes and corrupts the log data

19:55 <kerneltoast> there are some prints in stap that occur without the context being held

19:55 <kerneltoast> like in systemtap_module_exit

19:55 <fche> ok, that's super late, but yeah we could grab a context there too

19:56 <kerneltoast> i suppose i could leave in the reentrancy lock as a backup measure then

19:56 <fche> if we use the _get_context gadget, it can BE the reentrancy lock

19:56 <kerneltoast> but i dunno about all the weird places a print could occur

19:57 <kerneltoast> also, what if we try to acquire a context in systemtap_module_exit and it fails?

19:57 <kerneltoast> do we spin?

19:57 <fche> then we don't print stuff

19:57 <kerneltoast> ah

19:57 <fche> but by that time I think the probes are shut down or shutting down, in practice it probably can't happen

19:58 <kerneltoast> do you know of any other prints that occur outside of a probe?

19:58 <fche> not off the top of my head

19:59 <kerneltoast> that's my worry :)

19:59 <fche> but we could let the compiler find them for us ... changing the printing api to take a context* as a parameter

19:59 <kerneltoast> i thought about that too and worried that we'd break stap scripts

19:59 <fche> how?

20:00 <kerneltoast> wouldn't all the prints inside stap scripts need to have the context passed as an argument?

20:00 <fche> stap scripts don't know about contexts

20:01 <fche> embedded-c code inside stap scripts can't call arbitrary runtime functions generally

20:01 <fche> there is a STAP_PRINT macro, which we could arrange to get the c* propagated.
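
One way to picture the two ideas floated in this part of the conversation (a print API that requires a `context*`, and a print buffer that lives inside the context) is the sketch below. Everything in it is hypothetical: `ctx_printf`, `ctx_print_drain`, the buffer size, and the struct layout are invented for illustration and are not taken from the SystemTap sources.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define CTX_PRINT_BUF_SIZE 128   /* hypothetical size */

/* Hypothetical context that owns its own print buffer. */
struct context {
    char   print_buf[CTX_PRINT_BUF_SIZE];
    size_t print_len;            /* bytes used, excluding the NUL */
};

/* Appends formatted text to the context's buffer.
 * Returns the number of bytes appended, or -1 if there is no
 * context or the text does not fit. */
static int ctx_printf(struct context *c, const char *fmt, ...)
{
    va_list ap;
    size_t room;
    int n;

    if (c == NULL)
        return -1;               /* no context held: do not print */
    assert(c->print_len < sizeof(c->print_buf));
    room = sizeof(c->print_buf) - c->print_len;

    va_start(ap, fmt);
    n = vsnprintf(c->print_buf + c->print_len, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        c->print_buf[c->print_len] = '\0';   /* drop the partial write */
        return -1;
    }
    c->print_len += (size_t)n;
    return n;
}

/* Copies the buffered text into out (NUL-terminated) and empties
 * the buffer. Returns the number of bytes copied. */
static size_t ctx_print_drain(struct context *c, char *out, size_t out_size)
{
    size_t n;

    if (out_size == 0)
        return 0;
    n = c->print_len < out_size ? c->print_len : out_size - 1;
    memcpy(out, c->print_buf, n);
    out[n] = '\0';
    c->print_len = 0;
    c->print_buf[0] = '\0';
    return n;
}
```

With a signature like this, any print call site that has no context in hand fails to compile until it is given one, which is the "let the compiler find them for us" effect fche mentions.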

20:02 <kerneltoast> so i guess we'd break some embedded C

20:02 <fche> I think probably not

20:08 <kerneltoast> fche, i'm pumping this rcu cleanup through the testsuite: https://gist.github.com/kerneltoast/89935e70335a200e51c2033b044a32b3

20:08 <kerneltoast> refactoring the print api to take a context pointer will take a bit of time

20:09 <fche> I am not sure we need that stp_context_lock / context_stop doodaad

20:09 <fche> it's probably harmless

20:10 <kerneltoast> if we don't need it then RCU was never needed

20:10 <fche> that stp_runtime_context_free bit is only called when the system knows that no more probes are running, IIRC

20:10 <fche> could be that RCU was always unnecessary in this particular context

20:10 <fche> er

20:10 <kerneltoast> /* We should be free of all probes by this time, but for example the timer for

20:10 <kerneltoast> * _stp_ctl_work_callback may still be running and looking for contexts.

20:10 <fche> excuse the context context pun

20:11 <kerneltoast> that's the comment on top of _stp_runtime_contexts_free

20:11 <fche> hmmmmmmm

20:11 <fche> I'd think we free context way way down the line, when probes and timers and tracepoints are all already cleansed

20:11 <kerneltoast> the timer in question is stopped via a file write from userspace

20:12 <kerneltoast> but yeah if you wanna YOLO it and nuke the synchronization we can do that

20:13 <kerneltoast> i can also test it from my end with printks to see if the lock doodaad is ever contended

20:15 <agentzh> fche: ah, i just found stap does not support the perl regex syntax...pity.

20:31 derek0883 has quit [Remote host closed the connection]

20:37 <fche> I'm not opposed to it as a belt-and-suspenders measure, but generally I'd expect it not to fire like that

20:40 derek0883 has joined #systemtap

20:40 mjw has quit [Quit: Leaving]

21:35 derek0883 has quit [Remote host closed the connection]

21:41 derek0883 has joined #systemtap

22:34 derek0883 has quit [Remote host closed the connection]

22:46 orivej has joined #systemtap

23:14 amerey has quit [Quit: Leaving]

23:25 derek0883 has joined #systemtap

23:31 derek0883 has quit [Ping timeout: 246 seconds]

23:46 derek0883 has joined #systemtap

23:51 derek0883 has quit [Ping timeout: 264 seconds]

23:57 thibaultcha is now known as chasum

23:59 derek0883 has joined #systemtap