#systemtap on 2020-11-23 — irc logs at freenode.irclog.whitequark.org

2015-11-12 23:18 fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged

00:41 lijunlong has quit [Ping timeout: 240 seconds]

00:43 lijunlong has joined #systemtap

00:50 khaled has quit [Quit: Konversation terminated!]

01:14 hpt has joined #systemtap

01:34 derek0883 has joined #systemtap

01:35 derek0883 has quit [Remote host closed the connection]

01:36 derek0883 has joined #systemtap

01:37 derek0883 has quit [Remote host closed the connection]

01:38 derek0883 has joined #systemtap

02:02 derek0883 has quit [Remote host closed the connection]

02:05 derek0883 has joined #systemtap

02:49 lijunlong has quit [Ping timeout: 260 seconds]

02:51 lijunlong has joined #systemtap

03:21 derek0883 has quit [Remote host closed the connection]

03:57 derek0883 has joined #systemtap

06:03 derek0883 has quit [Remote host closed the connection]

06:11 derek0883 has joined #systemtap

06:23 derek0883 has quit [Remote host closed the connection]

06:28 derek0883 has joined #systemtap

07:16 khaled has joined #systemtap

07:18 derek0883 has quit [Remote host closed the connection]

07:37 _whitelogger has joined #systemtap

09:31 <ema> > Systemtap now supports extracting 64-bit floating point

09:31 <ema> \o/

09:54 hpt has quit [Ping timeout: 240 seconds]

10:30 mjw has joined #systemtap

12:31 derek0883 has joined #systemtap

13:37 derek0883 has quit [Ping timeout: 264 seconds]

14:09 <fche> ema, suggestions welcome as to what other arithmetic etc. processing would be helpful

15:05 amerey has joined #systemtap

17:19 orivej has quit [Ping timeout: 256 seconds]

17:35 orivej has joined #systemtap

17:53 derek0883 has joined #systemtap

18:05 derek0883 has quit [Remote host closed the connection]

18:06 derek0883 has joined #systemtap

18:07 <kerneltoast> fche, here are the test results: https://gist.github.com/kerneltoast/93d1f216c8fe1f5f21d8740422afc631#file-systemtap-sum-diff

18:08 <kerneltoast> but i need to make another change to my patch in order to prevent printing inside an NMI tracepoint from causing panics

18:08 <kerneltoast> if an NMI arrives on a CPU that's in the process of printing something, we'll have no choice but to drop all of the NMI's prints

18:09 <fche> results look okay, curious about that process_by_pid bit

18:09 * fche is not consistently here this week tho, so may delay comms

18:09 <kerneltoast> are ya gonna be around today?

18:10 <fche> yeah, probably 90 mins or so

18:10 <fche> 90 mins hence

18:10 <kerneltoast> ah just for the next 90 min?

18:11 <fche> gone for 90 mins

18:11 <fche> back later

18:11 derek0883 has quit [Ping timeout: 264 seconds]

18:11 <kerneltoast> o

18:12 <kerneltoast> cool, i'll add the NMI bits to my patch and give you a fresh testsuite diff

18:23 derek0883 has joined #systemtap

18:36 <kerneltoast> fche, this version has NMI protection: https://gist.github.com/kerneltoast/3d4878dcb68b0c959637fa0243b122bd

18:36 <kerneltoast> running the testsuite now

19:08 derek0883 has quit [Remote host closed the connection]

19:11 mjw has quit [Quit: Leaving]

19:24 orivej has quit [Ping timeout: 272 seconds]

19:40 derek0883 has joined #systemtap

19:45 derek0883 has quit [Ping timeout: 264 seconds]

19:54 derek0883 has joined #systemtap

20:16 <fche> dude

20:16 <fche> hm that was a little longer (teaching one of my boys to drive :-O)

20:18 <fche> would that nmi_lock be more simply named something like an anti-reentrancy lock?

20:26 <kerneltoast> it's literally just for NMIs though because IRQs are disabled

20:26 derek0883 has quit [Remote host closed the connection]

20:27 <kerneltoast> protip when learning how to drive: dedicate one foot to each pedal to react faster

20:27 <fche> good tip, will be sure to shred it

20:28 <kerneltoast> fche, if there were reentrancy other than from NMIs then the print driver would have exploded by now

20:29 <kerneltoast> just like your car will after your son drives it alone for the first time

20:29 <kerneltoast> source: my dad's '93 volvo died while i was driving it on the highway

20:30 derek0883 has joined #systemtap

20:30 <kerneltoast> i had been ignoring every light being lit up on the dashboard but the car still drove so my dad suggested i take it to the mechanic

20:31 <kerneltoast> turns out the car didn't last to the mechanic

20:31 <fche> errr, forget I mentioned it .... I need to sleep peacefully tonight :)

20:31 <kerneltoast> and i couldn't turn on hazards either because the battery died

20:31 <fche> well then you just do a flip & roll & crash, to make the hazard obvious

20:32 <fche> hth

20:32 <kerneltoast> of course, how could i have been so naive

20:38 <kerneltoast> so in summary, i think the name should be kept as is

20:38 <fche> hm

20:38 <fche> is the print_lock reentrant enough w.r.t. nmis?

20:38 <kerneltoast> yep read the comment above the read trylock

20:39 <fche> aha

20:39 <kerneltoast> the testsuite exposed this NMI issue

20:39 <fche> there are few sicknesses for which _trylock is not a cure

20:39 <fche> it's like a magic potion

20:39 <kerneltoast> cures everything

20:39 <kerneltoast> anti deadlock

20:40 <kerneltoast> can't cure mutex in IRQ though
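
A minimal user-space sketch of the trylock idea discussed above, not the code from the patch: an NMI that blocks on a lock held by the code it interrupted on the same CPU can never make progress, whereas a trylock fails immediately and the nested print is dropped. The names print_state, try_print and nested_nmi_print are hypothetical, and a C11 atomic_flag stands in for the kernel lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical print state; atomic_flag stands in for the kernel lock. */
struct print_state {
    atomic_flag busy;        /* set while a print is in progress */
    unsigned long printed;   /* prints that completed */
    unsigned long dropped;   /* prints given up because the lock was held */
};

/* Never waits: returns true only if the lock was free and is now ours. */
bool print_trylock(struct print_state *ps)
{
    return !atomic_flag_test_and_set_explicit(&ps->busy, memory_order_acquire);
}

void print_unlock(struct print_state *ps)
{
    atomic_flag_clear_explicit(&ps->busy, memory_order_release);
}

/*
 * Print once. "interrupt" is an optional callback that simulates an NMI
 * arriving while the lock is held. Returns false if the print was dropped.
 */
bool try_print(struct print_state *ps, void (*interrupt)(struct print_state *))
{
    if (!print_trylock(ps)) {
        ps->dropped++;       /* give up instead of deadlocking */
        return false;
    }
    if (interrupt)
        interrupt(ps);       /* simulated NMI lands mid-print */
    ps->printed++;
    print_unlock(ps);
    return true;
}

/* Simulated NMI handler that also tries to print. */
void nested_nmi_print(struct print_state *ps)
{
    (void)try_print(ps, NULL);
}
```

Calling try_print(&ps, nested_nmi_print) on a state initialised with ATOMIC_FLAG_INIT completes the outer print and counts one dropped print for the simulated NMI.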

20:41 <fche> ok, let us know how it likes the testsuite

20:44 <kerneltoast> I'll let ya know in an hour

20:44 <kerneltoast> It's still chug-a-lug lugging along

20:51 derek0883 has quit [Ping timeout: 265 seconds]

21:20 <fche> oh good

21:53 derek0883 has joined #systemtap

22:33 <kerneltoast> fche, new diff with the NMI stuff is more interesting: https://gist.github.com/kerneltoast/3d4878dcb68b0c959637fa0243b122bd#file-systemtap-sum-diff

22:34 <fche> a diff of the .log files would be good there

22:34 <kerneltoast> unless there's some reentrancy

22:34 <kerneltoast> maybe it's reentrancy

22:35 <kerneltoast> probably is i guess

22:35 <kerneltoast> yeah it is

22:35 <kerneltoast> and i know where it is

22:35 <kerneltoast> poo

22:35 <kerneltoast> in_nmi() isn't reliable

22:36 <fche> <bane> of course </bane>

22:37 <kerneltoast> well i dunno what to do about them NMIs

22:37 <kerneltoast> they just come outta nowhere and wreck stuff

22:37 <fche> do you have a dmesg or something so I can play along at home?

22:37 <kerneltoast> play along how

22:38 <kerneltoast> i have a log with a backtrace of one NMI wreaking havoc

22:39 <fche> yeah that

22:39 <kerneltoast> #0 [ffff88017a4c8ad8] _raw_read_lock at ffffffff81788414

22:39 <kerneltoast> #1 [ffff88017a4c8ae8] probe_6338 at ffffffffc067bfea [stap_b0645462cf706434a0a94992f03a9cf_17052]

22:39 <kerneltoast> #2 [ffff88017a4c8b00] handle_perf_probe at ffffffffc067b4d0 [stap_b0645462cf706434a0a94992f03a9cf_17052]

22:39 <kerneltoast> #3 [ffff88017a4c8b48] enter_perf_probe_0 at ffffffffc067b703 [stap_b0645462cf706434a0a94992f03a9cf_17052]

22:39 <kerneltoast> #4 [ffff88017a4c8b58] __perf_event_overflow at ffffffff811a90e7

22:39 <kerneltoast> #5 [ffff88017a4c8b90] perf_event_overflow at ffffffff811b28e4

22:39 <kerneltoast> #6 [ffff88017a4c8ba0] handle_pmi_common at ffffffff8100a9a0

22:39 <kerneltoast> #7 [ffff88017a4c8de0] intel_pmu_handle_irq at ffffffff8100ac7f

22:39 <kerneltoast> #8 [ffff88017a4c8e38] perf_event_nmi_handler at ffffffff81789031

22:39 <kerneltoast> #9 [ffff88017a4c8e58] nmi_handle at ffffffff8178a93c

22:39 <kerneltoast> #10 [ffff88017a4c8eb0] do_nmi at ffffffff8178ab5d

22:39 <kerneltoast> #11 [ffff88017a4c8ef0] end_repeat_nmi at ffffffff81789d9c

22:39 <fche> ok so what's breaking?

22:40 <kerneltoast> probes can be called from NMI context. if you have a print() inside such a probe, and the NMI arrives while there is already a print() in progress on the current CPU, bad things happen

22:40 <fche> by what mechanism

22:41 <kerneltoast> whaddya mean

22:41 <fche> as in ... what bad things happen, how?

22:42 <kerneltoast> the dump in my commit message is what can happen: https://gist.github.com/kerneltoast/3d4878dcb68b0c959637fa0243b122bd#file-0001-runtime-fix-print-races-in-irq-context-and-during-pr-patch-L13

22:42 <kerneltoast> the explanation for that is right above

22:43 <kerneltoast> in addition to that, there could be no panic and instead the print buffer would just get mangled

22:44 <fche> ok, I need to figure it out with small words, I'm on vacation officially so let's pretend I'm not very smart

22:44 <fche> (p.s. not pretending :-)

22:44 <fche> so

22:44 <fche> still trying to figure out the precise point at which something goes *wrong*

22:44 <kerneltoast> let's say i have this here code:

22:46 <kerneltoast> log->len += 10;

22:46 <kerneltoast> memcpy(&log->buf[log->len - 10], "1234567890", 10);

22:46 <kerneltoast> if (log->len + 10 > MAX_LOG_LEN) flush_da_buffer();

22:47 <kerneltoast> at the start of this code, log->len is MAX_LOG_LEN - 20, so flush_da_buffer() isn't called

22:47 <kerneltoast> now after that if-statement gets executed, an NMI comes flying out of the sky

22:48 <kerneltoast> and the NMI decides to do this:

22:48 <kerneltoast> log->len += 1337;
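
The race described above can be reproduced in user space with a simulated NMI. This is a sketch under stated assumptions, not the SystemTap print code: log_buf, log_append and nmi_print are hypothetical names, the buffer checks for space before writing, and the second check exists only so the sketch can report the overflow instead of writing past the buffer.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LOG_LEN 64           /* hypothetical buffer capacity */

struct log_buf {
    size_t len;                  /* bytes already used in buf */
    char buf[MAX_LOG_LEN];
};

/* Callback standing in for an NMI that interrupts the print. */
typedef void (*nmi_hook_t)(struct log_buf *log);

void flush_buffer(struct log_buf *log)
{
    log->len = 0;
}

/*
 * Append n bytes. The space check and the write are separate steps, and
 * the simulated NMI runs between them. Returns 1 if the write was safe,
 * 0 if the earlier check went stale and the write would have overflowed.
 */
int log_append(struct log_buf *log, const char *s, size_t n, nmi_hook_t nmi)
{
    if (log->len + n > MAX_LOG_LEN)
        flush_buffer(log);       /* make room if needed */

    if (nmi)
        nmi(log);                /* the NMI lands after the check */

    if (log->len + n > MAX_LOG_LEN)
        return 0;                /* detection only; real code has no second check */

    memcpy(&log->buf[log->len], s, n);
    log->len += n;
    return 1;
}

/* Simulated NMI probe that prints 10 bytes into the same buffer. */
void nmi_print(struct log_buf *log)
{
    log_append(log, "NMI-PRINT!", 10, NULL);
}
```

With log.len set to MAX_LOG_LEN - 10, log_append(&log, "1234567890", 10, NULL) succeeds, while the same call with nmi_print as the hook returns 0 because the NMI consumed the space that the first check had already approved.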

22:48 <fche> ok so my question there is why the nested nmi can't tell that the print subsystem is 'locked' already

22:48 <fche> so it can drop the output or whatever

22:48 <kerneltoast> because we lock it with local_irq_save()

22:49 <fche> well that get_context_() ditty used an explicit counter to detect reentrancy

22:49 <kerneltoast> there is some necessary print reentrancy that would be difficult to get rid of

22:50 <kerneltoast> we want to allow reentrancy on the local CPU

22:50 <kerneltoast> unless it's an NMI

22:50 <fche> "we want to allow reentrancy" .... are you quite sure?

22:50 <kerneltoast> yeah having a bunch of nested local_irq_saves should be allowed

22:51 <fche> yes that code allows it

22:51 <fche> but "want to allow" --- not necessarily.

22:52 <kerneltoast> NMIs just ruin our days

22:53 <fche> well why? I'd be perfectly happy with prevention of reentrancy, as that has been our architected model

22:53 <fche> we have those skipped* counters for this reason e.g.

22:53 <kerneltoast> it's just going to be a bit of work to get rid of the reentrancy i guess

22:54 <fche> detect & give-up is just fine
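
A minimal sketch of the detect-and-give-up model fche describes, assuming a per-CPU busy counter and a global skipped counter. The names cpu_context, get_context, put_context and skipped_reentrant are hypothetical and do not mirror the actual runtime; the CPU number is passed in explicitly because this runs in user space.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 4                    /* hypothetical CPU count */

struct cpu_context {
    atomic_int busy;                 /* nonzero while a probe owns this context */
};

struct cpu_context contexts[NR_CPUS];
atomic_ulong skipped_reentrant;      /* probes skipped due to reentrancy */

/*
 * Claim the context for the given CPU. If it is already in use (for
 * example by the code an NMI interrupted), undo the increment, count
 * the skip and return NULL so the caller gives up.
 */
struct cpu_context *get_context(unsigned int cpu)
{
    struct cpu_context *c = &contexts[cpu];

    if (atomic_fetch_add(&c->busy, 1) != 0) {
        atomic_fetch_sub(&c->busy, 1);
        atomic_fetch_add(&skipped_reentrant, 1);
        return NULL;
    }
    return c;
}

/* Release a context obtained from get_context(). */
void put_context(struct cpu_context *c)
{
    atomic_fetch_sub(&c->busy, 1);
}
```

A probe handler in this model calls get_context() on entry, returns immediately on NULL, and calls put_context() on exit, so a nested entry on the same CPU is counted as skipped rather than allowed to touch shared state.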

22:54 amerey has quit [Quit: Leaving]

22:54 <kerneltoast> testsuite disagrees

22:54 <kerneltoast> this breaks the stack printers

22:54 <kerneltoast> because they're implemented using bubblegum and duct tape

22:55 <kerneltoast> and parts from my dad's '93 volvo

22:56 <fche> well, think about it. I don't remember ever wanting or preferring reentrancy with respect to probes/etc.

22:56 <fche> I mean lots of things break that way, like the one-per-cpu context* structure we use

22:57 <kerneltoast> i'll need to make some _nolock functions

23:01 derek088_ has joined #systemtap

23:05 derek0883 has quit [Ping timeout: 264 seconds]

23:07 derek088_ has quit [Ping timeout: 264 seconds]

23:10 orivej has joined #systemtap

23:16 derek0883 has joined #systemtap

23:25 <kerneltoast> fche, i can just whitelist the stack functions from the reentrancy checks because they already have a context pinned

23:26 <kerneltoast> that might still be messy though...

23:26 <kerneltoast> blasted NMIs

23:27 <kerneltoast> i guess it'd work

23:33 <fche> not sure why the stack functions should be exceptional

23:49 derek0883 has quit [Remote host closed the connection]