fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
<kerneltoast>
fche, your BUGs didn't get hit
<kerneltoast>
it hung
<fche>
hmmmm
<fche>
do you have kexec/kdump installed there, so one can get core dump, and do some more detailed digging around?
<kerneltoast>
yes i've been getting vmcore dumps every time
<kerneltoast>
though i have to get them using the host
<fche>
ah neat. so together with the .ko file, there's a chance to do a better backtrace / look-see than just the panic message
<kerneltoast>
the VM itself is soft hung so kexec doesn't kick in
<kerneltoast>
feel free to suggest anything to check inside the vmcore
<fche>
I'd try to look around at the *c variable at various levels of the call stack involving stap_* functions
<kerneltoast>
I should note that i don't have much experience poking around vmcores (never had that luxury on embedded), so you'll have to gimme crash commands
<fche>
ummmm I do it infrequently enough I don't have the situation memorized
<kerneltoast>
excellent
<fche>
SHIP IT :)
* agentzh
heard "ship it".
<agentzh>
no patches to ship?
<kerneltoast>
nope
<kerneltoast>
wish we had some :)
<fche>
aw man
<fche>
slackers
<kerneltoast>
i can poop out lotsa patches, but fche won't take em :P
<fche>
thanks, but NO THANKS
<fche>
there is enough poop to play with here already
<kerneltoast>
i've been showering more than once a day to clean myself off from working on task finder
<kerneltoast>
and now we have soft lockup poop
<kerneltoast>
fche, how much longer are you willing to stall the release?
<fche>
monday methinks is a hard deadline.
<fche>
but nothing is stopping us from doing fixes/respins or even another release before too long
<kerneltoast>
whew so i don't need to stay up past midnight
<fche>
SLACKER
<kerneltoast>
:(((
<kerneltoast>
ok i'll try a little more today
<fche>
:)
<kerneltoast>
after i get some food
<kerneltoast>
don't you dare say SLACKER
<fche>
okay okay I take it back, layabout
* kerneltoast
buys plane ticket to canada
<kerneltoast>
fche, you don't have any 16 thread CPUs?
<kerneltoast>
oh, you can also try running the testsuite in parallel on a fedora -debug kernel
<kerneltoast>
that would kill my laptop
<fche>
vm-rawhide-64 5.10.0-0.rc1.20201028gited8780e3f2ec.57.fc34.x86_64 << rawhide, so w/ lockdep, a measly 4cpu there
<fche>
but running like stap -j10ish, can try a little heavier load
<kerneltoast>
4 thread cpu? ouch
lijunlong has quit [Read error: Connection reset by peer]
<fche>
hey it's the best of 2007ish, one of my home servers thankeweverymuch
lijunlong has joined #systemtap
<kerneltoast>
that's not even 14nm
<kerneltoast>
and we're at 14nm+++++++++++++++++++ now
<fche>
hey 2.6GHz per core is still not bad
<fche>
anyway
<fche>
my cpu will not feel inadequate no matter what you say about it
<fche>
it's proud
<fche>
btw is this hang one that appears on just one machine?
<fche>
ooh am seeing uprobes_onthefly.x running right now, how exciting
<kerneltoast>
no it appeared on my ryzen 4800H laptop, and now a centos 7 vm on an i9-9900K
_whitelogger has joined #systemtap
derek0883 has quit [Remote host closed the connection]
<kerneltoast>
the _stp_runtime_entryfn_get_context() return value is ignored
<fche>
hm, not sure that's a necessarily bad thing, lessee
mjw has quit [Quit: Leaving]
<fche>
ok but the BUG part is the extra one we added in, right?
khaled has quit [Quit: Konversation terminated!]
<fche>
oh no this is something you put in
<fche>
yeah I don't think a 0 is a bad situation in this case
<fche>
nothing is dereferencing c in the original version of the code
<fche>
as I understand the code, its purpose is to pretend the current cpu context is taken, so no probe handler starts running from some sort of reentrant or whatever basis
<kerneltoast>
yes but the 0 means it didn't grab the context
<kerneltoast>
because the context was already grabbed
<fche>
yeah, and I think that means it's not a problem
<fche>
if I decode the comment block just above
<fche>
the purpose is to ensure those transport-related locks are only held within -some- probe-handler-like context
<kerneltoast>
while you think, i shall test
<fche>
so 0 means one's already available, good enough
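[editor's note] The exchange above turns on what a 0 return from a "try to take the per-CPU context" call means. The sketch below is only an illustration of that argument in plain C11, not the SystemTap runtime: every name in it (`ctx_sketch`, `ctx_try_get`, `ctx_put`, `with_context_held`, `NR_CPUS_SKETCH`) is hypothetical, and it assumes a simple busy flag per CPU. It shows the reading fche describes: a NULL (0) result means the context was already taken by an outer caller, so the work still runs with the context held, and only the caller that actually took it releases it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4            /* hypothetical CPU count for the sketch */

/* Hypothetical stand-in for a per-CPU probe context. */
struct ctx_sketch {
    atomic_int busy;                /* 0 = free, 1 = taken */
    int cpu;                        /* which slot this is, set on acquire */
};

/* Static storage, so every busy flag starts out as 0 (free). */
static struct ctx_sketch contexts[NR_CPUS_SKETCH];

/* Try to take the context for `cpu`.
 * Returns the context on success, NULL if it was already taken. */
static struct ctx_sketch *ctx_try_get(int cpu)
{
    struct ctx_sketch *c = &contexts[cpu];
    int expected = 0;

    if (atomic_compare_exchange_strong(&c->busy, &expected, 1)) {
        c->cpu = cpu;
        return c;
    }
    return NULL;                    /* someone on this CPU already holds it */
}

/* Release a context; a NULL argument means "we never took it", so do nothing. */
static void ctx_put(struct ctx_sketch *c)
{
    if (c != NULL)
        atomic_store(&c->busy, 0);
}

static int work_done;               /* counts calls to sample_work() */

static void sample_work(void)
{
    work_done++;
}

/* Run `fn` with the context for `cpu` marked as taken.
 * Returns 1 if this call took the context itself, 0 if an outer caller
 * already held it. Either way `fn` runs while the context is taken. */
static int with_context_held(int cpu, void (*fn)(void))
{
    struct ctx_sketch *c = ctx_try_get(cpu);

    fn();
    ctx_put(c);                     /* releases only if we were the taker */
    return c != NULL;
}
```

In this sketch, ignoring the return value of `ctx_try_get` is harmless only because nothing dereferences the returned pointer and `ctx_put` tolerates NULL. Whether the same holds in the real code is exactly the point the two participants are debating.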
<fche>
please don't confuse my ramblings for thinking
<kerneltoast>
how much longer you gonna stay up?
<fche>
I'm already 80% catatonic
<fche>
so maybe another six hours
<fche>
no just kidding
<fche>
about finished here