fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
<gilfoyle> this is from stap -d /usr/lib64/python2.7/lib-dynload/timemodule.so -e 'probe process("/usr/lib64/libpython2.7.so.1.0").function("*").call { println(tid(), " ", ppfunc()) if (ppfunc() =~ "PyErr") print_ubacktrace() }'
<fche> yeah
<fche> isn't that PyErr_Occurred bit weird?
<gilfoyle> it is :)
<gilfoyle> possible handled exception would be my best guess
<gilfoyle> the other surprising thing is
<gilfoyle> it's only from a single thread
<fche> they're racing
<gilfoyle> (granted the code is sort of biased for keeping a single thread constantly running)
<gilfoyle> yup :)
<gilfoyle> anyway, I think I will shut down for today. I will pop around, and when I understand this better, I will write a blog post and spread the word. Thanks for your help fche!
<fche> np, righto
<fche> btw it may be worth also logging .function("*").inline functions
<fche> maybe
<fche> static PyObject *
<fche> time_sleep(PyObject *self, PyObject *obj)
<fche> {
<fche> _PyTime_t secs;
<fche> if (_PyTime_FromSecondsObject(&secs, obj, _PyTime_ROUND_CEILING))
<fche> return NULL;
<fche> if (secs < 0) {
<fche> (cpython Modules/timemodule.c)
<fche> maybe that FromSecondsObject call is the failing one
<fche> weird.
<gilfoyle> interesting :)
<gilfoyle> can you chain the .inline in the call above or do you need to create a separate probe?
<fche> comma-separate it
<fche> as in probe process("/usr/lib64/libpython2.7.so.1.0").function("*").call, process("/usr/lib64/libpython2.7.so.1.0").function("*").inline { }
<fche> or actually with new enough stap
<fche> probe process("/usr/lib64/libpython2.7.so.1.0").function("*").{call,inline}
<gilfoyle> that's sweet :)
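[editor's sketch] For stap versions that predate the `{call,inline}` brace syntax, the comma-separated form fche shows can be generated rather than typed by hand. The helper name `expand_probe_points` is hypothetical and not part of systemtap; it only builds the probe-point string, it does not run stap.

```shell
# Hypothetical helper (not a systemtap tool): expand one probe-point prefix
# over several suffixes, producing the comma-separated list that older stap
# versions accept in place of the {call,inline} brace form.
expand_probe_points() {
    prefix=$1
    shift
    out=
    for suffix in "$@"; do
        # Prepend ", " to every entry except the first one.
        out="${out:+$out, }${prefix}.${suffix}"
    done
    printf '%s\n' "$out"
}

lib='/usr/lib64/libpython2.7.so.1.0'
expand_probe_points "process(\"$lib\").function(\"*\")" call inline
```

The printed string can be pasted after `probe` in a script or in a `stap -e` one-liner; with a new enough stap the brace form does the same job directly.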
<gilfoyle> I might give it a try at recompiling from upstream
<fche> yeah, one of our interns took the idea, built it in a few days, it was great
<fche> that function was in stap 3.0
<gilfoyle> but for now I'll stick with old method :)
<fche> so you may already have it
<gilfoyle> not in C7.2
<fche> nope, but will be in rhel7.3
<gilfoyle> awesome :)
<gilfoyle> could you tell me the commit id for this change? I'm curious to see what changed :)
<fche> with the { } probe point bit ?
<gilfoyle> yup
<fche> commit 380d759b6aa80dd95bfd6208dd075fc4b9e4ed42
<gilfoyle> thank you
<gilfoyle> :)
<fche> hm that PyErr_Occurred is a test function ("has an exception occurred?")
<fche> in _PyTime_FromObject (Python/pytime.c)
hpt has joined #systemtap
<gilfoyle> oh :)
<fche> so there's probably no error there exactly
<gilfoyle> just a red herring then
<fche> yup, oops.
<fche> but in the actual loop, I don't see that PyEval_EvalFrameEx where the sys/sdt.h bit is added
<fche> maybe (just guessing here), all the body of that while loop refers to C functions in the library, rather than python code proper
<fche> maybe we need to trap PyCFunction_Call too with a sys/sdt.h marker
<gilfoyle> that's beyond me :)
<fche> yeah
<fche> just making a note here for us to contemplate
<gilfoyle> anyway, I really have to go now, but I'd be happy to continue tomorrow :)
<gilfoyle> (or at least hear if you've come to a consensus)
<fche> g'night
<gilfoyle> gnight! cheers :)
gilfoyle has quit [Quit: leaving]
jhg_ has joined #systemtap
<jhg_> good old texlive
<jhg_> it always makes my systemtap debian build so happy
<jhg_> .spec ${with_docs}
<jhg_> nothing
<fche> is there a problem we can help with?
<jhg_> fche: no. I am just grousing. =)
<fche> carry on :-)
<fche> # apt-get build-dep systemtap not doing enough?
<jhg_> fche: it is overdoing it, imho
<jhg_> texlive
<fche> ah you prefer a tex-free existence?
<fche> I can barely conceive it
<jhg_> on the disposable VMs I use for these types of things, I try to be gentle to folks who don't yet have wide internet pipes
<jhg_> texlive is a bit of a pig
<jhg_> I do love tex
<fche> aha
<jhg_> though that application is but mere kilobytes
<fche> here on fedora land we have almost the opposite problem - texlive has been broken up into hundreds (!!) of subpackages
<jhg_> and they usually all require each other
<fche> not quite, but it makes it tricky to ensure all -our- prereqs are met (every single .sty used -> one more subpackage required)
* jhg_ nods
<jhg_> I need more Knuth in the texlive distribution
<fche> he's busy writing books rather than software AIUI
<fche> may he live long & prosper
* jhg_ nods
<jhg_> fche: some functions are not showing up in my stap -v -l 'module("libafs").function("afs_linux_storeproc")'
<fche> are any showing up?
<jhg_> yep
jistone_ has joined #systemtap
<jhg_> hmm
<jhg_> oh
<jhg_> my fault
<jhg_> stale kernel module
wcohen_ has joined #systemtap
modem has joined #systemtap
DuncanT_ has joined #systemtap
<jhg_> maybe
modem_ has quit [*.net *.split]
wcohen has quit [*.net *.split]
ggherdov`_ has quit [*.net *.split]
palmtenor has quit [*.net *.split]
jistone has quit [*.net *.split]
DuncanT has quit [*.net *.split]
jistone_ is now known as jistone
<jhg_> my /usr/share/systemtap/runtime/transport seems to be out of sync with the 4.7 kernel
<jhg_> stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
ggherdov`_ has joined #systemtap
palmtenor has joined #systemtap
ggherdov`_ has quit [Max SendQ exceeded]
<jhg_> maybe clear cache time
<fche> nah
<fche> it could be a routinish porting problem
<jhg_> yeah... it even does it for stap -v -e 'probe begin {printf("hello world\n"); exit()}'
<jhg_> not sure why it's still complaining about Kernel function symbol table missing [man warning::symbols], when the System.map-4.7.0 exists
<fche> maybe just not where stap is looking for it?
<fche> (strace -eopen ?)
<jhg_> permission denied
<fche> obtain permission :-)
<jhg_> heh
<fche> strace denied ... what a world
<jhg_> heh. for security
<fche> anyway generally one needs to use git stap for kernels younger than the stap release
<fche> future (git+) versions of stap identify the range of supported kernel versions in 'stap -V'
<jhg_> excellent
sjas_ is now known as sjas
DuncanT_ is now known as DuncanT
ggherdov`_ has joined #systemtap
<jhg_> still have the warning even with read access
<fche> which warning?
<jhg_> WARNING: Kernel function symbol table missing [man warning::symbols]
<fche> as per session.cxx:parse_kernel_functions
<fche> only two locations are checked
<fche> $BUILDTREE/System.map and
<fche> /boot/System.map-$VERSION
<fche> where is your System.map-4.7.0 ?
<jhg_> it's in boot
<jhg_> I also put one in lib/module path
<jhg_> it finds it
<fche> maybe its permissions were too limited?
<jhg_> opens as read-only
<jhg_> does it need write perms?
<fche> no
<jhg_> it was 600 root:root before
<fche> can you cat /boot/System.map-4.7.0 ?
<jhg_> file size zero now
<jhg_> hmm. something clobbered it
<jhg_> got it =)
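[editor's sketch] The check that tripped here can be approximated from the shell: look in the two locations fche lists and reject a file that is unreadable or empty (the clobbered zero-size case above). The function name `find_system_map` and the default build tree `/lib/modules/$VERSION/build` are assumptions for illustration; stap's real lookup lives in session.cxx.

```shell
# Hypothetical helper mirroring the two locations named in the chat.
# Usage: find_system_map VERSION [BUILDTREE] [BOOTDIR]
find_system_map() {
    version=$1
    # Assumed defaults; override them to test against another tree.
    buildtree=${2:-/lib/modules/$version/build}
    bootdir=${3:-/boot}
    for candidate in "$buildtree/System.map" "$bootdir/System.map-$version"; do
        # -r: readable by the current user; -s: exists and is not empty.
        if [ -r "$candidate" ] && [ -s "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no usable System.map for $version" >&2
    return 1
}

# Report the map for the running kernel, if any; never abort the shell.
find_system_map "$(uname -r)" || true
```

A zero-length `/boot/System.map-4.7.0` fails the `-s` test, which matches the symptom: the file exists and opens read-only, yet the warning persists.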
<jhg_> why not throw /proc/kallsyms at the end of the search path?
<fche> actually we do use it in other contexts, interestingly enough
<fche> but I bet if you cat it, you'll get a bunch of zeroes
<jhg_> hmmm... multiple staps installed. weeee
<jhg_> oh
<jhg_> head /proc/kallsyms worked
<fche> with nonzero addresses?
<jhg_> yep
<fche> wow. that's a nonsensical distro configuration then
<jhg_> it is root:root 666
<jhg_> probably by my hand
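[editor's sketch] fche's "bunch of zeroes" question can be answered mechanically: when the kernel masks addresses (for example via the `kptr_restrict` sysctl), every line of /proc/kallsyms starts with an all-zero address. The function name `kallsyms_has_addresses` is hypothetical, and the sketch only inspects the first column.

```shell
# Hypothetical helper: succeed if FILE (default /proc/kallsyms) contains at
# least one symbol whose address column is not all zeroes.
# Returns 0 = real addresses seen, 1 = all masked or empty, 2 = unreadable.
kallsyms_has_addresses() {
    file=${1:-/proc/kallsyms}
    [ -r "$file" ] || return 2
    # Stop at the first nonzero address; END turns the flag into an exit code.
    awk '$1 !~ /^0+$/ { found = 1; exit } END { exit !found }' "$file"
}

if kallsyms_has_addresses; then
    echo "kallsyms exposes real addresses"
else
    echo "kallsyms is masked or unreadable"
fi
```

Running it both as a normal user and as root shows whether the masking depends on privileges, which is the distinction being probed in the exchange above.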
ggherdov`_ has quit [Excess Flood]
<jhg_> I may need to revert to 4.6
<jhg_> stap -V is reporting up to 4.6-rc
ggherdov`_ has joined #systemtap
<fche> that's just a matter of updating the message :)
<jhg_> all right!
<jhg_> hello world is passing
* fche agrees we should process kallsyms too, along with /boot/System.map* ... would you like to report it as a formal bug/rfe in the tracker, or shall I?
<jhg_> I got it
<jhg_> this part of kprobes?
<fche> component=translator
<fche> (but it doesn't matter that much)
<fche> thanks dude.
<jhg_> np =)
<jhg_> all right... now to track down this kernel module
<fche> righto
<fche> that strace bit can probably be turned on easily again if you care
<jhg_> where was that again?
<fche> cat /proc/sys/kernel/yama/ptrace_scope
<jhg_> strace worked
<fche> maybe
<jhg_> that's a 1
<fche> as root it'll work; if you echo 0 in there, ptrace should work as before.
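[editor's sketch] The value read from `/proc/sys/kernel/yama/ptrace_scope` can be decoded as below. The meanings follow the kernel's Yama documentation as the editor understands it; the function name `describe_ptrace_scope` is hypothetical.

```shell
# Hypothetical helper: translate a Yama ptrace_scope value into words.
describe_ptrace_scope() {
    case $1 in
        0) echo "classic: a process may ptrace any process running under the same uid" ;;
        1) echo "restricted: a process may ptrace only its own descendants" ;;
        2) echo "admin-only: ptrace attach requires CAP_SYS_PTRACE" ;;
        3) echo "no attach: ptrace attach is disabled until reboot" ;;
        *) echo "unknown ptrace_scope value: $1" >&2; return 1 ;;
    esac
}

scope_file=/proc/sys/kernel/yama/ptrace_scope
# The file is absent on kernels built without Yama, so check before reading.
if [ -r "$scope_file" ]; then
    describe_ptrace_scope "$(cat "$scope_file")" || true
fi
```

With a value of 1, `strace CMD` still works because CMD is started as a child of strace; attaching to an already running process with `strace -p` is what the restriction blocks for non-root users.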
<fche> weird, if that's the issue ... https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=712740
<fche> pressure for security from kernel folks leads to restrictions; complaints from users result in loosenings
<fche> it was ever thus
<jhg_> as far as I can tell, security is about making things not work
<jhg_> to a certain scope
<fche> it's a terrible obstacle - exactly when it impacts our work :0 otherwise it's a godsend
<fche> hm but actually strace CMD shouldn't have been nuked by even that yama widget.
<jhg_> it more-or-less worked
<jhg_> when I ran it before
<jhg_> though thanks for bringing that up
<jhg_> all right
<jhg_> I see the symbol I am looking for!
<jhg_> woohoo!
<jhg_> now... what was I doing again?
<fche> er something with afs?
<jhg_> heh
<jhg_> getting splice() to work with ERESTARTSYS
<jhg_> copy big file, hit ctrl-c, watch the fireworks
pwithnall has quit [Quit: pwithnall]