fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged
hpt has joined #systemtap
<agentzh> hi guys, i've noticed that stap may sometimes emit garbled stdout output with lots of successive \0 even for the simplest printf()/println() scripts. it's especially common when the machine is under load. is it a known issue?
<agentzh> i guess there might be some race or buffer mishandling in staprun's relayfs reader thread. any hints to debug such hard-to-reproduce issues in staprun?
gromero has quit [Ping timeout: 272 seconds]
gromero has joined #systemtap
derek0883 has joined #systemtap
_whitelogger has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
derek0883 has quit [Remote host closed the connection]
orivej has joined #systemtap
orivej has quit [Ping timeout: 265 seconds]
orivej has joined #systemtap
khaled_ has joined #systemtap
_whitelogger has joined #systemtap
orivej has quit [Read error: Connection reset by peer]
orivej_ has joined #systemtap
orivej_ has quit [Ping timeout: 256 seconds]
orivej has joined #systemtap
hassan64 has joined #systemtap
orivej has quit [Ping timeout: 258 seconds]
<hassan64> I need a *.stp file that gives me the syscalls, PID, arguments being passed, return values, etc. when I run a sample on a target machine
orivej has joined #systemtap
<hassan64> I am using a command on target machine as "staprun -c /root/test.sh /root/test.stp"
<hassan64> in test.sh
<hassan64> Any help
<hassan64> "/root/analyzed_bin > /root/prog.log"
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
<hassan64> Cam someone answer my query
<hassan64> *Can
orivej has quit [Ping timeout: 260 seconds]
orivej has joined #systemtap
<hassan64> can anybody help. I got an error while translating to *.ko module
<agentzh> hassan64: you can check out the documentation here: https://sourceware.org/systemtap/langref.pdf
<agentzh> it should provide everything you need to achieve what you want.
<hassan64> Thanks, but when I compile *.stp to *.ko, I get the error "error: ‘__NR_fadvise64’ undeclared"
<hassan64> I have been stuck on this error for a long time
<hassan64> Please help me solve this problem
<hassan64> using systemtap 4.1/0.177
<hassan64> Can anybody help me solve this problem
<hassan64> It is related to systemtap
<agentzh> ah, cross compilation for arm...
<agentzh> maybe try compiling natively on an arm system?
<hassan64> there is no compile in target machine
<hassan64> *compiler
<hassan64> so only option is to have cross compilation
<hassan64> But I do not know why it is causing an error. Will any change at the kernel level fix the issue, or is the issue with systemtap itself
<hassan64> ?
<hassan64> Any help agentzh
orivej has quit [Ping timeout: 264 seconds]
orivej_ has joined #systemtap
<agentzh> hassan64: not sure. i haven't tried cross-compiling stap modules targeting armv5.
<agentzh> sorry, i cannot really help here.
<hassan64> But what is the error all about
<agentzh> btw, you may want to try the git master version. 4.1 looks old to me.
<hassan64> Can you guess, is it an error due to some kernel settings OR systemtap itself
<hassan64> send me the link
<agentzh> it could be either your incomplete armv5 kernel build tree, or the lack of support for this version of kernel in the systemtap runtime.
<hassan64> Do you want my kernel configuration?
<agentzh> btw, you can use stap -vvvv to see more details.
<hassan64> to check if I have incomplete settings
<agentzh> it should be defined in such locations in your kernel tree.
<agentzh> maybe it's your strace.stp script referencing the syscall.
<agentzh> you can always check.
<agentzh> seems like probe nd_syscall.* or probe nd_syscall.*.return introduced the fadvise64 syscall.
<agentzh> maybe you can explicitly list all the syscalls your arm kernel actually supports via syntax like "probe syscall.open, syscall.openat, ... { ... }"
<agentzh> for some reason, nd_syscall.* includes the fadvise64 syscall but it is absent in your arm kernel tree.
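A minimal sketch of the workaround agentzh suggests, written in Python only to show the shape of the resulting probe statement. The helper name `build_probe` is hypothetical, and the default body assumes the syscall tapset exposes `name` and `argstr` in these probes; adjust it to whatever your script actually prints.

```python
def build_probe(syscalls, body='printf("%s(%s)\\n", name, argstr)'):
    """Return a SystemTap probe statement that lists each syscall explicitly.

    Listing probe points by name avoids the wildcard (syscall.* or
    nd_syscall.*) pulling in syscalls the target kernel does not define.
    The default body is an assumption, not taken from strace.stp.
    """
    if not syscalls:
        raise ValueError("need at least one syscall name")
    points = ", ".join(f"syscall.{s}" for s in syscalls)
    return f"probe {points} {{ {body} }}"


if __name__ == "__main__":
    # Only list syscalls that the target kernel is known to provide.
    print(build_probe(["open", "openat", "read"]))
```

The generated text can be pasted into a .stp file in place of the wildcard probe; which syscalls belong in the list depends on the target kernel.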
<agentzh> sorry, end of day for me. gotta get some sleep. good luck.
orivej_ has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
orivej_ has joined #systemtap
<hassan64> agentzh: I am compiling latest systemtap (master version) with command as "./configure --with-elfutils=$(pwd)/../elfutils-0.180"
<hassan64> make &&make install (with sudo)
<hassan64> But when I use stap -V , It comes as "Systemtap translator/driver (version 4.3/0.170, commit release-4.3-2-g7a7016a121c0 + changes)"
<hassan64> 4.3 is okay, but what about elfutils? It should have been 0.180, shouldn't it?
<hassan64> Please guide me
<mjw> --with-elfutils is silently ignored
<hassan64> Please help me compiling this .
<hassan64> Or is systemtap in fact compiled with elfutils 0.180 but not showing it
<mjw> You'll have to build and install elfutils yourself and then configure systemtap with CFLAGS="-g -O2 -I.../where/you/installed elfutils headers" LDFLAGS="-L.../where/you/installed elfutils libs"
<hassan64> Can you please send me complete command. Thanks
<mjw> hassan64, how did you configure/build/install elfutils?
<hassan64> ./configure --prefix=/home/hassan/Desktop/sandbox_design/elf_dir/
<hassan64> make && make install
<hassan64> is it right >
<hassan64> ?
<mjw> ok, then for systemtap ./configure CFLAGS="-g -O2 -I/home/hassan/Desktop/sandbox_design/elf_dir/include" LDFLAGS="-L/home/hassan/Desktop/sandbox_design/elf_dir/lib" should work
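A small sketch of how the configure line mjw gives follows from the elfutils install prefix. The helper `configure_args` is hypothetical; it only assembles the arguments and assumes the prefix contains `include` and `lib` subdirectories, as in the command above.

```python
import os


def configure_args(prefix):
    """Build the ./configure argument list for a private elfutils prefix.

    CFLAGS gets the header directory (-I) and LDFLAGS the library
    directory (-L); "-g -O2" stays in CFLAGS because those are compiler
    options, not linker options.
    """
    include_dir = os.path.join(prefix, "include")
    lib_dir = os.path.join(prefix, "lib")
    return [
        "./configure",
        f"CFLAGS=-g -O2 -I{include_dir}",
        f"LDFLAGS=-L{lib_dir}",
    ]


if __name__ == "__main__":
    for arg in configure_args("/home/hassan/Desktop/sandbox_design/elf_dir/"):
        print(arg)
```

Passing the result as a list to something like `subprocess.run(args, check=True)` from the systemtap source directory would avoid shell quoting problems with the spaces inside CFLAGS.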
<hassan64> Shouldn't "-g -O2" be with LDFLAGS ?
<hassan64> ?
<hassan64> still getting "Systemtap translator/driver (version 4.3/0.170, commit release-4.3-2-g7a7016a121c0 + changes)"
<hassan64> where 0.180 goes
<hassan64> "./configure CFLAGS="-g -O2 -I/home/hassan/Desktop/sandbox_design/elf_dir/include" LDFLAGS="-L/home/hassan/Desktop/sandbox_design/elf_dir/lib"
derek0883 has joined #systemtap
<hassan64> mjw: Any help please
orivej has joined #systemtap
orivej_ has quit [Ping timeout: 256 seconds]
derek0883 has quit [Ping timeout: 256 seconds]
<mjw> hassan64, export LD_LIBRARY_PATH=/home/hassan/Desktop/sandbox_design/elf_dir/lib
<hassan64> after this, do configuration again
<hassan64> ?
<mjw> No, just run stap --version
<hassan64> same command as before
<hassan64> ok
<hassan64> should I set "export LD_LIBRARY_PATH=/home/hassan/Desktop/sandbox_design/elf_dir/lib" before configuration ?
<hassan64> I think I have messed up.
<hassan64> Systemtap translator/driver (version 4.3/0.180/0.170, commit release-4.3-2-g7a7016a121c0 + changes)
<mjw> no that is good
<mjw> It says it is using 0.180
<mjw> although it still claims to have been built against 0.170
<mjw> which is odd, so maybe my configure instructions were wrong.
<mjw> But you have a systemtap stap command that uses elfutils 0.180 there.
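A minimal sketch of reading the version banner the way mjw interprets it here: with two elfutils numbers, the first is the library in use at run time and the second is the one stap was built against. The function name `parse_stap_version` is hypothetical, and treating a single elfutils number as both values is an assumption.

```python
import re


def parse_stap_version(banner):
    """Split a stap version banner into systemtap and elfutils versions."""
    m = re.search(r"version (\d+(?:\.\d+)*)((?:/\d+(?:\.\d+)*)+)", banner)
    if m is None:
        raise ValueError("no version field found")
    elf = m.group(2).strip("/").split("/")
    return {
        "systemtap": m.group(1),
        "elfutils_runtime": elf[0],   # library actually loaded
        "elfutils_built": elf[-1],    # library seen at build time
    }


if __name__ == "__main__":
    banner = ("Systemtap translator/driver (version 4.3/0.180/0.170, "
              "commit release-4.3-2-g7a7016a121c0 + changes)")
    print(parse_stap_version(banner))
```

A mismatch between the two elfutils fields would indicate the situation described above: the build picked up one elfutils and LD_LIBRARY_PATH selects another at run time.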
<hassan64> Btw, this export command should be given before configuration , right ?
<mjw> export LD_LIBRARY_PATH= says "use the runtime libraries found in this directory"
<mjw> hassan64, maybe it would help the build to realize it needs to compile/link against what is in that dir, but in theory the -I and -L flags should have done that.
<mjw> basically you have a somewhat unusual setup that nobody has tested. Normally the distro installed elfutils is fine/new enough.
<hassan64> so am I using the updated version of systemtap compiled with the latest elfutils ?
<mjw> It certainly looks so :)
<hassan64> as per "Systemtap translator/driver (version 4.3/0.180/0.170, commit release-4.3-2-g7a7016a121c0 + changes)" says
<hassan64> But then why do I get an error when I convert *.stp to *.ko
orivej has quit [Ping timeout: 258 seconds]
<hassan64> "error: invalid application of ‘sizeof’ to incomplete type ‘struct old_timex32’"
orivej has joined #systemtap
<hassan64> But works fine in systemtap 4.1/0.177
<mjw> hassan64, with the same kernel?
<hassan64> yes
<mjw> wow, also using arm64. you are living dangerously :)
<mjw> It doesn't ring a bell, this is code that hasn't changed since 2016
<hassan64> mjw: Now look at this
<hassan64> strace.stp was compiled with systemtap 4.1. No error this time
hpt has quit [Ping timeout: 246 seconds]
<hassan64> Any help mjw
sscox has quit [Ping timeout: 240 seconds]
orivej has quit [Ping timeout: 265 seconds]
orivej_ has joined #systemtap
orivej_ has quit [Ping timeout: 246 seconds]
orivej has joined #systemtap
<mjw> hassan64, is that the strace.stp example?
<hassan64> yes
<mjw> I don't have a clue
<mjw> I can try on my arm64 server, but it is really slow
<hassan64> can you please try this out
<mjw> sorry, my arm server is too slow and doesn't have the necessary linux headers/sources installed to replicate
<hassan64> Actually, I think the issue is with systemtap
<hassan64> Same kernel version. One fails, the other succeeds
<hassan64> Whenever you have time please check this and let me know thanks
orivej has quit [Ping timeout: 258 seconds]
orivej has joined #systemtap
hassan64 has quit [Remote host closed the connection]
orivej has quit [Ping timeout: 260 seconds]
orivej has joined #systemtap
<mjw> I suspect it is some change in one of the syscall tapsets.
orivej has quit [Ping timeout: 260 seconds]
orivej_ has joined #systemtap
orivej_ has quit [Ping timeout: 240 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 246 seconds]
orivej has joined #systemtap
wcohen has quit [Remote host closed the connection]
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
wcohen has joined #systemtap
sscox has joined #systemtap
tromey has joined #systemtap
orivej_ has joined #systemtap
orivej has quit [Ping timeout: 246 seconds]
orivej_ has quit [Ping timeout: 256 seconds]
orivej has joined #systemtap
orivej has quit [Ping timeout: 260 seconds]
orivej has joined #systemtap
hassan64 has joined #systemtap
<hassan64> please fix this warning
<hassan64> WARNING: never-assigned local variable 'retstr'
<fche> what script triggers it?
<hassan64> strace.stp
orivej has quit [Ping timeout: 258 seconds]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej_ has joined #systemtap
orivej has quit [Ping timeout: 265 seconds]
orivej_ has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
sapatel has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
hassan64 has quit [Remote host closed the connection]
orivej has quit [Ping timeout: 246 seconds]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej has quit [Ping timeout: 240 seconds]
orivej has joined #systemtap
derek0883 has joined #systemtap
orivej has quit [Ping timeout: 256 seconds]
derek0883 has quit [Remote host closed the connection]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
derek0883 has joined #systemtap
<agentzh> fche: will you please backlog and see my messages above (sent late yesterday)? many thanks!
<fche> agentzh, looking
<fche> the first issue, fix-vma-leak, I wouldn't bother with a NEWS blurb for it as a mere bug fix, but rather a sourceware.org/bugzilla entry you could open & close
<fche> the comment looks good, so feel free to comment runtime/vma.c
<fche> commit I mean
<agentzh> k
<agentzh> thanks for your reply.
<agentzh> for the garbled output with lots of \0 in stap's output, it seems more likely to be on the kernel client side of relayfs than on the staprun side. the latter looks so simple it's hard to have issues.
<fche> could it be a length mismatch problem?
<fche> have a reproducer? (I know you said it's worse under heavy load but still)
<agentzh> what do you mean by "length mismatch problem"?
<fche> meaning we may be copying some sort of string-terminating \0
<fche> length off-by-one kind of thing
orivej_ has joined #systemtap
orivej has quit [Ping timeout: 264 seconds]
orivej_ has quit [Ping timeout: 256 seconds]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
<agentzh> fche: a script that produces such things is as simple as this: https://gist.github.com/agentzh/b44384ad9f3804d85ddfbf771094a718
<agentzh> it only outputs literal strings.
<agentzh> the output had a substring of "\x{00}\x{00}\x{00}\x{00}\x{00}\x{00}\x{00}\x{00}\x{00}\x{00}"
<agentzh> quoted using the \x{} syntax by my other script.
<agentzh> *perl script
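A minimal sketch of the kind of check agentzh describes, written in Python rather than Perl: it finds runs of NUL bytes in captured stap output and renders them in the same \x{00} notation. The function names and the `min_len` threshold are assumptions, not taken from agentzh's script.

```python
import re


def find_nul_runs(data, min_len=2):
    """Return (offset, length) for each run of at least min_len NUL bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(data)]


def quote_nuls(data):
    """Render bytes as text with every NUL byte shown as \\x{00}."""
    text = data.decode("utf-8", errors="replace")
    return text.replace("\x00", "\\x{00}")


if __name__ == "__main__":
    # Simulated capture: a literal string, ten stray NULs, another string.
    captured = b"hello\n" + b"\x00" * 10 + b"world\n"
    for offset, length in find_nul_runs(captured):
        print(f"NUL run at byte {offset}, length {length}")
    print(quote_nuls(captured))
```

Run against a file of redirected stap stdout, the reported offsets could help show whether the NUL runs line up with buffer boundaries, which would be one way to test the length mismatch idea.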
<agentzh> it's very rare.
<agentzh> tried both the old kernel on centos 7 and new kernel on fedora 28.
orivej has quit [Quit: orivej]
derek0883 has quit [Ping timeout: 264 seconds]
orivej has joined #systemtap
mjw has quit [Quit: Leaving]
tromey has quit [Quit: ERC (IRC client for Emacs 28.0.50)]
derek0883 has joined #systemtap
irker909 has joined #systemtap
<irker909> systemtap: fche systemtap.git:master * release-4.3-3-gac8f2b7c9 / NEWS configure configure.ac doc/SystemTap_Beginners_Guide/en-US/Book_Info.xml systemtap.spec testsuite/configure testsuite/configure.ac: configury: post-release version bump
derek0883 has quit [Ping timeout: 260 seconds]
derek0883 has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 180 seconds.]
orivej has joined #systemtap
orivej has quit [Quit: No Ping reply in 210 seconds.]
orivej has joined #systemtap
orivej_ has joined #systemtap
orivej has quit [Ping timeout: 240 seconds]
derek088_ has joined #systemtap
orivej has joined #systemtap
orivej_ has quit [Ping timeout: 264 seconds]
derek0883 has quit [Ping timeout: 258 seconds]
derek0883 has joined #systemtap
derek088_ has quit [Remote host closed the connection]
orivej has quit [Ping timeout: 264 seconds]
orivej has joined #systemtap
khaled_ has quit [Quit: Konversation terminated!]
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
derek0883 has quit [Ping timeout: 240 seconds]
derek0883 has joined #systemtap
derek0883 has quit [Remote host closed the connection]
derek0883 has joined #systemtap
sapatel has quit [Remote host closed the connection]
orivej has quit [Read error: Connection reset by peer]
orivej_ has joined #systemtap