#systemtap on 2020-04-15 — irc logs at freenode.irclog.whitequark.org

2015-11-12 23:18 fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged

00:03 przemoc has joined #systemtap

01:36 hpt has joined #systemtap

02:13 mjw has quit [Quit: Leaving]

03:39 orivej has joined #systemtap

04:34 _whitelogger has joined #systemtap

06:14 Amy1 has quit [Ping timeout: 246 seconds]

06:15 Amy1 has joined #systemtap

06:32 khaled has joined #systemtap

07:36 sscox has quit [Ping timeout: 252 seconds]

07:48 sscox has joined #systemtap

08:26 sscox has quit [Ping timeout: 252 seconds]

08:38 sscox has joined #systemtap

09:15 hpt has quit [Ping timeout: 240 seconds]

09:15 pviktori has quit [Remote host closed the connection]

09:16 pviktori has joined #systemtap

11:34 mjw has joined #systemtap

13:32 sapatel has joined #systemtap

13:51 tromey has joined #systemtap

14:00 amerey has joined #systemtap

15:22 khaled has quit [Remote host closed the connection]

17:15 <Amy1> how to run stap xx.stp yy.bin and run yy.bin in another path?

17:16 <Amy1> how to run "stap xx.stp yy.bin" and run "yy.bin" in another path?

17:16 <fche> you mean stap -c yy.bin &; cp yy.bin zz.bin; ./zz.bin ?

17:17 <Amy1> I mean that how to run "stap xx.stp yy.bin" and run "yy.bin" in another path?

17:17 <fche> right now, process probes are path-match based

17:18 <fche> so probe process("./a.out") ... and the running a.out from another path - even the same file - will not match

17:18 <fche> BUT

17:18 <fche> https://sourceware.org/bugzilla/show_bug.cgi?id=25568 should in fact fix this ^^ amerey

17:18 <Amy1> so, I must run stap and yy.bin in the same path, right?

17:18 <fche> process("path") probes could be rewritten to process(buildid) probes

17:18 <fche> Amy1, no, the constraint is not between the location of stap and yy.bin

17:19 <fche> it's between the location of yy.bin when stap processes the probe and the location of yy.bin when you run yy.bin

17:20 <Amy1> It's confusing

17:21 <Amy1> So how to run "stap xx.stp yy.bin" in ~/dir1 and run "yy.bin" in ~/dir2?

17:21 <fche> "stap xx.stp ~/dir2/yy.bin"

17:22 <Amy1> It's not coll

17:22 <Amy1> It's not cool

17:22 <Amy1> stap xx.stp "*yy.bin" is cool

17:23 <fche> can you show xx.stp ?

17:24 <Amy1> probe process(@1).function("fun1")

17:24 <fche> if ~/dir2/ is in your PATH, then it should also work I think

17:25 <Amy1> PATH environment variable?

17:26 <fche> yes

17:28 <Amy1> no, it not work

17:29 <fche> seems to work for me

17:29 <fche> export PATH=$PATH:$HOME/dir2

17:29 <fche> stap xx.stp yy.bin (from any old directory)

17:29 <Amy1> I think stap should decouple with the executable binary.

17:30 <Amy1> fche: yes

17:30 <Amy1> but it no work

17:30 <Amy1> it not work

17:32 <fche> hm works here for me

17:32 <fche> what do you see?

17:33 <Amy1> nothing

17:33 <fche> does the stap job start correctly?

17:42 <Amy1> stap -e 'probe process("*ls").function("main"){printf("hello world\n");}'

17:42 <Amy1> such as, semantic error: glob *ls error (3)

17:43 <fche> you can't combine globs and paths

17:45 <Amy1> ok, it works.

17:45 <Amy1> fche: thanks.

17:45 <amerey> Amy1, in the future being able to give a build-id instead of an executable path will make this easier

17:45 <fche> righto

17:45 <fche> amerey, stap could map process("path") things to buildid-based probes too

17:45 <fche> and esp stap -c ./cmd process.FOO probes too

17:45 <fche> probably as a (default) option

17:46 <amerey> fche, true it shouldn't take much to add that

17:46 <amerey> will do that

17:47 <Amy1> amerey: when you makes it, tell me, thanks.

17:48 <amerey> Amy1, sure

17:57 <Amy1> a problem here

17:58 <Amy1> If multi paths has yy.bin, so how to do?

17:58 <Amy1> If multi paths have yy.bin, so how to do?

17:59 <Amy1> I want to run yy.bin in different paths.

17:59 <Amy1> They are different version of yy.bin.

18:02 <fche> set your PATH carefully

18:04 <Amy1> no

18:05 <Amy1> If there are only one ~/dir2/yy.bin, then it's ok.

18:06 <Amy1> I want to know if there are two yy.bin. one is ~/dir1/yy.bin, another is ~/dir2/yy.bin.

18:06 <Amy1> I want to open one stap xx.stp,then run yy.bin in dir1, and then run yy.bin in dir2.

18:07 <Amy1> But the stap can trace the two.

18:07 <Amy1> and not to stop and modify PATH.

18:08 <fche> two stap probes then

18:08 <fche> or probe process("/path/*/*/yy.bin") glob

18:14 <Amy1> probe process("*/yy.bin") ?

18:16 <Amy1> semantic error: glob /home/amy/*/*/MD error

18:24 <Amy1> fche: when I add ~/dir1 and ~/dir2 to PATH, it only works for ~/dir1/yy.bin

18:24 <Amy1> it does not work for ~/dir2/yy.bin

18:25 <fche> you'd need a glob pattern that includes both binaries

18:25 <fche> or one probe per binary
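
Amy1's report that only one of the two copies is traced is consistent with ordinary PATH lookup, where a bare name resolves to the first matching directory and nothing else. A minimal shell sketch of that behavior follows. The dir1, dir2, and yy.bin names mirror the conversation; the temporary root and variable names are hypothetical. It demonstrates the shell's lookup only, on the assumption that stap resolves a bare program name in a similar way.

```shell
#!/bin/sh
# Build two throwaway directories that each hold an executable named yy.bin.
root=$(mktemp -d)
mkdir -p "$root/dir1" "$root/dir2"
for d in dir1 dir2; do
    printf '#!/bin/sh\necho "running from %s"\n' "$d" > "$root/$d/yy.bin"
    chmod +x "$root/$d/yy.bin"
done

# Put both directories on PATH, dir1 first.
PATH="$root/dir1:$root/dir2:$PATH"

# A bare name resolves to the first match only; dir2/yy.bin is never chosen.
resolved=$(command -v yy.bin)
echo "$resolved"
```

Reordering PATH changes which copy is found, but a single bare name never covers both, which is why fche points to a glob or one probe per binary.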

18:30 khaled has joined #systemtap

18:31 <Amy1> so how to write glob to match ~/dir1/yy.bin and ~/dir2/yy.bin?

18:31 <fche> ~/dir*/yy.bin

18:31 <fche> or ~/{dir1,dir2}/yy.bin I think

18:32 <Amy1> ~/*.yy.bin ?

18:32 <Amy1> ~/*/yy.bin ?

18:32 <Amy1> not work?

18:32 <fche> well ~ may not work there

18:32 <fche> /home/YOU/*/yy.bin ?

18:33 <Amy1> this is also not work.

18:33 <Amy1> semantic error: glob /home/amy/*/*/MD error

18:33 <fche> work out a glob that works in the shell

18:34 <fche> without ~

18:34 <fche> or use separate probes per binary
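
fche's suggestion to work out the glob in the shell first can be sketched as below. The directory layout is a hypothetical stand-in built under a temporary root. The point is only to confirm that an absolute pattern (no ~) expands to every binary you expect before handing the same pattern to process("..."). Whether stap's matcher accepts every shell feature, such as braces, is not settled in the conversation.

```shell
#!/bin/sh
# Hypothetical layout: two directories with yy.bin, one unrelated directory.
root=$(mktemp -d)
mkdir -p "$root/dir1" "$root/dir2" "$root/other"
touch "$root/dir1/yy.bin" "$root/dir2/yy.bin" "$root/other/zz.bin"

# Expand an absolute glob; "set --" stores the matches as positional parameters.
set -- "$root"/dir*/yy.bin

# If nothing matched, the shell leaves the pattern as a literal string,
# so check that the first result really exists before trusting the count.
if [ -e "$1" ]; then
    count=$#
else
    count=0
fi

printf '%s\n' "$@"
echo "matches: $count"
```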

18:35 orivej has quit [Ping timeout: 264 seconds]

18:39 <Amy1> if there are no pattern about path1 and path2, how to write the glob?

18:39 <fche> {/path1,/path2} maybe

18:39 <fche> or

18:39 <fche> one probe per path

18:39 <fche> keep trying stuff
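
The "one probe per path" option can also be written as a single script with two comma-separated probe points sharing one handler. This is a sketch, assuming the /home/amy/dir1 and /home/amy/dir2 locations and the fun1 function name used earlier in the conversation; the file name two-paths.stp is hypothetical. The shell snippet only writes the script to a temporary directory. Running it needs SystemTap, suitable privileges, and the binaries in place.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Write a SystemTap script with one handler attached to two probe points.
cat > "$workdir/two-paths.stp" <<'EOF'
# Two probe points, one handler: fires for either copy of yy.bin.
probe process("/home/amy/dir1/yy.bin").function("fun1"),
      process("/home/amy/dir2/yy.bin").function("fun1")
{
  printf("%s hit fun1\n", execname())
}
EOF

# Count the probe points written, as a quick sanity check.
probe_points=$(grep -c 'process("' "$workdir/two-paths.stp")
echo "wrote $probe_points probe points to $workdir/two-paths.stp"

# To run (not executed here):
#   stap "$workdir/two-paths.stp"
```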

18:43 yogananth has quit [Read error: Connection reset by peer]

18:44 <Amy1> semantic error: no match

18:48 <Amy1> if I use process("/home/amy/*/MD") and run MD in /home/amy/dir2

18:48 <Amy1> the result is wrong

18:48 <Amy1> all values are 0

18:50 <fche> should work ... stap -t to trace counts of probe hits

18:55 <Amy1> WARNING: missing unwind/symbol data for module

18:55 <Amy1> all printed values are 0

18:56 <Amy1> stap traced values are all error

19:03 <Amy1> fche: ?

19:12 <agentzh> Amy1: maybe you can try stap -vvvv to see more details on your side.

19:12 <agentzh> i've never used globs in probes myself though.

19:15 <agentzh> fche: i see stap currently uses vmalloc for maps and ctxs. but the mempool and task finder still use kmalloc for memory blocks potentially larger than the page size. shall we switch the latter two to vmalloc?

19:16 <agentzh> i also see kmalloc is used for potentially large blocks in stap runtime's addr-map. is it worth migrating too?

19:17 <agentzh> we often run into ko loading failures because the mempool requires large physically contiguous memory blocks when allocating ctl buffers for the channels.

19:18 <fche> we've flipped this way and that w.r.t. kmalloc and vmalloc and stuff over the years

19:18 <fche> how large memory are we talking about ?

19:18 <agentzh> 78KB for a single block.

19:19 <agentzh> at maximum for one of our stp scripts.

19:20 <agentzh> i traced all the kmalloc allocations for the .ko module.

19:20 <Amy1> agentzh: thanks

19:21 <agentzh> any specific reasons for using kmalloc for mempool and task finder? do they involve DMA in any way?

19:22 <agentzh> right now we have to flush page cache or to do a memory compaction to allow the .ko to load successfully.

19:23 <agentzh> it's especially common on machines with small memory or long running servers.

19:24 <fche> we can't take kernel-side soft page faults from probe context, that's one complication

19:24 <agentzh> seems like they were not invoked by probe handlers, but init_module handlers?

19:25 <fche> at reference time I mean

19:25 <fche> ISTR some types of kernel allocated memory still uses an MMU based mapping that's filled via kernel-side page faults

19:25 <agentzh> hmm, but stap maps are already using vmalloc?

19:25 <agentzh> stap maps are surely referenced from within probe handlers already.

19:25 <fche> yeah

19:26 <fche> so which allocation proper are you looking at ?

19:26 <agentzh> vmalloc avoids memory blocks from being swapped out.

19:26 <agentzh> so there won't be any page faults anyway?

19:26 <fche> (kernel can't swap itself anyway)

19:26 <agentzh> a sec

19:27 <agentzh> runtime/mempool.c:60 m = (struct _stp_mem_buffer *)_stp_kmalloc(alloc_size);

19:28 <agentzh> m = (struct _stp_mem_buffer *)_stp_kmalloc(alloc_size);

19:29 <agentzh> and also: runtime/linux/task_finder2.c:1197 vma_cache = _stp_kmalloc(sizeof(struct vma_cache_t) * file_based_vmas);

19:31 <agentzh> and finally: runtime/linux/addr-map.c:242 new_map = _stp_kmalloc(sizeof(*new_map) + sizeof(*new_entry) * (old_size + 1));

19:31 <agentzh> 3 places.

19:32 <agentzh> btw what happens when a soft page fault is triggered in a probe handler? panic? freezes?

19:35 <fche> we disable interrupts routinely while running probe handlers; the kernel is/was unhappy about taking faults at such times

19:37 <agentzh> i see the stap code always uses kzalloc() so the vmalloc'd memory pages are already in the kernel master page tables?

19:37 <agentzh> so no page faults will be possible?

19:37 <fche> my memory is not fresh on the rationale

19:37 <fche> feel free to experiment & report findings !

19:38 <agentzh> okay, i'll do some experiments.

19:38 <agentzh> so what do you mean by "unhappy". we'll just get 0 when a soft page fault is needed?

19:39 <agentzh> sorry, let me rephrase. when the interrupts are disabled, what is the behavior when reading a page requiring a page fault?

19:40 <agentzh> we get an EFAULT error code?

19:40 * fche remembers crashes

19:40 <agentzh> ah

19:44 <agentzh> fche: another quick question, i found that when using kprobes to probe on a kernel function in the .ko module generated by stap, print_backtrace() fails to provide the full kernel backtrace. but calling the kernel function dump_stack() works (though with "?" frame prefixes). is it a known problem?

19:44 <agentzh> wondering if we can simply borrow the code from dump_stack()/

19:45 <agentzh> ?

19:45 <agentzh> the tracee module was compiled with debuginfo and -fno-omit-frame-pointer already.

19:48 <fche> stap --all-modules ...

19:48 <fche> or stap -d tracee-module

19:48 <agentzh> already added both, same thing.

19:48 <agentzh> tested on both fedora 28 and 30 (x86_64).

19:49 <fche> it oughta work in that case

19:49 <fche> (we don't need frame pointers; we use dwarf unwinding)

19:51 <agentzh> i'll add some debugging code inside print_backtrace() then.

19:51 <fche> there is some -D already

19:51 <agentzh> okay

19:52 <agentzh> another thing i've noted is that it seems like when the tracee module is loaded *after* the tracer module, symbols won't get used for probefunc() or print_backtrace().

19:52 <agentzh> but when the tracee is loaded before the tracer, symbols look fine.

19:52 <agentzh> do we need something similar to task finder for dynamically loaded kernel modules?

19:54 <fche> hm, there is a module_notifier chain that is/was there that we used for this

19:54 <agentzh> interesting. i'll dig it up to see why the symbols are not used.

19:54 <agentzh> thanks

19:56 <agentzh> fche: i've checked the userland memory page causing a read fault in stap scripts, it has read permissions in /proc/PID/maps.

19:56 <agentzh> it's really strange.

19:57 <agentzh> /proc/PID/pagemap also shows that page is present and not swapped and has a page frame number.

19:58 <agentzh> i traced the user_long_error() tapset function being executed, it seems EFAULT was thrown by the kernel's __get_user() API.

19:58 <agentzh> any suggestions on how to trace this further?

19:58 <agentzh> many thanks

19:58 <agentzh> and it happens on bare metal.

19:58 <agentzh> 100% reproducible for that particular user process and userland address.

19:59 <fche> works on other processes? other addresses?

19:59 <agentzh> i've been scratching my head for days on this issue.

19:59 <agentzh> yeah, working on other processes.

19:59 <agentzh> *works

20:00 <agentzh> that memory is not shared and is exclusively mapped for the target process.

20:00 <agentzh> *that memory page

20:00 <agentzh> it's swapbacked.

20:00 <agentzh> but it is neither in the swapcache nor swapped out.

20:01 <agentzh> could this be a hardware issue?

20:01 <agentzh> like a faulty ram stick or DIMM slot?

20:02 <fche> i doubt that'd show this way

20:02 <agentzh> okay

20:04 <agentzh> that box is using 4.15 kernel, but recent.

20:04 <agentzh> *quite recent

20:05 <agentzh> 4.15.18, to be exact.

20:23 Amy1 has quit [Killed (Sigyn (Stay safe off irc))]

21:10 tromey has quit [Quit: ERC (IRC client for Emacs 28.0.50)]

21:57 amerey has quit [Quit: Leaving]