#systemtap on 2021-04-07 — irc logs at freenode.irclog.whitequark.org

2015-11-12 23:18 fche changed the topic of #systemtap to: http://sourceware.org/systemtap; email systemtap@sourceware.org if answers here not timely, conversations may be logged

00:42 khaled has quit [Quit: Konversation terminated!]

00:50 orivej has quit [Ping timeout: 265 seconds]

01:19 hpt has joined #systemtap

01:54 sscox has quit [Ping timeout: 240 seconds]

02:09 orivej has joined #systemtap

02:14 orivej has quit [Ping timeout: 240 seconds]

02:46 amerey has quit [Remote host closed the connection]

03:34 orivej has joined #systemtap

06:07 orivej has quit [Ping timeout: 240 seconds]

07:02 khaled has joined #systemtap

08:51 ggherdov has quit [Ping timeout: 248 seconds]

08:54 ggherdov has joined #systemtap

09:25 fdalleau_away is now known as fdalleau

11:16 mjw has joined #systemtap

12:29 orivej has joined #systemtap

13:04 hpt has quit [Ping timeout: 246 seconds]

13:14 orivej has quit [Ping timeout: 240 seconds]

14:04 amerey has joined #systemtap

15:02 tromey has joined #systemtap

16:23 mjw has quit [Quit: Leaving]

18:51 fdalleau is now known as fdalleau_away

18:55 fdalleau_away is now known as fdalleau

20:03 tromey has quit [Quit: ERC (IRC client for Emacs 27.1)]

20:12 mjw has joined #systemtap

20:19 fdalleau is now known as fdalleau_away

20:33 orivej has joined #systemtap

21:18 orivej has quit [Ping timeout: 240 seconds]

21:19 orivej has joined #systemtap

22:05 orivej has quit [Ping timeout: 268 seconds]

22:38 <kerneltoast> fche, baddish news

22:39 <fche> oh no

22:39 <kerneltoast> the backtrace bug's cause is different on newer vs older kernels

22:39 <kerneltoast> on newer kernels where it works half the time, kernel_read_file_from_path() is not returning an error

22:39 <kerneltoast> on older kernels where it works none of the time, kernel_read_file_from_path() returns errors

22:40 <kerneltoast> exciting isn't it

22:42 <kerneltoast> the issue on newer kernels seems to be a race in stap

22:42 <kerneltoast> i replaced kernel_read_file_from_path's vmalloc with a stack allocation and the bug "went away"

22:42 <kerneltoast> (i replaced it by just passing in a pointer to a stack buffer)

22:44 <kerneltoast> fche, try this: https://paste.centos.org/view/ee283a99

22:45 <kerneltoast> it makes the bug disappear™

22:46 <kerneltoast> but only on your fancy 5.11 kernel

22:47 <kerneltoast> i guess when kernel_read_file_from_path() was changed in the kernel to take an offset, it stopped failing

23:06 amerey has quit [Remote host closed the connection]

23:12 <kerneltoast> fche, yo

23:12 <kerneltoast> i have an idea to fix the 4.18 bug

23:12 <kerneltoast> (centos 8)

23:25 <kerneltoast> fche, https://gist.github.com/kerneltoast/1d6cc330b665a18da789175cfc4af27d

23:26 <fche> looking

23:27 <kerneltoast> the only danger is potentially populating our section addresses with garbage

23:27 <fche> um so kernel_read_file .... doesn't like heap pointers? neato

23:27 <kerneltoast> oh whoops i forgot i was writing stap

23:27 <fche> don't see the danger

23:28 <kerneltoast> stack arrays evil

23:28 <kerneltoast> that can be a heap alloc no problemo

23:29 <kerneltoast> > don't see the danger

23:29 <kerneltoast> well if kernel_read_file returns a legitimate error and our buffer happens to have some valid data in it, there could be a problem

23:30 <kerneltoast> what happens if we read the .eh_frame address as 0x9

23:31 <kerneltoast> and then the read failed after that

23:31 <fche> in case of error, well, don't use any of the data?

23:31 <kerneltoast> the problem is that it returns a spurious error on success

23:31 <kerneltoast> so its errors are garbage

23:32 <kerneltoast> we could parse the spurious error (-EIO) but i'm sure lots of the deeper fs machinery returns -EIO

23:33 <fche> ok I would prefer to find out the cause of this spurious error if it really is spurious

23:33 <kerneltoast> sure, i can show you

23:33 <kerneltoast> and i can show you why 5.11 dodges it

23:34 <kerneltoast> the i_size here is garbage: https://elixir.bootlin.com/linux/v4.18.20/source/fs/exec.c#L910

23:34 <kerneltoast> it must be a very large number, because it's bigger than 64. with stap master, kernel_read_file returns -EFBIG from there

23:35 <kerneltoast> we can get past that error by setting max_size to 0

23:35 <kerneltoast> but now we're left with this error: https://elixir.bootlin.com/linux/v4.18.20/source/fs/exec.c#L939

23:35 <kerneltoast> pos never equals i_size because i_size is a lie
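
The two failures described above can be modelled in userspace. The following is only a sketch paraphrasing the checks as they are described in the conversation; it is not the kernel source, and the function name and the `available` parameter are hypothetical. It assumes the read loop stops once the file yields no more data.

```c
#include <errno.h>

/*
 * Userspace model (NOT kernel code) of the two size checks discussed for
 * the 4.18 kernel_read_file(). All names here are hypothetical.
 *
 * i_size    - size the inode claims
 * max_size  - caller's limit; 0 means "no limit"
 * available - bytes the file really yields before a read returns 0
 */
static int model_read_4_18(long long i_size, long long max_size,
                           long long available)
{
    long long pos = 0;

    if (max_size > 0 && i_size > max_size)
        return -EFBIG;      /* first check: file claims to be too big */
    if (i_size <= 0)
        return -EINVAL;

    /* Read loop: ends early once the file has no more data. */
    while (pos < i_size) {
        long long want = i_size - pos;
        long long left = available - pos;
        long long bytes = want < left ? want : left;

        if (bytes <= 0)
            break;
        pos += bytes;
    }

    if (pos != i_size)
        return -EIO;        /* second check: a short read is treated as an error */
    return 0;
}
```

Taking the 4096 that `stat` reports later in the log as `i_size`, and 19 bytes of real data, `model_read_4_18(4096, 64, 19)` returns -EFBIG, while `model_read_4_18(4096, 0, 19)` gets past the first check and returns -EIO.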

23:38 <fche> kernel_read_file() reads like it wants to read an Entire file

23:38 <kerneltoast> yes, and on 5.11 it was changed to accommodate partial reads

23:38 <kerneltoast> err the change wasn't in 5.11

23:39 <kerneltoast> idk when it was, but you know what i mean

23:39 <kerneltoast> on 5.11, i printed out i_size and it appears to be correct. but even if it weren't correct, stap's usage of kernel_read_file dodges that pesky i_size check entirely

23:39 <kerneltoast> (for when i_size is bigger than 64)

23:40 <kerneltoast> observe: https://elixir.bootlin.com/linux/v5.11.12/source/fs/kernel_read_file.c#L71

23:40 <kerneltoast> and then here: https://elixir.bootlin.com/linux/v5.11.12/source/fs/kernel_read_file.c#L104

23:40 <kerneltoast> so i_size can be 0xbologna on 5.11 and it won't cause us any issues
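
The difference on 5.11 can be sketched the same way. This is a userspace model of the two linked checks as I read them, not the kernel code; the function name, the `available` parameter and the `want_file_size` flag (standing in for a non-NULL file_size pointer) are hypothetical, and the log does not show which arguments stap actually passes.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model (NOT kernel code) of a partial-read-aware
 * kernel_read_file(). The short-read check only applies when the
 * caller's buffer covers the whole claimed file size.
 */
static int model_read_5_11(long long i_size, long long offset,
                           size_t buf_size, bool want_file_size,
                           long long available, size_t *copied_out)
{
    long long pos = offset;
    size_t copied = 0;
    bool whole_file;

    if (i_size <= 0)
        return -EINVAL;
    /* Only a caller insisting on the whole file in one buffer is refused. */
    if (!want_file_size && offset == 0 && i_size > (long long)buf_size)
        return -EFBIG;

    whole_file = (offset == 0 && i_size <= (long long)buf_size);

    while (copied < buf_size) {
        long long room = (long long)(buf_size - copied);
        long long claimed = i_size - pos;
        long long left = available - pos;
        long long bytes = room < claimed ? room : claimed;

        if (left < bytes)
            bytes = left;
        if (bytes <= 0)
            break;
        pos += bytes;
        copied += (size_t)bytes;
    }

    /* A bogus i_size only matters for whole-file reads. */
    if (whole_file && pos != i_size)
        return -EIO;
    *copied_out = copied;
    return 0;
}
```

Under these assumptions, a 64-byte buffer against a file that claims 4096 bytes but yields 19 succeeds with 19 bytes copied, because `whole_file` is false and the `pos != i_size` comparison never runs.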

23:43 <kerneltoast> i have a feeling i_size is INT_MAX or something, since the maximum size of the sections sysfs nodes are not defined when they're created. see this in 5.11: https://elixir.bootlin.com/linux/v5.11.12/source/kernel/module.c#L1658

23:43 <kerneltoast> on 64-bit, MODULE_SECT_READ_SIZE == 19

23:43 <kerneltoast> 4.18 doesn't have that

23:44 <fche> how trash is i_size on your kernels?

23:44 <fche> on some random rhel7 one, i_size = 4096 for those files, which is trash but not Super Awful trash

23:44 <fche> on 5.11, i_size appears to be a nice small exactish number

23:45 <kerneltoast> i can only see what i_size is if i can get past that pos != i_size check

23:45 <kerneltoast> i'll try 4096 and see if it succeeds

23:45 <fche> stat /sys/module/FOO/sections/BAR

23:46 <kerneltoast> 4096

23:47 <kerneltoast> heh i had done that on 5.11 to find what i_size was, but it didn't occur to me to try it on centos8 for some reason

23:47 <kerneltoast> i wonder if it's just PAGE_SIZE

23:48 <kerneltoast> either way, it may as well be Super Awful trash because it breaks that pos != i_size check
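
The mismatch between the size `stat` reports and the data a file actually holds can be checked from userspace. A minimal sketch follows; the helper name is hypothetical. For an ordinary file the two numbers agree, while for the sysfs section files discussed here the log reports 4096 from `stat` against roughly 19 bytes of content. Reading those files typically needs root.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Report the size fstat() claims for a file and the number of bytes a
 * read() loop actually returns. Returns 0 on success, -1 on any error.
 */
static int claimed_vs_actual(const char *path, long long *claimed,
                             long long *actual)
{
    char buf[4096];
    struct stat st;
    ssize_t n;
    long long total = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    /* Keep reading until end of file (0) or an error (-1). */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;
    close(fd);
    if (n < 0)
        return -1;

    *claimed = (long long)st.st_size;
    *actual = total;
    return 0;
}
```

Calling it on a path under /sys/module/FOO/sections/ (FOO being whatever module is loaded) should show whether `claimed` and `actual` differ on a given kernel.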

23:52 <kerneltoast> if we can get the range of the module address space then we could use it to validate our read

23:53 <kerneltoast> or we could find some other way to read that data

23:53 <kerneltoast> barring a straight up read() syscall...
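
The range-validation idea could look roughly like this. It is a userspace sketch, not stap code: the function name is hypothetical, and the bounds passed in are placeholders for whatever the real module address range is on a given kernel. It assumes the section file holds a hex address such as "0x...\n".

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse a section address from text of the form "0x<hex>\n" and accept
 * it only if it falls inside [lo, hi). Rejecting out-of-range values is
 * what would catch a bogus address such as 0x9.
 */
static bool parse_section_addr(const char *text, unsigned long long lo,
                               unsigned long long hi,
                               unsigned long long *addr_out)
{
    char *end;
    unsigned long long addr;

    errno = 0;
    addr = strtoull(text, &end, 16);    /* base 16 accepts a "0x" prefix */
    if (errno != 0 || end == text)
        return false;                   /* overflow or no digits at all */
    if (*end != '\n' && *end != '\0')
        return false;                   /* trailing junk after the number */
    if (addr < lo || addr >= hi)
        return false;                   /* outside the expected range */

    *addr_out = addr;
    return true;
}
```

With placeholder bounds of 0xffffffffc0000000 and 0xffffffffff000000, the text "0xffffffffc0a1b000\n" is accepted and "0x9\n" is rejected. This only guards against implausible values; a stale but in-range address would still pass.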

23:54 <fche> we can read into a PAGE_SIZE buffer, and then this should work on old and new, methinks

23:55 <kerneltoast> that only fixes the -EFBIG check

23:55 <kerneltoast> there aren't 4096 bytes of data to read

23:56 <kerneltoast> `pos` will only ever go up to 19

23:57 <fche> you're thinking about the pos != i_size check?

23:57 <fche> that should be fine too

23:57 <fche> it's not a pos != maxsize

23:57 <kerneltoast> i_size == 4096

23:57 <kerneltoast> pos == 19

23:58 <kerneltoast> maybe 19 == 4096 in canada, but not in the rest of the world

23:58 <fche> good point