jrafanie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
_whitelogger has joined #jruby
rusk has joined #jruby
_whitelogger has joined #jruby
shellac has joined #jruby
drbobbeaty has joined #jruby
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
drbobbeaty has joined #jruby
<kares[m]> considering whether the JIT max compilation limit should be on by default
<kares[m]> it's now 4096 but the limit wasn't used
<headius[m]> Maybe
<headius[m]> It would help if we had some idea of what's really getting hot in a large app
<kares[m]> from real-world numbers: 10,000 compilations cost ~570M allocated metaspace (345M used)
<headius[m]> You would assume the stuff we want to compile is almost all at the beginning
<headius[m]> Nice
<kares[m]> so maybe based on that 8096? or let's go higher since memory is cheap? :)
<kares[m]> (that should fit into a 512M metaspace)
<kares[m]> oh right I should share some more since those numbers are not using the defaults
<kares[m]> but reduced max-size from 2000 -> 1000 and increased threshold 50 -> 100
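For context, the settings being compared are JRuby's standard JIT properties; assuming the usual -X spelling, the tweaked run (max-size 1000, threshold 100) would be launched roughly like this:

```
# defaults mentioned above: jit.threshold=50, jit.maxsize=2000
jruby -Xjit.threshold=100 -Xjit.maxsize=1000 app.rb

# the same knobs can also be passed as JVM system properties
jruby -J-Djruby.jit.threshold=100 -J-Djruby.jit.maxsize=1000 app.rb
```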
<kares[m]> using JIT defaults metaspace is at 618M/370M with 11,500 compilations
<headius[m]> Wow that's a lot
<headius[m]> Do we have a count of bytecode size?
<kares[m]> total? well not sure if I am able to distinguish from normal classes ...
<headius[m]> I can't remember if I added that metric to JMX but you can check there
<headius[m]> JVMs have code cache limits, we might as well too
<headius[m]> Plus some soft LRU maybe
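A soft cap with LRU eviction could be as simple as an access-ordered map; a minimal sketch, with a hypothetical class name rather than JRuby's actual cache:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a soft code-cache cap: once maxEntries compiled
// methods are cached, the least-recently-used one is dropped and falls back
// to the interpreter. Illustrative only, not JRuby's implementation.
class JittedCodeCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    JittedCodeCache(int maxEntries) {
        super(16, 0.75f, true); // access order, so get() refreshes recency
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the LRU entry beyond the cap
    }
}
```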
<kares[m]> oh yeah, I recall seeing some numbers on JMX
<kares[m]> oh okay, so the above numbers only contained methods (did not include blocks)
<kares[m]> here's some new totals from the compiler mbean:
<headius[m]> So metaspace is like 5x bytecode size
<headius[m]> 164, wot
<headius[m]> That's probably impractical
<kares[m]> IRLargestSize 'only' 1263 Avg: 37
<kares[m]> so reducing max.size might make sense
<headius[m]> Double check my logic that those numbers are bytecode and not IR size
<kares[m]> it's 2000 now
<headius[m]> Oh right
<headius[m]> Ok
<kares[m]> yeah bytecode
<headius[m]> Yeah 1000 max seems better
<kares[m]> we have that on the other machine let's see
<headius[m]> We need to do another pass over the jit to make sure it's as efficient as possible
<headius[m]> There may be more things we can push to utility methods
<kares[m]> machine running with max.size = 1000 generates 10% less code (but it also has threshold = 100)
<kares[m]> yeah largest still seems big - so maybe another pass makes sense, or reinstating a bytecode max check
<kares[m]> yet, avg is much lower so not sure - we need a median 😺
<headius[m]> Haha
<headius[m]> More metrics!
<kares[m]> obviously threshold 100 does not seem to have a negative effect either ... but then this might be a supernova app compared to others ....
<headius[m]> Yeah, having a hard static threshold means it will always grow to the same size eventually
<headius[m]> It just grows slower
shellac has quit [Quit: Computer has gone to sleep.]
<kares[m]> exactly
shellac has joined #jruby
<kares[m]> somewhat pleased no 'enormous' Rails/gem methods hit the JIT
<kares[m]> .... the 2000 instruction limit pretty much filtered nothing
<headius[m]> Yeah those biggest ones are outliers
<headius[m]> Generated parser code etc
<kares[m]> Ruby libraries must be getting better - because there used to be some crazy stuff back in the day 😇
<headius[m]> Giant case/when (which we need to jit more efficiently anyway)
<kares[m]> that's what I had in mind - giant generated case .rb methods but I am not sure what gem that was
<headius[m]> Taking off for Bangkok, bbiab
lucasb has joined #jruby
shellac has quit [Quit: Computer has gone to sleep.]
shellac has joined #jruby
<rdubya[m]> my tests don't seem to want to run when I'm trying to test core, this is what I'm running and the output and then it just sits
<rdubya[m]> been going for over half an hour now
<enebo[m]> rdubya: that sounds like too long unless you are on a raspi
<rdubya[m]> lol
<rdubya[m]> i just tried running it with --verbose on the end and it isn't giving me any more info, is there any other way I can see what it might be hung up on?
<rdubya[m]> it takes a minute or two before the warnings show up too
<rdubya[m]> so i'm not sure if it ever even gets into rspec
<headius[m]> jstack
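That is, take a thread dump of the hung JVM with the standard JDK tools, roughly:

```
jps -l           # find the JRuby process id
jstack <pid>     # dump a stack trace for every JVM thread
```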
<rdubya[m]> let me throw the results of that into a gist, it looks like everything is hung up waiting on a lock
<rdubya[m]> are these typical? `Fiber thread for block at: uri:classloader:/jruby/kernel/enumerator.rb:62" #27 daemon prio=5 os_prio=31 tid=0x00007fde2018f000 nid=0x9c03 waiting on condition [0x00007000079f0000]`
<headius[m]> Those might be back in the pool?
<headius[m]> I'm on mobile
<rdubya[m]> looks like its hung up on spec/ruby/library/socket/udpsocket/send_spec.rb:7
<headius[m]> Stupid fibers
<rdubya[m]> yeah there are a bunch of threads that are blocked that are "Fiber thread for block at:" with a bunch of different specs attached to them
<rdubya[m]> i'll try switching to a previous commit instead of running against master
<headius[m]> I'm not sure if I'm clearing that thread name properly when they go back in the pool. You could try to fix that so we can see the ones actually in code
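The fix headius is suggesting would look roughly like the following; this is a hypothetical sketch, with names like fiberLocation and fiberBody standing in for JRuby's actual fiber-pool code:

```java
// Hypothetical sketch: restore a pooled worker thread's generic name once the
// fiber body finishes, so a jstack dump only shows "Fiber thread for block
// at: ..." for fibers actually running code, not threads idle in the pool.
final class PooledFiberTask implements Runnable {
    private final String fiberLocation;  // e.g. "enumerator.rb:62" (illustrative)
    private final Runnable fiberBody;

    PooledFiberTask(String fiberLocation, Runnable fiberBody) {
        this.fiberLocation = fiberLocation;
        this.fiberBody = fiberBody;
    }

    @Override
    public void run() {
        Thread worker = Thread.currentThread();
        String pooledName = worker.getName();
        worker.setName("Fiber thread for block at: " + fiberLocation);
        try {
            fiberBody.run();
        } finally {
            worker.setName(pooledName); // clear the fiber name on return to the pool
        }
    }
}
```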
<rdubya[m]> I manage to get it to finish by commenting out the UDPSocket#send specs
<headius[m]> Oh network oddity on your end maybe?
xardion has quit [Remote host closed the connection]
shellac has quit [Ping timeout: 250 seconds]
xardion has joined #jruby
<rdubya[m]> sorry got sidetracked, I reverted back to the 9.2.8.0 tag and the problem doesn't happen, but there are a bunch of other failures
<rdubya[m]> things like:
<rdubya[m]> and
<rdubya[m]> the good news is, making RubySymbol return a cached frozen string doesn't break any of the fast tests
<rdubya[m]> is 9.2.8 supposed to be compatible with mri 2.4 or 2.5?
<rdubya[m]> lol, on the other hand, our app's specs won't even start to run with frozen strings ☹️
<rdubya[m]> which is basically `:name.to_s.chomp!('=')`
<rdubya[m]> so I'll create the PR but making symbols return frozen strings won't work for rails apps without some work
<enebo[m]> rdubya: 2.5.3
<rdubya[m]> cool, for some reason I thought 2.5 support was being skipped
<enebo[m]> rdubya: 2.6 is being skipped
<rdubya[m]> ah ok
<enebo[m]> ok I found an interesting idea for our counters looking at a paper on self recompilation
<enebo[m]> Their implementation seems to be a global optimizer in that every n seconds they seem to evaluate which methods to compile (although I am guessing this based on the text)
<enebo[m]> but what they do is divide the call counters by a factor to give them a half life
<enebo[m]> So every period of time t it will divide the individual callsite counter by some value (like 1.2) and see if it passes the counter threshold
<enebo[m]> our method implementations do this callsite checking per call so the mechanism would have to be adapted.
<enebo[m]> self == the language
<enebo[m]> rdubya: so time as a parameter is a function of this design
<rdubya[m]> sounds like it would be worth exploring
<enebo[m]> yeah I am just trying to think about how we would adapt this idea
<enebo[m]> If we stored a nanotime when we create the method (or at first call), then looked again once we hit the count, how would we age that count?
<enebo[m]> subtract the two values and then use a dividing constant and kick it back out since we would only be at the threshold at that point (then resetting the time counter)
<enebo[m]> The second time it hits the threshold if it was hot then presumably that factor would be less
<enebo[m]> Just talking through this I feel like the counter for checking the JIT would be one value and the actual value for performing it would be a second value which was less than the first one
<enebo[m]> subtract the two values == time(start) & time(threshold1)
<enebo[m]> those two subtracted would give some aging divider (multiplier) which would give a new counter value
<enebo[m]> That new counter value would compare to threshold2 which is < threshold1
<enebo[m]> If it is greater than threshold2 it JITs, otherwise the new counter value is written and the timestamp is replaced
<enebo[m]> The additional cost of this new heuristic would be one nanotime call per first JIT threshold and an extra field for it + the simple math of reducing the counter when that first threshold is hit. Seems pretty low
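A compact sketch of that heuristic; constants and names are hypothetical, and in JRuby the counter would live on the method object with this check running on the call path:

```java
// Hypothetical sketch of the lazy-decay heuristic described above: counters
// are only decayed when the first threshold is hit, so the steady per-call
// cost stays a single increment-and-compare.
class CallCounter {
    static final int THRESHOLD1 = 100;             // when to evaluate the counter
    static final int THRESHOLD2 = 50;              // decayed count needed to JIT
    static final long DECAY_UNIT_NANOS = 100_000L; // tunable time delta (0.1 ms)

    private int count;
    private long windowStartNanos = System.nanoTime(); // set at creation/first call

    boolean shouldJit() {
        if (++count < THRESHOLD1) return false;    // cheap fast path

        long elapsed = System.nanoTime() - windowStartNanos;
        // the longer it took to reach THRESHOLD1, the larger the aging divider
        long divider = Math.max(1, elapsed / DECAY_UNIT_NANOS);
        int decayed = (int) (count / divider);

        if (decayed >= THRESHOLD2) return true;    // still hot after decay: compile

        count = decayed;                           // keep the decayed credit
        windowStartNanos = System.nanoTime();      // and start a new window
        return false;
    }
}
```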
<enebo[m]> rdubya: I think you are the only one reading atm based on the riot web UI and you may not be fully up to date on how we JIT
<enebo[m]> past just using a counter
<rdubya[m]> yeah i'm trying to follow along but I don't know much about those internals, my JIT experience is mostly high level reading on the java JIT
<enebo[m]> here I will show you a class and explain this since I think you may be interested
<rdubya[m]> would that be similar to what I mentioned the other day, where the JIT could keep a "last ran" timestamp and then clear out the counters at the end of each run, that might be way too much overhead lol
<rdubya[m]> i guess that is probably too short of a window
<enebo[m]> well it is similar to be sure
<enebo[m]> This heuristic is about half life of counter values
<enebo[m]> so if you encounter it reasonably quickly it may decide to just reduce the counter by 20% or 50% vs wiping it
<rdubya[m]> ah ok
<rdubya[m]> that makes sense
<enebo[m]> wiping them would be half life (bad word) of 100%
<enebo[m]> So a good question is why did they decay the counter vs nuking it
<enebo[m]> nuking is much simpler
<rdubya[m]> decaying would probably catch ones that are used sporadically more effectively
<enebo[m]> I also have had problems with timing because a good time on one piece of hardware feels like a poor one on another
<enebo[m]> decaying could also allow for better tuning
<enebo[m]> well for the sporadic aspect of something which is being called occasionally but is sometimes hot
<enebo[m]> Although if it is really hot then nuking should still be enough to put it over the edge
subbu is now known as subbu|lunch
<enebo[m]> ok I think I can see a reason for their decay
<enebo[m]> if they run every 4 seconds and can then evaluate all candidate sites to compile, the actual counter may potentially take longer than 4 seconds to hit the threshold (let's pretend 30s)
<rdubya[m]> decaying probably also lets you still compile the methods eventually, but not immediately after they've hit the limit, it kind of forces them to prove that they are used frequently enough lol
<enebo[m]> So the decaying would happen more often than the promotion to compiled versions
<enebo[m]> yeah for sure that must be the intent
<enebo[m]> You are being called enough to be interesting but not necessarily enough to be reasonably compiled
<enebo[m]> Using nuking would just get rid of cold compiles
<enebo[m]> cold compiles are our main issue though
<enebo[m]> Another main difference in their system and ours is we look for threshold at each method as it is called so we cannot grow a counter size greater than threshold
<enebo[m]> wonders if subbu has much experience in how they implemented this
<enebo[m]> or remembers :)
<rdubya[m]> maybe you could use a decaying decay to address the cold compiles? (if I'm understanding that right) so when the server starts it decays 90% then after a minute it drops to 80%, etc
<rdubya[m]> or maybe the reverse
<enebo[m]> well I am thinking more about your counter nuking idea atm since it is less detailed (a subset)
<enebo[m]> what I don't like in either is specifying an explicit time value
<rdubya[m]> yeah, would be good if that were tunable with a decent default
<enebo[m]> but I don't know how you would possibly generate one on the fly...a calibration of some sort maybe but a first start would be to just have a settable value
<enebo[m]> If we can self-tune good behavior from it and not reduce perf but reduce number of native methods it would be a good first step
<enebo[m]> I was talking to my wife about this at lunch and she suggested another thing we have not really discussed very much which would be aging out compiled methods back to interp
<rdubya[m]> that could help with the memory
<rdubya[m]> would that cause a cycle of recompiling if there wasn't enough memory?
<enebo[m]> I look at memory as the main issue and believe that most of our native compiles are not actually useful for performance
<enebo[m]> I believe the heuristics cannot be a simple counter even with aging out methods - as you say it will just compile again later unless we mark it as "you're dead to me"
<enebo[m]> I don't think doing marking like that is an end-goal solution since some methods do become important in different life-cycles of a program (or at least theoretically they could)
<enebo[m]> but aging out methods is good for killing off early hot methods that are no longer needed
<enebo[m]> so I think that is a good idea as a companion to the topic
<rdubya[m]> yeah, makes sense
<rdubya[m]> would it be possible to cap it by memory, i.e. before we jit something we check to see how much metaspace is left and then either evict something or skip doing the compile?
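Checking remaining Metaspace before compiling is doable from inside the JVM with the standard MemoryPoolMXBean API; a sketch of the gate rdubya describes (the evict-or-skip policy around it is hypothetical):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class MetaspaceGate {
    // Sketch: skip JIT compilation once Metaspace usage crosses a fraction
    // of -XX:MaxMetaspaceSize (returns true while there is still room).
    static boolean metaspaceHasRoom(double maxFraction) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                long max = usage.getMax();
                if (max < 0) return true; // no MaxMetaspaceSize set: unbounded
                return usage.getUsed() < max * maxFraction;
            }
        }
        return true; // pool not found; don't block compilation
    }
}
```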
<enebo[m]> metaspace usage for me is our largest contemporary problem. The fact that you happen to have a larger app is a good opportunity for us (and I guess for you too :) )
<enebo[m]> I don't know. I think we can look up metaspace although those stats are lies
<rdubya[m]> we have 2 different reasons for wanting the cap too 🙂
<enebo[m]> or I think they are heavily underreported because malloc is allocing much more than is really being used....maybe the JVM is reporting that right?
<rdubya[m]> for our clusters we want to not have it grow indefinitely
<enebo[m]> well it won't technically :P
<rdubya[m]> for our containers we are trying to make the containers handle "x" traffic and if we go over that we spin up another container
<enebo[m]> It will just grow for so long it will seem like it is growing forever
<enebo[m]> unless you are generating source code as time moves on forever
<rdubya[m]> lol
<enebo[m]> but from a practical standpoint you are right
<enebo[m]> It is an unimportant distinction
<enebo[m]> rdubya: I do believe we have same exact goal. I want process memory to be as favorable as possible against MRI and that cannot include some endless tail of infrequent compiles
<enebo[m]> or "appearing to be endless"
subbu|lunch is now known as subbu
<enebo[m]> kares: rdubya This has a tunable -Xjit.time.delta=some_nanosecond_delta which I am hoping you can play with. The default value is I think too small but I am curious to see if it kills the method growth altogether
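Assuming the new flag follows the usual -X property convention, trying the branch build would look something like this (100000 ns being the 0.1 ms default kares asks about below):

```
jruby -Xjit.time.delta=100000 -Xjit.logging=true app.rb
```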
<kares[m]> okay - thanks. looks reasonable but we haven't yet set up using snapshots
<kares[m]> in the meantime I pushed jit.max with some cleanups, as e.g. excludes weren't excluding blocks
<rdubya[m]> cool
<enebo[m]> kares: yeah if this strat works out I don't think a .max will be needed either
<kares[m]> enebo: but did you really mean just 0.1millis delta?
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
<kares[m]> and if a method takes longer than a delta (say 100ms) then it will never JIT
<kares[m]> which should be good in theory - yeah need to give it a try ...
<kares[m]> this might end up more challenging than I hoped if I end up comparing jit logs 😄
travis-ci has joined #jruby
travis-ci has left #jruby [#jruby]
<travis-ci> jruby/jruby (master:b2f2018 by kares): The build was broken. https://travis-ci.org/jruby/jruby/builds/583341525 [208 min 12 sec]
travis-ci has joined #jruby
<travis-ci> jruby/jruby (counter_nuke:24632b3 by Thomas E. Enebo): The build failed. https://travis-ci.org/jruby/jruby/builds/583343121 [192 min 57 sec]
travis-ci has left #jruby [#jruby]
<enebo[m]> kares: that was the first value I tried and startup stuff JITted much less, but I did not see any real change
<enebo[m]> for perf stuff we may end up raising this quite a bit but we are mostly looking to cull the cold methods which are not called much
<enebo[m]> in doing some reading I think larger methods with loops should get more consideration but that will be a later experiment (plus we have no OSR so it may not end up being as useful)
jrafanie has joined #jruby
lucasb has quit [Quit: Connection closed for inactivity]
jrafanie has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]